Title: AnytimeNet: Controlling Time-Quality Tradeoffs in Deep Neural Network Architectures
Deeper neural networks, especially those with very large numbers of internal parameters, impose a heavy computational burden before they produce sufficiently high-quality results. This burden impedes the application of machine learning and related techniques to time-critical computing systems. To address this challenge, we propose an architectural approach for neural networks that adaptively trades off computation time against solution quality, so that high-quality solutions can be delivered on time. Our novel and general framework, AnytimeNet, gradually inserts additional layers, so users can expect monotonically increasing solution quality as more computation time is expended. The framework lets users choose on the fly, at runtime, when to retrieve a result. Extensive evaluation on classification tasks demonstrates that the proposed architecture provides adaptive control of classification quality according to the available computation time.
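The runtime behavior described in the abstract, in which a result can be retrieved whenever the time budget runs out, can be illustrated with a minimal sketch. This is not the AnytimeNet implementation: the names `anytime_predict`, `make_stage`, and `budget_s` are hypothetical, the stages are toy stand-ins for blocks of layers, and nothing here reproduces the paper's layer-insertion or training procedure. The deadline is checked only between stages, so a single stage may overrun the budget, and any monotonic gain in quality would have to come from the model itself rather than from this loop.

```python
import time
from typing import Any, Callable, List, Optional, Sequence, Tuple

# A stage maps the current internal state to (new_state, prediction).
# Hypothetical interface: each stage stands in for a block of extra layers.
Stage = Callable[[Any], Tuple[Any, Any]]


def anytime_predict(
    x: Any,
    stages: Sequence[Stage],
    budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[Any], int]:
    """Run stages in order until the budget is spent.

    Returns the most recent prediction and the number of stages executed.
    The first stage always runs, so a prediction exists whenever at least
    one stage is supplied.
    """
    deadline = clock() + budget_s
    state = x
    prediction = None
    used = 0
    for stage in stages:
        # Stop refining once time is up, but only if we already have a result.
        if prediction is not None and clock() >= deadline:
            break
        state, prediction = stage(state)
        used += 1
    return prediction, used


def make_stage(label: int) -> Stage:
    """Toy stage (assumption): records its label and predicts that label."""
    def stage(state: List[int]) -> Tuple[List[int], int]:
        return state + [label], label
    return stage


if __name__ == "__main__":
    stages = [make_stage(i) for i in range(5)]
    prediction, used = anytime_predict([], stages, budget_s=0.01)
    print(f"prediction={prediction} after {used} stage(s)")
```

A larger `budget_s` lets more stages run before the latest prediction is returned; passing a custom `clock` makes the cutoff deterministic for testing.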
Award ID(s):
1715154 1521523
PAR ID:
10194726
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition (DATE'20)
Page Range / eLocation ID:
945 to 950
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper we investigate image classification with computational resource limits at test time. Two such settings are: 1. anytime classification, where the network’s prediction for a test example is progressively updated, facilitating the output of a prediction at any time; and 2. budgeted batch classification, where a fixed amount of computation is available to classify a set of examples that can be spent unevenly across “easier” and “harder” inputs. In contrast to most prior work, such as the popular Viola and Jones algorithm, our approach is based on convolutional neural networks. We train multiple classifiers with varying resource demands, which we adaptively apply during test time. To maximally re-use computation between the classifiers, we incorporate them as early exits into a single deep convolutional neural network and inter-connect them with dense connectivity. To facilitate high-quality classification early on, we use a two-dimensional multi-scale network architecture that maintains coarse- and fine-level features throughout the network. Experiments on three image-classification tasks demonstrate that our framework substantially improves the existing state of the art in both settings.
  2. We develop a convex analytic framework for ReLU neural networks which elucidates the inner workings of hidden neurons and their function space characteristics. We show that neural networks with rectified linear units act as convex regularizers, where simple solutions are encouraged via extreme points of a certain convex set. For one dimensional regression and classification, as well as rank-one data matrices, we prove that finite two-layer ReLU networks with norm regularization yield linear spline interpolation. We characterize the classification decision regions in terms of a closed form kernel matrix and minimum L1 norm solutions. This is in contrast to Neural Tangent Kernel which is unable to explain neural network predictions with finitely many neurons. Our convex geometric description also provides intuitive explanations of hidden neurons as auto encoders. In higher dimensions, we show that the training problem for two-layer networks can be cast as a finite dimensional convex optimization problem with infinitely many constraints. We then provide a family of convex relaxations to approximate the solution, and a cutting-plane algorithm to improve the relaxations. We derive conditions for the exactness of the relaxations and provide simple closed form formulas for the optimal neural network weights in certain cases. We also establish a connection to ℓ0-ℓ1 equivalence for neural networks analogous to the minimal cardinality solutions in compressed sensing. Extensive experimental results show that the proposed approach yields interpretable and accurate models.
  3. Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep learning methods are confronted with the computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices. To overcome the challenge and facilitate the real-time deployment of SISR tasks on mobile, we combine neural architecture search with pruning search and propose an automatic search framework that derives sparse super-resolution (SR) models with high image quality while satisfying the real-time inference requirement. To decrease the search cost, we leverage the weight sharing strategy by introducing a supernet and decouple the search problem into three stages, including supernet construction, compiler-aware architecture and pruning search, and compiler-aware pruning ratio search. With the proposed framework, we are the first to achieve real-time SR inference (with only tens of milliseconds per frame) for implementing 720p resolution with competitive image quality (in terms of PSNR and SSIM) on mobile platforms (Samsung Galaxy S20). 
  4. Individuals who are blind adopt multiple procedures to tactually explore images. Automatically recognizing and classifying users’ exploration behaviors is the first step towards the development of an intelligent system that could assist users to explore images more efficiently. In this paper, a computational framework was developed to classify different procedures used by blind users during image exploration. Translation-, rotation-, and scale-invariant features were extracted from the trajectories of users’ movements. These features were divided into numerical and logical features and were fed into neural networks. More specifically, we trained spiking neural networks (SNNs) to further encode the numerical features as model strings. The proposed framework employed a distance-based classification scheme to determine the final class/label of the exploratory procedures. Dempster-Shafer Theory (DST) was applied to integrate the distances obtained from all the features. In experiments with different spiking-neuron dynamics, the proposed framework achieved 95.89% classification accuracy. It is extremely effective in encoding and classifying spatio-temporal data compared with Dynamic Time Warping and Hidden Markov Model, which achieved 61.30% and 28.70% accuracy, respectively. The proposed framework serves as the fundamental block for the development of intelligent interfaces, enhancing the image exploration experience for the blind.
  5. In the past decade, Deep Neural Networks (DNNs), e.g., Convolutional Neural Networks, achieved human-level performance in vision tasks such as object classification and detection. However, DNNs are known to be computationally expensive and thus hard to deploy in real-time and edge applications. Many previous works have focused on DNN model compression to obtain smaller parameter sizes and, consequently, lower computational cost. Such methods, however, often introduce noticeable accuracy degradation. In this work, we optimize a state-of-the-art DNN-based video detection framework, Deep Feature Flow (DFF), from the cloud end using three proposed ideas. First, we propose Asynchronous DFF (ADFF) to asynchronously execute the neural networks. Second, we propose a Video-based Dynamic Scheduling (VDS) method that decides the detection frequency based on the magnitude of movement between video frames. Last, we propose Spatial Sparsity Inference (SSI), which performs inference on only part of the video frame and thus reduces the computation cost. According to our experimental results, ADFF can reduce the bottleneck latency from 89 to 19 ms. VDS increases the detection accuracy by 0.6% mAP without increasing computation cost. SSI saves a further 0.2 ms at the cost of a 0.6% mAP degradation in detection accuracy.