Search for: All records

Award ID contains: 1906541


  1.
    Deep neural networks (DNNs) have recently achieved unprecedented success in various domains. In resource-constrained systems, QoS-aware DNNs are designed to meet the latency requirements of mission-critical deep learning applications. However, no existing DNN has been designed to simultaneously satisfy the latency and memory bounds specified by end users of resource-constrained systems. In this paper, we propose BLINKNET, a runtime system that guarantees both latency and memory/storage bounds via efficient QoS-aware per-layer approximation. We implement BLINKNET in Apache TVM and evaluate it using the Cifar10-quick and VGG network models. Our experimental results show that BLINKNET meets the latency and memory requirements with an average accuracy loss of 2%.