Search for: All records

Award ID contains: 1845469

  1. Binary neural networks (BNNs) substitute complex arithmetic operations with simple bit-wise operations. The binarized weights and activations in BNNs can drastically reduce memory requirements and energy consumption, making them attractive for edge ML applications with limited resources. However, the severe memory capacity and energy constraints of low-power edge devices call for further reduction of BNN models beyond binarization. Weight pruning is a proven solution for reducing the size of many neural network (NN) models, but the binary nature of BNN weights makes it difficult to identify insignificant weights to remove. In this paper, we present a pruning method based on latent weights and layer-level pruning sensitivity analysis that reduces the over-parameterization of BNNs, allowing for accuracy gains while drastically reducing the model size. Our method advocates a heuristic that distinguishes weights by their latent weights, a real-valued vector used to compute the pseudogradient during backpropagation. It is tested using three different convolutional NNs on the MNIST, CIFAR-10, and Imagenette datasets, with results indicating a 33% to 46% reduction in operation count with no accuracy loss, improving upon previous works in accuracy, model size, and total operation count. 
  2. Artificial intelligence (AI) based wearable applications collect and process a significant amount of streaming sensor data. Transmitting the raw data to cloud processors wastes scarce energy and threatens user privacy. Wearable edge AI devices should ideally balance two competing requirements: (1) maximizing the energy efficiency using targeted hardware accelerators and (2) providing versatility using general-purpose cores to support arbitrary applications. To this end, we present an open-source domain-specific programmable system-on-chip (SoC) that combines a RISC-V core with a meticulously determined set of accelerators targeting wearable applications. We apply the proposed design method to design an FPGA prototype and six real-life use cases to demonstrate the efficacy of the proposed SoC. Thorough experimental evaluations show that the proposed SoC provides up to 9.1x faster execution and up to 8.9x higher energy efficiency than software implementations in FPGA while maintaining programmability. 
  3. Wireless connectivity is becoming common in increasingly diverse personal devices, enabling various interoperation- and Internet-based applications and services. More and more interconnected devices are simultaneously operated by a single user with short-lived connections, making usable device authentication methods imperative to ensure both high security and a seamless user experience. Unfortunately, current authentication methods that rely heavily on human involvement, in addition to form factor and mobility constraints, make this balance hard to achieve, often forcing users to choose between security and convenience. In this work, we present a novel over-the-air device authentication scheme named AEROKEY that achieves both high security and high usability. With virtually no hardware overhead, AEROKEY leverages ubiquitously observable ambient electromagnetic radiation to autonomously generate a spatiotemporally unique secret that can be derived only by devices located close to each other. Devices can use this unique secret as the basis of a symmetric key, making the authentication procedure more practical, secure, and usable with no active human involvement. We propose and implement essential techniques to overcome challenges in realizing AEROKEY on low-cost microcontroller units, such as poor time synchronization, the lack of a precision analog front-end, and inconsistent sampling rates. Our real-world experiments demonstrate reliable authentication and robustness against various realistic adversaries, with equal-error rates of 3.4% or less and authentication times as low as 24 s. 
  4. Iterative computing, where the output accuracy gradually improves over multiple iterations, enables dynamic reconfiguration of energy-quality trade-offs by adjusting the latency (i.e., number of iterations). In order to take full advantage of the dynamic reconfigurability of iterative computing hardware, an efficient method for determining the optimal latency is crucial. In this paper, we introduce an integer linear programming (ILP)-based scheduling method to determine the optimal latency of iterative computing hardware. We consider the input-dependence of output accuracy of approximate hardware using data-driven error modeling for accurate quality estimation. The proposed method finds optimal or near-optimal latency with a significant speedup compared to exhaustive search and decision tree-based optimization. 
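
As a rough illustration of the latent-weight pruning idea described in record 1, the sketch below binarizes each layer's real-valued latent weights by sign and marks the weights with the smallest latent magnitude as pruned. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say how latent weights are ranked, so the magnitude criterion, the function names, the toy layer values, and the per-layer pruning ratios (standing in for the outcome of a layer-level sensitivity analysis) are all hypothetical.

```python
def binarize(latent):
    """Sign binarization: map each real-valued latent weight to +1 or -1."""
    return [1 if w >= 0 else -1 for w in latent]


def prune_by_latent_magnitude(latent, prune_ratio):
    """Binarize a layer and set the smallest-|latent| fraction of weights to 0.

    ASSUMPTION: a small latent magnitude marks a weight as insignificant.
    A value of 0 in the result denotes a pruned (removed) binary weight.
    """
    n_prune = int(len(latent) * prune_ratio)
    # Indices ordered from smallest to largest latent magnitude.
    order = sorted(range(len(latent)), key=lambda i: abs(latent[i]))
    pruned = set(order[:n_prune])
    binary = binarize(latent)
    return [0 if i in pruned else b for i, b in enumerate(binary)]


# Hypothetical layers: (latent weights, per-layer pruning ratio).
# In practice the ratio would come from a per-layer sensitivity analysis.
layers = {
    "conv1": ([0.8, -0.05, 0.3, -0.9], 0.25),
    "conv2": ([0.02, -0.6, 0.01, 0.7, -0.03, 0.4], 0.5),
}

pruned_model = {
    name: prune_by_latent_magnitude(weights, ratio)
    for name, (weights, ratio) in layers.items()
}

for name, weights in pruned_model.items():
    kept = sum(1 for w in weights if w != 0)
    print(f"{name}: {weights} ({kept}/{len(weights)} weights kept)")
```

Using a separate ratio per layer mirrors the layer-level granularity mentioned in the abstract: a layer that tolerates pruning well can be assigned a higher ratio than a sensitive one.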