Title: A Fully-integrated Gesture and Gait Processing SoC for Rehabilitation with ADC-less Mixed-signal Feature Extraction and Deep Neural Network for Classification and Online Training
An ultra-low-power gesture and gait classification SoC is presented for rehabilitation applications, featuring (1) mixed-signal feature extraction and an integrated low-noise amplifier, eliminating the expensive ADC and digital feature extraction; (2) an integrated distributed deep neural network (DNN) ASIC supporting a scalable multi-chip neural network for sensor fusion, with distortion resiliency for low-cost front-end modules; (3) on-chip learning in the DNN engine, allowing in-situ training for user-specific operation. A 12-channel 65 nm CMOS test chip was fabricated, achieving 1 μW power per channel, less than 3 ms computation latency, on-chip training of user-specific DNN models, and multi-chip networking capability.
Award ID(s):
1816870
PAR ID:
10194098
Author(s) / Creator(s):
Date Published:
Journal Name:
Custom Integrated Circuits Conference
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
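As a purely illustrative software analogue of the pipeline described in the abstract above, the sketch below extracts simple per-channel features from a multi-channel signal window, classifies them with a tiny fully-connected network, and applies a single online SGD update as a stand-in for on-chip training. The feature choices, layer sizes, class count, and learning rate are assumptions made for illustration; on the actual SoC the features are computed by the mixed-signal front end and training runs on the integrated DNN engine.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(window):
    """window: (channels, samples) -> per-channel mean absolute value and zero-crossing rate."""
    mav = np.mean(np.abs(window), axis=1)
    signs = np.signbit(window).astype(np.int8)
    zc = np.sum(np.diff(signs, axis=1) != 0, axis=1)
    return np.concatenate([mav, zc / window.shape[1]])

# Tiny classifier: 24 features -> 16 hidden units -> 4 gesture classes (assumed sizes).
W1, b1 = rng.normal(0, 0.1, (16, 24)), np.zeros(16)
W2, b2 = rng.normal(0, 0.1, (4, 16)), np.zeros(4)

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)             # ReLU hidden layer
    logits = W2 @ h + b2
    p = np.exp(logits - logits.max())
    return h, p / p.sum()                        # hidden activations, class probabilities

def online_step(x, label, lr=0.05):
    """One SGD step on a single labeled example -- a stand-in for on-chip training."""
    global W1, b1, W2, b2
    h, p = forward(x)
    dlogits = p.copy()
    dlogits[label] -= 1.0                        # softmax cross-entropy gradient
    dW2, dh = np.outer(dlogits, h), W2.T @ dlogits
    dh[h <= 0] = 0.0                             # back-propagate through the ReLU
    dW1 = np.outer(dh, x)
    W2 -= lr * dW2; b2 -= lr * dlogits
    W1 -= lr * dW1; b1 -= lr * dh

window = rng.normal(size=(12, 256))              # one 12-channel signal window
x = extract_features(window)
online_step(x, label=2)                          # pretend the user performed gesture 2
print(forward(x)[1])                             # updated class probabilities
```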
More Like this
  1. Deep learning that utilizes large-scale deep neural networks (DNNs) is effective for automatic high-level feature extraction but is also computation- and memory-intensive. Constructing DNNs using block-circulant matrices can simultaneously achieve hardware acceleration and model compression while maintaining high accuracy. This paper proposes HSIM-DNN, an accurate hardware simulator on the C++ platform, to simulate the exact behavior of DNN hardware implementations and thereby facilitate the block-circulant matrix-based design of DNN training and inference procedures in hardware. Real FPGA implementations validate the simulator across various circulant block sizes and data bit lengths, taking accuracy, compression ratio, and power consumption into account, which provides excellent insight for hardware design.
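The block-circulant construction referenced in the abstract above can be sketched compactly: each k x k block of a weight matrix is circulant, so it is stored as a single length-k vector and applied through FFTs. The sketch below (block size, layer dimensions, and data are arbitrary assumptions) shows the FFT-based multiply and checks it against the equivalent dense matrix; it is a conceptual illustration, not HSIM-DNN's actual implementation.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by the vector x."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def block_circulant_matvec(blocks, x, k):
    """blocks[i][j] is the defining vector of the (i, j)-th k x k circulant block."""
    p, q = len(blocks), len(blocks[0])
    y = np.zeros(p * k)
    for i in range(p):
        for j in range(q):
            y[i*k:(i+1)*k] += circulant_matvec(blocks[i][j], x[j*k:(j+1)*k])
    return y

def dense_circulant(c):
    """Explicit circulant matrix with first column c, used only for cross-checking."""
    k = len(c)
    return np.array([[c[(i - j) % k] for j in range(k)] for i in range(k)])

rng = np.random.default_rng(1)
k, p, q = 4, 2, 3                                  # block size 4: an 8 x 12 layer
blocks = [[rng.normal(size=k) for _ in range(q)] for _ in range(p)]
x = rng.normal(size=q * k)

y_fft = block_circulant_matvec(blocks, x, k)
W = np.block([[dense_circulant(blocks[i][j]) for j in range(q)] for i in range(p)])
print(np.allclose(y_fft, W @ x))                   # True: the FFT path matches the dense multiply
```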
  2. Neuromorphic computing systems promise high energy efficiency and low latency. In particular, when integrated with neuromorphic sensors, they can be used to produce intelligent systems for a broad range of applications. An event-based camera is such a neuromorphic sensor, inspired by the sparse and asynchronous spike representation of the biological visual system. However, processing the event data requires either using expensive feature descriptors to transform spikes into frames, or using spiking neural networks (SNNs) that are expensive to train. In this work, a neural network architecture is proposed, the reservoir nodes-enabled neuromorphic vision sensing network (RN-Net), based on dynamic temporal encoding by on-sensor reservoirs and simple deep neural network (DNN) blocks. The reservoir nodes enable efficient temporal processing of asynchronous events by leveraging the native dynamics of the node devices, while the DNN blocks enable spatial feature processing. Combining these blocks in a hierarchical structure, the RN-Net offers efficient processing of both local and global spatiotemporal features. RN-Net executes dynamic vision tasks created by event-based cameras at the highest accuracy reported to date, with a network one order of magnitude smaller. The use of simple DNNs and standard backpropagation-based training rules further reduces implementation and training costs.
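A minimal conceptual sketch of the RN-Net idea described above: asynchronous camera events are accumulated by per-pixel leaky-integrator "reservoir" states (a crude software stand-in for the native device dynamics), and the resulting frame is passed to an ordinary DNN block for spatial processing. The resolution, time constant, synthetic event stream, and single dense readout layer below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 16                       # sensor resolution (assumed)
TAU = 20e-3                      # reservoir decay time constant in seconds (assumed)

# Synthetic event stream: (time, row, col) tuples sorted by time.
events = sorted((rng.uniform(0.0, 0.1), rng.integers(H), rng.integers(W))
                for _ in range(500))

state = np.zeros((H, W))         # one leaky-integrator reservoir state per pixel
last_t = np.zeros((H, W))        # time of the last update per pixel
for t, y, x in events:
    state[y, x] *= np.exp(-(t - last_t[y, x]) / TAU)   # decay since the last event
    state[y, x] += 1.0                                 # event-driven bump
    last_t[y, x] = t

t_read = 0.1
frame = state * np.exp(-(t_read - last_t) / TAU)       # read all nodes at frame time

# A single dense layer stands in for the DNN block that does spatial processing.
W_out = rng.normal(0, 0.05, (10, H * W))
logits = W_out @ frame.ravel()
print(int(np.argmax(logits)))                          # predicted class index
```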
  3. To shift the computational burden from real time to offline in delay-critical power systems applications, recent works entertain the idea of using a deep neural network (DNN) to predict the solutions of the AC optimal power flow (AC-OPF) once presented with the load demands. As network topologies may change, training this DNN in a sample-efficient manner becomes a necessity. To improve data efficiency, this work utilizes the fact that OPF data are not simple training labels, but constitute the solutions of a parametric optimization problem. We thus advocate training a sensitivity-informed DNN (SI-DNN) to match not only the OPF optimizers, but also their partial derivatives with respect to the OPF parameters (loads). It is shown that the required Jacobian matrices do exist under mild conditions, and can be readily computed from the related primal/dual solutions. The proposed SI-DNN is compatible with a broad range of OPF solvers, including a non-convex quadratically constrained quadratic program (QCQP), its semidefinite program (SDP) relaxation, and MATPOWER, and SI-DNN can be seamlessly integrated into other learning-to-OPF schemes. Numerical tests on three benchmark power systems corroborate the improved generalization and constraint satisfaction capabilities of the OPF solutions predicted by an SI-DNN over a conventionally trained DNN, especially in low-data setups.
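The sensitivity-informed training objective described above can be sketched as a loss that penalizes both the mismatch to the OPF minimizer and the mismatch to its Jacobian with respect to the loads. In the example below, the tiny network, the synthetic sample, the weighting factor lam, and the finite-difference Jacobian are illustrative assumptions; in the paper the target Jacobians come from the solver's primal/dual solutions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_load, n_out = 3, 2

# One training sample: loads d, OPF minimizer y_star, and its Jacobian dy*/dd
# (in the paper these targets are obtained from the solver's primal/dual solutions).
d = rng.uniform(0.5, 1.5, n_load)
y_star = rng.normal(size=n_out)
J_star = rng.normal(size=(n_out, n_load))

W1, b1 = rng.normal(0, 0.3, (8, n_load)), np.zeros(8)
W2, b2 = rng.normal(0, 0.3, (n_out, 8)), np.zeros(n_out)

def dnn(d):
    """Tiny one-hidden-layer network mapping loads to predicted OPF solutions."""
    return W2 @ np.tanh(W1 @ d + b1) + b2

def dnn_jacobian(d, eps=1e-5):
    """Finite-difference Jacobian of the network output with respect to the loads."""
    cols = [(dnn(d + eps * e) - dnn(d - eps * e)) / (2 * eps) for e in np.eye(len(d))]
    return np.stack(cols, axis=1)

lam = 0.1                                    # weight on the sensitivity term (assumed)
loss = (np.sum((dnn(d) - y_star) ** 2)
        + lam * np.sum((dnn_jacobian(d) - J_star) ** 2))
print(loss)                                  # the quantity an SI-DNN is trained to minimize
```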
  4. We propose a novel graph neural network (GNN) architecture for jointly optimizing user association, base station (BS) beamforming, and reconfigurable intelligent surface (RIS) phase shifts in a multi-RIS-aided multi-cell network. The proposed architecture represents BSs and users as nodes in a bipartite graph, where nodes of the same type share the same neural networks for generating messages and updating their representations, allowing for distributed implementation. In addition, we utilize a composite reflected channel estimation integrated between layers of the GNN structure to significantly reduce the signaling overhead and complexity required for channel estimation in a multi-RIS network. To avoid BS overload, load balancing is regularized in the training of the GNN, and we further develop a collision avoidance algorithm to ensure strict load balancing at every BS. Numerical results show that the proposed GNN architecture is significantly more efficient than existing approaches. The results further demonstrate strong scalability with network size, achieving throughput performance approaching that of a centralized traditional optimization algorithm without requiring estimation of individual RIS-reflected channels and without the need for re-training or fine-tuning.
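To make the bipartite message-passing idea above concrete, the sketch below runs one round of messages between BS nodes and user nodes, with parameters shared across all nodes of the same type, and reads out per-user association scores. The dimensions, the plain linear maps standing in for the shared MLPs, the random channel gains, and the readout rule are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(4)
n_bs, n_ue, dim = 3, 5, 8

h_bs = rng.normal(size=(n_bs, dim))          # BS node embeddings
h_ue = rng.normal(size=(n_ue, dim))          # user node embeddings
gain = rng.rayleigh(size=(n_bs, n_ue))       # edge feature: channel gain per BS-user pair

# Parameters shared across nodes of the same type (gives permutation equivariance).
W_msg_bs, W_msg_ue = rng.normal(0, 0.2, (dim, dim)), rng.normal(0, 0.2, (dim, dim))
W_upd_bs, W_upd_ue = rng.normal(0, 0.2, (dim, 2 * dim)), rng.normal(0, 0.2, (dim, 2 * dim))

# Messages weighted by channel gain, aggregated by a mean over neighbors.
msg_to_bs = (gain @ (h_ue @ W_msg_ue.T)) / n_ue        # (n_bs, dim)
msg_to_ue = (gain.T @ (h_bs @ W_msg_bs.T)) / n_bs      # (n_ue, dim)

# Node updates: concatenate own state with the aggregated message.
h_bs = np.tanh(np.concatenate([h_bs, msg_to_bs], axis=1) @ W_upd_bs.T)
h_ue = np.tanh(np.concatenate([h_ue, msg_to_ue], axis=1) @ W_upd_ue.T)

# Readout: per-(user, BS) association scores; each user picks its best BS.
scores = h_ue @ h_bs.T                        # (n_ue, n_bs)
print(scores.argmax(axis=1))                  # assumed association decision per user
```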
  5. We report a power-efficient analog front-end integrated circuit (IC) for multi-channel, dual-band subcortical recordings. In order to achieve high-resolution multi-channel recordings with low power consumption, we implemented an incremental ΔΣ ADC (IADC) with a dynamic zoom-and-track scheme. This scheme continuously tracks the local field potential (LFP) and adaptively adjusts the input dynamic range (DR) into a zoomed sub-LFP range to resolve tiny action potentials. Thanks to the reduced DR, the oversampling rate of the IADC can be reduced by 64.3% compared to the conventional approach, leading to significant power reduction. In addition, dual-band recording can be easily attained because the scheme continuously tracks LFPs without additional on-chip hardware. A prototype four-channel front-end IC has been fabricated in a 180 nm standard CMOS process. The IADC achieved an 11.3-bit ENOB at 6.8 μW, resulting in the best Walden and SNDR FoMs, 107.9 fJ/c-s and 162.1 dB, respectively, among two comparison groups: the IADCs reported to date in state-of-the-art neural recording front-ends, and recent brain-recording ADCs using zooming or tracking techniques similar to this work. The intrinsic dual-band recording feature reduces the post-processing FPGA resources for subcortical signal band separation by >45.8%. The front-end IC with the zoom-and-track IADC showed an NEF of 5.9 with input-referred noise of 8.2 μVrms, sufficient for subcortical recording. The performance of the whole front-end IC was successfully validated through in vivo animal experiments.
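A behavioral sketch of the zoom-and-track idea above: a slow coarse loop tracks the LFP, so the fine converter only has to cover the small residue that contains the action potentials. The signal model, tracking loop, and resolutions below are illustrative assumptions, not the chip's parameters; the printed spans simply show how much smaller the residue range is than the full input range, which is what allows the lower oversampling rate.

```python
import numpy as np

fs = 25_000                                      # sample rate in Hz (assumed)
t = np.arange(0, 0.2, 1 / fs)
lfp = 2.0e-3 * np.sin(2 * np.pi * 8 * t)         # ~2 mV slow local field potential
spikes = np.zeros_like(t)
spikes[::2500] = 150e-6                          # ~150 uV action-potential-like bumps
signal = lfp + spikes

# Coarse tracking loop: a slowly updated, coarsely quantized copy of the input,
# playing the role of the tracking DAC that follows the LFP.
coarse_lsb = 250e-6
alpha = 0.02                                     # tracking-loop bandwidth knob (assumed)
track = np.zeros_like(signal)
for i in range(1, len(signal)):
    est = track[i - 1] + alpha * (signal[i - 1] - track[i - 1])
    track[i] = np.round(est / coarse_lsb) * coarse_lsb

residue = signal - track                         # what the zoomed fine converter must resolve
fine_lsb = 2e-6
digital = track + np.round(residue / fine_lsb) * fine_lsb

print(f"full-signal span : {np.ptp(signal) * 1e3:.2f} mV")
print(f"residue span     : {np.ptp(residue) * 1e3:.2f} mV")   # much smaller range
print(f"max reconstruction error: {np.max(np.abs(digital - signal)) * 1e6:.2f} uV")
```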