- Award ID(s):
- 1816870
- PAR ID:
- 10194098
- Date Published:
- Journal Name:
- Custom Integrated Circuits Conference
- Page Range / eLocation ID:
- 1 to 4
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Deep learning that utilizes large-scale deep neural networks (DNNs) is effective in automatic high-level feature extraction but also computation and memory intensive. Constructing DNNs using block-circulant matrices can simultaneously achieve hardware acceleration and model compression while maintaining high accuracy. This paper proposes HSIM-DNN, an accurate hardware simulator on the C++ platform, to simulate the exact behavior of DNN hardware implementations and thereby facilitate the block-circulant matrix-based design of DNN training and inference procedures in hardware. Real FPGA implementations validate the simulator with various circulant block sizes and data bit lengths taking into account accuracy, compression ratio and power consumption, which provides excellent insights for hardware design.more » « less
-
o shift the computational burden from real-time to offline in delay-critical power systems applications, recent works entertain the idea of using a deep neural network (DNN) to predict the solutions of the AC optimal power flow (AC-OPF) once presented load demands. As network topologies may change, training this DNN in a sample-efficient manner becomes a necessity. To improve data efficiency, this work utilizes the fact OPF data are not simple training labels, but constitute the solutions of a parametric optimization problem. We thus advocate training a sensitivity-informed DNN (SI-DNN) to match not only the OPF optimizers, but also their partial derivatives with respect to the OPF parameters (loads). It is shown that the required Jacobian matrices do exist under mild conditions, and can be readily computed from the related primal/dual solutions. The proposed SI-DNN is compatible with a broad range of OPF solvers, including a non-convex quadratically constrained quadratic program (QCQP), its semidefinite program (SDP) relaxation, and MATPOWER; while SI-DNN can be seamlessly integrated in other learning-to-OPF schemes. Numerical tests on three benchmark power systems corroborate the advanced generalization and constraint satisfaction capabilities for the OPF solutions predicted by an SI-DNN over a conventionally trained DNN, especially in low-data setups.more » « less
-
null (Ed.)With the growing performance and wide application of deep neural networks (DNNs), recent years have seen enormous efforts on DNN accelerator hardware design for platforms from mobile devices to data centers. The systolic array has been a popular architectural choice for many proposed DNN accelerators with hundreds to thousands of processing elements (PEs) for parallel computing. Systolic array-based DNN accelerators for datacenter applications have high power consumption and nonuniform workload distribution, which makes power delivery network (PDN) design challenging. Server-class multicore processors have benefited from distributed on-chip voltage regulation and heterogeneous voltage regulation (HVR) for improving energy efficiency while guaranteeing power delivery integrity. This paper presents the first work on HVR-based PDN architecture and control for systolic array-based DNN accelerators. We propose to employ a PDN architecture comprising heterogeneous on-chip and off-chip voltage regulators and multiple power domains. By analyzing patterns of typical DNN workloads via a modeling framework, we propose a DNN workload-aware dynamic PDN control policy to maximize system energy efficiency while ensuring power integrity. We demonstrate significant energy efficiency improvements brought by the proposed PDN architecture, dynamic control, and power gating, which lead to a more than five-fold reduction of leakage energy and PDN energy overhead for systolic array DNN accelerators.more » « less
-
We report a power-efficient analog front-end integrated circuit (IC) for multi-channel, dual-band subcortical recordings. In order to achieve high-resolution multi-channel recordings with low power consumption, we implemented an incremental ΔΣ ADC (IADC) with a dynamic zoom-and-track scheme. This scheme continuously tracks local field potential (LFP) and adaptively adjusts the input dynamic range (DR) into a zoomed sub-LFP range to resolve tiny action potentials. Thanks to the reduced DR, the oversampling rate of the IADC can be reduced by 64.3% compared to the conventional approach, leading to significant power reduction. In addition, dual-band recording can be easily attained because the scheme continuously tracks LFPs without additional on-chip hardware. A prototype four-channel front-end IC has been fabricated in 180 nm standard CMOS processes. The IADC achieved 11.3-bit ENOB at 6.8 μW, resulting in the best Walden and SNDR FoMs, 107.9 fJ/c-s and 162.1 dB, respectively, among two different comparison groups: the IADCs reported up to date in the state-of-the-art neural recording front-ends; and the recent brain recording ADCs using similar zooming or tracking techniques to this work. The intrinsic dual-band recording feature reduces the post-processing FPGA resources for subcortical signal band separation by >45.8%. The front-end IC with the zoom-and-track IADC showed an NEF of 5.9 with input-referred noise of 8.2 μVrms, sufficient for subcortical recording. The performance of the whole front-end IC was successfully validated through in vivo animal experiments.more » « less
-
Analog compute‐in‐memory (CIM) systems are promising candidates for deep neural network (DNN) inference acceleration. However, as the use of DNNs expands, protecting user input privacy has become increasingly important. Herein, a potential security vulnerability is identified wherein an adversary can reconstruct the user's private input data from a power side‐channel attack even without knowledge of the stored DNN model. An attack approach using a generative adversarial network is developed to achieve high‐quality data reconstruction from power leakage measurements. The analyses show that the attack methodology is effective in reconstructing user input data from power leakage of the analog CIM accelerator, even at large noise levels and after countermeasures. To demonstrate the efficacy of the proposed approach, an example of CIM inference of U‐Net for brain tumor detection is attacked, and the original magnetic resonance imaging medical images can be successfully reconstructed even at a noise level of 20% standard deviation of the maximum power signal value. This study highlights a potential security vulnerability in emerging analog CIM accelerators and raises awareness of needed safety features to protect user privacy in such systems.