Search for: All records

Creators/Authors contains: "Wu, Yuting"

  1. Abstract

    The constant drive to achieve higher performance in deep neural networks (DNNs) has led to the proliferation of very large models. Model training, however, requires intensive computation time and energy. Memristor‐based compute‐in‐memory (CIM) modules can perform vector‐matrix multiplication (VMM) in place and in parallel, and have shown great promise in DNN inference applications. However, CIM‐based model training faces challenges due to nonlinear weight updates, device variations, and low precision. In this work, a mixed‐precision training scheme is experimentally implemented to mitigate these effects using a bulk‐switching memristor‐based CIM module. Low‐precision CIM modules are used to accelerate the expensive VMM operations, with high‐precision weight updates accumulated in digital units. Memristor devices are reprogrammed only when the accumulated weight update exceeds a pre‐defined threshold. The proposed scheme is implemented with a system‐on‐chip of fully integrated analog CIM modules and digital sub‐systems, showing fast convergence of LeNet training to 97.73% accuracy. The efficacy of training larger models is evaluated using realistic hardware parameters, verifying that CIM modules can enable efficient mixed‐precision DNN training with accuracy comparable to full‐precision software‐trained models. Additionally, models trained on chip are inherently robust to hardware variations, allowing direct mapping to CIM inference chips without additional re‐training.
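
    The threshold‐gated update rule above is compact enough to sketch in code. The following is a minimal NumPy illustration, not the paper's implementation: the threshold value, the number of conductance levels, and the layer shape are all assumptions chosen for demonstration.

        import numpy as np

        THRESHOLD = 0.01   # pre-defined update threshold (assumed value)
        LEVELS = 15        # conductance levels representable by a device (assumed)

        rng = np.random.default_rng(0)
        W_device = rng.uniform(-1, 1, (4, 8))   # low-precision weights in the CIM array
        acc = np.zeros_like(W_device)           # high-precision accumulator (digital)

        def quantize(w, levels=LEVELS):
            """Snap weights to the nearest representable low-precision level."""
            step = 2.0 / (levels - 1)
            return np.clip(np.round(w / step) * step, -1.0, 1.0)

        def train_step(x, grad_out, lr=0.1):
            """Forward VMM runs on the analog array; updates accumulate digitally."""
            global W_device, acc
            y = W_device @ x                      # in-memory vector-matrix multiply
            acc += lr * np.outer(grad_out, x)     # full-precision outer-product update
            mask = np.abs(acc) >= THRESHOLD       # program only cells past the threshold
            W_device = np.where(mask, quantize(W_device + acc), W_device)
            acc[mask] = 0.0                       # reset accumulators for programmed cells
            return y

    Deferring device programming until the accumulated update is large reduces the number of analog write operations, which is how the scheme limits the impact of nonlinear and variable device updates.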

     
  2. Abstract

    Reservoir computing (RC) offers efficient temporal data processing with a low training cost by separating recurrent neural networks into a fixed network with recurrent connections and a trainable linear network. The quality of the fixed network, called the reservoir, is the most important factor determining the performance of the RC system. In this paper, we investigate the influence of a hierarchical reservoir structure on the properties of the reservoir and the performance of the RC system. Analogous to deep neural networks, stacking sub-reservoirs in series is an efficient way to enhance the nonlinearity of the data transformation to high-dimensional space and to expand the diversity of temporal information captured by the reservoir. These deep reservoir systems offer better performance than simply increasing the size of the reservoir or the number of sub-reservoirs. Low-frequency components are mainly captured by the sub-reservoirs in the later stages of the deep reservoir structure, similar to the observation that more abstract information is extracted by the later layers of deep neural networks. When the total size of the reservoir is fixed, the tradeoff between the number of sub-reservoirs and the size of each sub-reservoir needs to be considered carefully, due to the degraded ability of individual sub-reservoirs at small sizes. The improved performance of the deep reservoir structure alleviates the difficulty of implementing RC systems in hardware.
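
    As a concrete picture of the stacked structure, below is a minimal echo-state-style sketch of a deep reservoir in NumPy; the sub-reservoir sizes, spectral radius, and the sine-prediction task are illustrative assumptions, not parameters from the paper.

        import numpy as np

        rng = np.random.default_rng(1)

        def make_reservoir(n_in, n_res, spectral_radius=0.9):
            """Fixed random reservoir, scaled to a target spectral radius."""
            W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
            W = rng.uniform(-0.5, 0.5, (n_res, n_res))
            W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
            return W_in, W

        def run_deep_reservoir(u, layers):
            """Drive sub-reservoirs in series; concatenated states feed the readout."""
            x = [np.zeros(W.shape[0]) for _, W in layers]
            states = []
            for u_t in u:
                inp = u_t
                for i, (W_in, W) in enumerate(layers):
                    x[i] = np.tanh(W_in @ inp + W @ x[i])
                    inp = x[i]              # each sub-reservoir drives the next
                states.append(np.concatenate(x))
            return np.array(states)

        # Two 100-node sub-reservoirs in series; one-step-ahead sine prediction.
        layers = [make_reservoir(1, 100), make_reservoir(100, 100)]
        u = np.sin(np.linspace(0, 20, 500))[:, None]
        S = run_deep_reservoir(u, layers)                       # (500, 200) states
        W_out, *_ = np.linalg.lstsq(S[:-1], u[1:], rcond=None)  # only readout trained

    Only W_out is trained, by linear regression; both sub-reservoirs stay fixed, which is the low training cost that RC trades against reservoir quality.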
  3. Analog compute‐in‐memory (CIM) systems are promising candidates for deep neural network (DNN) inference acceleration. However, as the use of DNNs expands, protecting user input privacy has become increasingly important. Herein, a potential security vulnerability is identified wherein an adversary can reconstruct the user's private input data from a power side‐channel attack, even without knowledge of the stored DNN model. An attack approach using a generative adversarial network (GAN) is developed to achieve high‐quality data reconstruction from power‐leakage measurements. The analyses show that the attack methodology is effective in reconstructing user input data from the power leakage of the analog CIM accelerator, even at large noise levels and after countermeasures are applied. To demonstrate the efficacy of the proposed approach, an example of CIM inference of U‐Net for brain tumor detection is attacked, and the original magnetic resonance imaging medical images can be successfully reconstructed even at a noise level with a standard deviation of 20% of the maximum power signal value. This study highlights a potential security vulnerability in emerging analog CIM accelerators and raises awareness of the security features needed to protect user privacy in such systems.
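
    The attack setting can be illustrated with a toy model. In the sketch below, a plain least-squares decoder stands in for the paper's generative adversarial network, and the leakage model (per-row power proportional to analog MAC activity), the noise level, and all shapes are assumptions for illustration only.

        import numpy as np

        rng = np.random.default_rng(2)
        W_model = rng.uniform(-1, 1, (64, 256))   # DNN layer in CIM, unknown to attacker

        def power_trace(x, noise=0.2):
            """Assumed leakage model: per-row power scales with analog MAC activity."""
            p = np.abs(W_model) @ np.abs(x)
            return p + noise * p.max() * rng.standard_normal(p.shape)

        # Profiling phase: collect (trace, input) pairs by querying the accelerator.
        X = rng.uniform(0, 1, (5000, 256))
        P = np.array([power_trace(x) for x in X])

        # Fit a decoder from traces back to inputs (a GAN in the paper, linear here).
        D, *_ = np.linalg.lstsq(P, X, rcond=None)

        # Attack phase: reconstruct a victim input from its leaked trace alone.
        x_true = rng.uniform(0, 1, 256)
        x_rec = power_trace(x_true) @ D

    Note that the adversary never reads W_model directly; everything is inferred from observed power, which is what makes the leakage a side channel.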

     
  4. Viral infections are a major global health issue, but no current method allows rapid, direct, and ultrasensitive quantification of intact viruses with the ability to inform infectivity, leading to misdiagnoses and the spread of the viruses. Here, we report a method for the direct detection and differentiation of infectious from noninfectious human adenovirus and SARS-CoV-2, as well as from other virus types, without any sample pretreatment. DNA aptamers are selected from a DNA library to bind intact infectious, but not noninfectious, virus and are then incorporated into a solid-state nanopore, which allows strong confinement of the virus to enhance sensitivity down to 1 pfu/ml for human adenovirus and 1 × 10⁴ copies/ml for SARS-CoV-2. Applications of the aptamer-nanopore sensors in different types of water samples, saliva, and serum are demonstrated for both enveloped and nonenveloped viruses, making the sensor generally applicable for detecting these and other emerging viruses of environmental and public health concern.
  5. Abstract

    Memristive devices have demonstrated rich switching behaviors that closely resemble synaptic functions and provide a building block for constructing efficient neuromorphic systems. It has been demonstrated that resistive switching effects are controlled not only by the external field, but also by the dynamics of various internal state variables that facilitate the ionic processes. The internal temperature, for example, acts as a second state variable that regulates ion motion and provides the internal timing mechanism for the native implementation of timing‐ and rate‐based learning rules such as spike‐timing‐dependent plasticity (STDP). In this work, it is shown that the second state variable in a Ta₂O₅‐based memristor, its internal temperature, can be systematically engineered by adjusting the material properties and device structure, leading to tunable STDP characteristics with different time constants. When combined with an artificial post‐synaptic neuron, the second‐order memristor synapses can spontaneously capture temporal correlations in the input streaming events.
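
    The timing role of the internal temperature can be captured in a few lines. The sketch below is a deliberate simplification, not the fitted device model: each spike is assumed to leave an exponentially decaying heat trace, and a partner spike arriving within that trace produces a weight change, so the width of the resulting STDP window tracks the thermal time constant.

        import numpy as np

        TAU_T = 5e-3   # thermal decay time constant in seconds (assumed)
        A = 0.1        # peak weight change for coincident spikes (assumed)

        def delta_w(t_pre, t_post):
            """STDP window emerging from the thermal second state variable."""
            dt = t_post - t_pre
            if dt >= 0:
                # post spike arrives while heat from the pre spike lingers -> potentiate
                return A * np.exp(-dt / TAU_T)
            # pre spike arrives while heat from the post spike lingers -> depress
            return -A * np.exp(dt / TAU_T)

        # Engineering a longer thermal time constant widens the learning window.
        for dt in (1e-3, 2e-3, 5e-3, -2e-3):
            print(f"dt = {dt:+.0e} s, dw = {delta_w(0.0, dt):+.4f}")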

     