


Creators/Authors contains: "Li, Shiyu"


  1. Free, publicly-accessible full text available August 1, 2024
  2. Free, publicly-accessible full text available March 1, 2024
  3. Processing-in-memory (PIM)-based architectures show great potential for processing several emerging artificial intelligence workloads, including vision and language models. Cross-layer optimizations could bridge the gap between computing density and the available resources by reducing the computation and memory cost of the model and improving the model's robustness against non-ideal hardware effects. We first introduce several hardware-aware training methods that improve the model's robustness to the non-ideal effects of PIM devices, including stuck-at faults, process variation, and thermal noise. We then demonstrate a software/hardware (SW/HW) co-design methodology for efficiently processing state-of-the-art attention-based models on PIM-based architectures, combining sparsity exploration for the attention-based model with circuit-architecture co-design that supports the sparse processing.
  4. Metalenses (flat lenses made with optical metasurfaces) promise to enable thinner, cheaper, and better imaging systems. Achieving a sufficient angular field of view (FOV) is crucial toward that goal and requires a tailored incident-angle-dependent response. Here, we show that there is an intrinsic trade-off between achieving a desired broad-angle response and reducing the thickness of the device. Like the memory effect in disordered media, this thickness bound originates from the Fourier transform duality between space and angle. One can write down the transmission matrix describing the desired angle-dependent response, convert it to the spatial basis where its degree of nonlocality can be quantified through a lateral spreading, and determine the minimal device thickness based on such a required lateral spreading. This approach is general. When applied to wide-FOV lenses, it predicts the minimal thickness as a function of the FOV, lens diameter, and numerical aperture. The bound is tight, as some inverse-designed multi-layer metasurfaces can approach the minimal thickness we found. This work offers guidance for the design of nonlocal metasurfaces, proposes a new framework for establishing bounds, and reveals the relation between angular diversity and spatial footprint in multi-channel systems.
  5. The rapidly advancing capabilities in nanophotonic design are enabling complex functionalities limited mainly by physical bounds. The efficiency of transmission is a major consideration, but its ultimate limit remains unknown for most systems. This study introduces a matrix formalism that puts a fundamental bound on the channel-averaged transmission efficiency of any passive multi-channel optical system based only on energy conservation and the desired functionality, independent of the interior structure and material composition. Applying this formalism to diffraction-limited nonlocal metalenses with a wide field of view shows that the transmission efficiency must decrease with the numerical aperture for the commonly adopted designs with equal entrance and output aperture diameters. It also shows that reducing the size of the entrance aperture can raise the efficiency bound. This study reveals a fundamental limit on the transmission efficiency and provides guidance for the design of high-efficiency multi-channel optical systems.
  6. Accurate and efficient on-chip power modeling is crucial to runtime power, energy, and voltage management. Such power monitoring can be achieved by designing and integrating on-chip power meters (OPMs) into the target design. In this work, we propose a new method named DEEP to automatically develop extremely efficient OPM solutions for a given design. DEEP selects OPM inputs from all individual bits in RTL signals. Compared with the signal-level selection used in prior works, such bit-level selection provides an unprecedentedly large number of input candidates and supports lower hardware cost. In addition, DEEP proposes a powerful two-step OPM input selection method, and it supports reporting both total power and the power of major design components. Experiments on a commercial microprocessor demonstrate that DEEP's OPM solution achieves a correlation of R > 0.97 in per-cycle power prediction with an unprecedentedly low hardware area overhead, i.e., < 0.1% of the microprocessor layout. This reduces the OPM hardware cost by 4-6× compared with the state-of-the-art solution.
  7. We show that any aberration-free wide-field-of-view lens system must have a minimal thickness—depending on the field of view, lens diameter, and numerical aperture—that originates from the Fourier transform relation between space and angle. 
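
To make the bit-level input selection in the DEEP abstract (item 6) more concrete, the following is a minimal sketch of the general idea only: pick a small subset of individual bits whose per-cycle toggling predicts power, fit a linear model on them, and report the correlation R. It is not the authors' two-step selection method, which the abstract does not describe. The ranking by absolute correlation, the function names (`select_bits`, `fit_power_model`, `predict`), and the synthetic toggle and power data are all assumptions made for illustration.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def select_bits(toggles, power, k):
    """Hypothetical selection step: keep the k bits whose toggling
    correlates most strongly (in absolute value) with per-cycle power."""
    centered = toggles - toggles.mean(axis=0)
    p = power - power.mean()
    denom = np.sqrt((centered ** 2).sum(axis=0) * (p @ p))
    # Bits that never change have zero variance; give them a score of 0.
    scores = np.abs(centered.T @ p) / np.where(denom > 0, denom, np.inf)
    return np.argsort(scores)[::-1][:k]


def fit_power_model(toggles, power, bits):
    """Least-squares linear model: power ~ weights . selected_bits + bias."""
    X = np.column_stack([toggles[:, bits], np.ones(len(power))])
    coef, *_ = np.linalg.lstsq(X, power, rcond=None)
    return coef


def predict(toggles, bits, coef):
    """Per-cycle power estimate from the selected bits."""
    return toggles[:, bits] @ coef[:-1] + coef[-1]


# Synthetic stand-in for RTL simulation traces (assumed, not real data):
# each row is one cycle, each column is one candidate bit (1 = toggled).
rng = np.random.default_rng(0)
n_cycles, n_bits, k = 2000, 200, 8
toggles = rng.integers(0, 2, size=(n_cycles, n_bits)).astype(float)

# Assume only a handful of bits actually drive power, plus small noise.
true_bits = rng.choice(n_bits, size=k, replace=False)
true_w = rng.uniform(1.0, 3.0, size=k)
power = toggles[:, true_bits] @ true_w + 0.5 + rng.normal(0.0, 0.1, n_cycles)

bits = select_bits(toggles, power, k)
coef = fit_power_model(toggles, power, bits)
r = pearson_r(predict(toggles, bits, coef), power)
print(f"selected {len(bits)} of {n_bits} bits, R = {r:.3f}")
```

In a hardware setting, each selected bit would cost roughly one input and one weight in the meter, which is why a smaller selected set translates into lower area overhead. The synthetic data here is far easier than a real microprocessor trace, so the R value this sketch prints says nothing about the results reported in the abstract.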