


Search for: All records

Creators/Authors contains: "Zhao, Hui"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


  1. As the Next-Generation Sequencing (NGS) techniques need to process enormous amounts of data, cost-efficient and high-throughput computational analysis is essential in genomics study. Conventional computing platforms face great challenges to meet these demands due to their limited processing speed and scalability. Hardware accelerators, such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application-Specific Integrated Circuits (ASICs), offer transformative solutions to these computational challenges. This paper provides a state-of-the-art review of the roles of hardware accelerators in genomic analysis. We performed a comprehensive and in-depth analysis of cutting-edge genomics hardware accelerators, such as GPUs, FPGAs, and ASICs, in the context of the specific algorithms they aim to enhance. Besides reviewing opportunities in hardware genome acceleration, we also provide insights into the challenges regarding processing speed, cost efficiency, and scalability.
    Free, publicly-accessible full text available December 16, 2025
  2. In recent years, Network-on-Chip (NoC) has emerged as a promising solution for addressing a critical performance bottleneck encountered in designing large-scale multi-core systems, i.e., data communication. With advancements in chip manufacturing technologies and the increasing complexity of system designs, the task of designing the communication subsystems has become increasingly challenging. The emergence of hardware accelerators, such as GPUs, FPGAs and ASICs, together with heterogeneous system integration of the CPUs and the accelerators creates new challenges in NoC design. Conventional NoC architectures developed for CPU-based multi-core systems are not able to satisfy the traffic demands of heterogeneous systems. In recent years, numerous research efforts have been dedicated to exploring the various aspects of NoC design in hardware accelerators and heterogeneous systems. However, there is a need for a comprehensive understanding of the current state-of-the-art research in this emerging research area. This paper aims to provide a summary of research work conducted in heterogeneous NoC design. Through this survey, we aim to present a comprehensive overview of the current related research, highlighting key findings, challenges, and future directions in this field.
    Free, publicly-accessible full text available December 16, 2025
  3. Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer. However, CPU and GPU applications generally have very different traffic patterns. Without effective management of the network resources, some chiplets can suffer significant performance degradation because the network bandwidth is taken away by communication-intensive applications. Therefore, techniques need to be developed to effectively manage the shared network resources. In a chiplet-based system, resource management needs to not only react in real time but also be cost-efficient. In this work, we propose a reconfigurable network architecture that leverages a Kalman filter to make accurate predictions of the network resources needed by the applications and then adaptively changes the resource allocation. Using our design, the network bandwidth can be fairly allocated to avoid starvation or performance degradation. Our evaluation results show that the proposed reconfigurable interconnection network can dynamically react to the changes in traffic demand of the chiplets and improve the system performance with low cost and design complexity.
    Free, publicly-accessible full text available June 12, 2025
  4. Free, publicly-accessible full text available August 20, 2025
  5. Free, publicly-accessible full text available June 13, 2025
  6. The moiré potential in rotationally misfit two-dimensional (2D) heterostructures has been used to build artificial exciton and electron lattices, which have become platforms for realizing exotic electronic phases. Here, we demonstrate a different approach to create a superlattice potential in 2D crystals by using the near field of an array of polar molecules. A bilayer of titanyl phthalocyanine (TiOPc), consisting of alternating out-of-plane dipoles, is deposited on monolayer MoS2. Time-resolved two-photon photoemission spectroscopy reveals a pair of interlayer exciton states with an energy difference of ∼0.1 eV, which is consistent with the electrostatic potential modulation induced by the TiOPc bilayer as determined by density functional theory calculations. Because the symmetry and the period of this potential superlattice can be changed readily by using molecules of different shapes and sizes, molecule/2D heterostructures can be promising platforms for designing artificial exciton and electron lattices. 
    Free, publicly-accessible full text available August 21, 2025
  7. The effect of the energy valley on interlayer charge transfer in transition metal dichalcogenide (TMD) heterostructures is studied by transient absorption spectroscopy and density functional theory. First-principles calculations confirm that the Λmin valley in the conduction band of few-layer WSe2 evolves from above its K valley in the monolayer (1L) to below it in 4L. Heterostructure samples of nL-WSe2/1L-MoS2, where n = 1, 2, 3, and 4, are obtained by mechanical exfoliation and dry transfer. Photoluminescence spectroscopy reveals a thickness-dependent WSe2 band structure and efficient interlayer charge transfer. Transient absorption measurements show that the electron transfer time from the Λmin valley of 4L WSe2 to the K valley of MoS2 is on the order of 30 ps. This process is much slower than the K-K charge transfer in 1L/1L TMD heterostructures. The momentum-indirect interlayer excitons formed after charge transfer have lifetimes >1 ns.