skip to main content


Search for: All records

Creators/Authors contains: "Jiang, Lei"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Trapped-Ion (TI) technology offers potential breakthroughs for Noisy Intermediate Scale Quantum (NISQ) computing. TI qubits offer extended coherence times and high gate fidelity, making them appealing for large-scale NISQ computers. Constructing such computers demands a distributed architecture connecting Quantum Charge Coupled Devices (QCCDs) via quantum matter-links and photonic switches. However, current distributed TI NISQ computers face hardware and system challenges. Entangling qubits across a photonic switch introduces significant latency, while existing compilers generate suboptimal mappings due to their unawareness of the interconnection topology. In this paper, we introduce TITAN, a large-scale distributed TI NISQ computer, which employs an innovative photonic interconnection design to reduce entanglement latency and an advanced partitioning and mapping algorithm to optimize matter-link communications. Our evaluations show that TITAN greatly enhances quantum application performance by 56.6% and fidelity by 19.7% compared to existing systems. 
    more » « less
    Free, publicly-accessible full text available May 1, 2025
  2. This paper presents \textit{OFHE}, an electro-optical accelerator designed to process Discretized TFHE (DTFHE) operations, which encrypt multi-bit messages and support homomorphic multiplications, lookup table operations and full-domain functional bootstrappings. While DTFHE is more efficient and versatile than other fully homomorphic encryption schemes, it requires 32-, 64-, and 128-bit polynomial multiplications, which can be time-consuming. Existing TFHE accelerators are not easily upgradable to support DTFHE operations due to limited datapaths, a lack of datapath bit-width reconfigurability, and power inefficiencies when processing FFT and inverse FFT (IFFT) kernels. Compared to prior TFHE accelerators, OFHE addresses these challenges by improving the DTFHE operation latency by 8.7\%, the DTFHE operation throughput by $57\%$, and the DTFHE operation throughput per Watt by $94\%$. 
    more » « less
    Free, publicly-accessible full text available May 1, 2025
  3. The carbon footprint associated with large language models (LLMs) is a significant concern, encompassing emissions from their training, inference, experimentation, and storage processes, including operational and embodied carbon emissions. An essential aspect is accurately estimating the carbon impact of emerging LLMs even before their training, which heavily relies on GPU usage. Existing studies have reported the carbon footprint of LLM training, but only one tool, mlco2, can predict the carbon footprint of new neural networks prior to physical training. However, mlco2 has several serious limitations. It cannot extend its estimation to dense or mixture-of-experts (MoE) LLMs, disregards critical architectural parameters, focuses solely on GPUs, and cannot model embodied carbon footprints. Addressing these gaps, we introduce \textit{\carb}, an end-to-end carbon footprint projection model designed for both dense and MoE LLMs. Compared to mlco2, \carb~significantly enhances the accuracy of carbon footprint estimations for various LLMs. The source code is released at \url{https://github.com/SotaroKaneda/MLCarbon}. 
    more » « less
    Free, publicly-accessible full text available April 1, 2025
  4. The evolution of oxygen cycles on Earth’s surface has been regulated by the balance between molecular oxygen production and consumption. The Neoproterozoic–Paleozoic transition likely marks the second rise in atmospheric and oceanic oxygen levels, widely attributed to enhanced burial of organic carbon. However, it remains disputed how marine organic carbon production and burial respond to global environmental changes and whether these feedbacks trigger global oxygenation during this interval. Here, we report a large lithium isotopic and elemental dataset from marine mudstones spanning the upper Neoproterozoic to middle Cambrian [~660 million years ago (Ma) to 500 Ma]. These data indicate a dramatic increase in continental clay formation after ~525 Ma, likely linked to secular changes in global climate and compositions of the continental crust. Using a global biogeochemical model, we suggest that intensified continental weathering and clay delivery to the oceans could have notably increased the burial efficiency of organic carbon and facilitated greater oxygen accumulation in the earliest Paleozoic oceans. 
    more » « less
    Free, publicly-accessible full text available March 29, 2025
  5. We propose a circuit-level backdoor attack, QTrojan, against Quantum Neural Networks (QNNs) in this paper. QTrojan is implemented by a few quantum gates inserted into the variational quantum circuit of the victim QNN. QTrojan is much stealthier than a prior Data-Poisoning-based Backdoor Attack (DPBA) since it does not embed any trigger in the inputs of the victim QNN or require access to original training datasets. Compared to a DPBA, QTrojan improves the clean data accuracy by 21% and the attack success rate by 19.9%. 
    more » « less
    Free, publicly-accessible full text available June 4, 2024
  6. The widespread use of machine learning is changing our daily lives. Unfortunately, clients are often concerned about the privacy of their data when using machine learning-based applications. To address these concerns, the development of privacy-preserving machine learning (PPML) is essential. One promising approach is the use of fully homomorphic encryption (FHE) based PPML, which enables services to be performed on encrypted data without decryption. Although the speed of computationally expensive FHE operations can be significantly boosted by prior ASIC-based FHE accelerators, the performance of key-switching, the dominate primitive in various FHE operations, is seriously limited by their small bit-width datapaths and frequent matrix transpositions. In this paper, we present an electro-optical (EO) PPML accelerator, PriML, to accelerate FHE operations. Its 512-bit datapath supporting 510-bit residues greatly reduces the key-switching cost. We also create an in-scratchpad-memory transpose unit to fast transpose matrices. Compared to prior PPML accelerators, on average, PriML reduces the latency of various machine learning applications by > 94.4% and the energy consumption by > 95%. 
    more » « less
  7. Abstract

    Predicting protein localization and understanding its mechanisms are critical in biology and pathology. In this context, we propose a new web application of MULocDeep with improved performance, result interpretation, and visualization. By transferring the original model into species-specific models, MULocDeep achieved competitive prediction performance at the subcellular level against other state-of-the-art methods. It uniquely provides a comprehensive localization prediction at the suborganellar level. Besides prediction, our web service quantifies the contribution of single amino acids to localization for individual proteins; for a group of proteins, common motifs or potential targeting-related regions can be derived. Furthermore, the visualizations of targeting mechanism analyses can be downloaded for publication-ready figures. The MULocDeep web service is available at https://www.mu-loc.org/.

     
    more » « less
  8. Fully Homomorphic Encryption over the Torus (TFHE) allows arbitrary computations to happen directly on ciphertexts using homomorphic logic gates. However, each TFHE gate on state-of-the-art hardware platforms such as GPUs and FPGAs is extremely slow (> 0.2ms). Moreover, even the latest FPGA-based TFHE accelerator cannot achieve high energy efficiency, since it frequently invokes expensive double-precision floating point FFT and IFFT kernels. In this paper, we propose a fast and energy-efficient accelerator, MATCHA, to process TFHE gates. MATCHA supports aggressive bootstrapping key unrolling to accelerate TFHE gates without decryption errors by approximate multiplication-less integer FFTs and IFFTs, and a pipelined datapath. Compared to prior accelerators, MATCHA improves the TFHE gate processing throughput by 2.3x, and the throughput per Watt by 6.3x. 
    more » « less
  9. Abstract

    Transdermal drug delivery provides convenient and pain-free self-administration for personalized therapy. However, challenges remain in treating acute diseases mainly due to their inability to timely administrate therapeutics and precisely regulate pharmacokinetics within a short time window. Here we report the development of active acoustic metamaterials-driven transdermal drug delivery for rapid and on-demand acute disease management. Through the integration of active acoustic metamaterials, a compact therapeutic patch is integrated for penetration of skin stratum corneum and active percutaneous transport of therapeutics with precise control of dose and rate over time. Moreover, the patch device quantitatively regulates the dosage and release kinetics of therapeutics and achieves better delivery performance in vivo than through subcutaneous injection. As a proof-of-concept application, we show our method can reverse life-threatening acute allergic reactions in a female mouse model of anaphylaxis via a multi-burst delivery of epinephrine, showing better efficacy than a fixed dosage injection of epinephrine, which is the current gold standard ‘self-injectable epinephrine’ strategy. This innovative method may provide a promising means to manage acute disease for personalized medicine.

     
    more » « less