
Search for: All records

Creators/Authors contains: "Li, Ang"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


  1. The proliferation of the Internet of Things has promoted a wide variety of emerging applications that require compact, lightweight, and low-cost optical spectrometers. While substantial progress has been made in the miniaturization of spectrometers, most efforts focus on the technical side and tend to feature a low technology readiness level for manufacturability. More importantly, despite the advances in miniaturized spectrometers, their performance has seldom been connected to the metrics of real-life applications, even though this connection is highly important. This review paper shows the market trend for chip-scale spectrometers and analyzes the key metrics that are required to adopt miniaturized spectrometers in real-life applications. Recent progress addressing the challenges of spectrometer miniaturization is summarized, paying special attention to the CMOS-compatible fabrication platform, which shows a clear pathway to mass production. Insights for ways forward are also presented.
    Free, publicly-accessible full text available December 1, 2023
  2. Graph Neural Networks (GNNs) have drawn tremendous attention due to their unique capability to extend Machine Learning (ML) approaches to applications broadly defined as having unstructured data, especially graphs. Compared with other ML modalities, the acceleration of GNNs is more challenging due to the irregularity and heterogeneity derived from graph topologies. Existing efforts, however, have focused mainly on handling graphs’ irregularity and have not studied their heterogeneity. To this end, we propose H-GCN, a PL (Programmable Logic) and AIE (AI Engine) based hybrid accelerator that leverages the emerging heterogeneity of Xilinx Versal Adaptive Compute Acceleration Platforms (ACAPs) to achieve high-performance GNN inference. In particular, H-GCN partitions each graph into three subgraphs based on its inherent heterogeneity, and processes them using PL and AIE, respectively. To further improve performance, we explore the sparsity support of AIE and develop an efficient density-aware method to automatically map tiles of sparse matrix-matrix multiplication (SpMM) onto the systolic tensor array. Compared with state-of-the-art GCN accelerators, H-GCN achieves, on average, speedups of 1.1∼2.3×.
    Free, publicly-accessible full text available August 29, 2023
  3. Passive RFID technology is widely used in user authentication and access control. We propose RF-Rhythm, a secure and usable two-factor RFID authentication system with strong resilience to lost/stolen/cloned RFID cards. In RF-Rhythm, each legitimate user performs a sequence of taps on his/her RFID card according to a self-chosen secret melody. Such rhythmic taps can induce phase changes in the backscattered signals, which the RFID reader can detect to recover the user’s tapping rhythm. In addition to verifying the RFID card’s identification information as usual, the backend server compares the extracted tapping rhythm with the one it acquired in the user enrollment phase. The user passes authentication checks if and only if both verifications succeed. We also propose a novel phase-hopping protocol in which the RFID reader emits a Continuous Wave (CW) with random phases for extracting the user’s secret tapping rhythm. Our protocol can prevent a capable adversary from extracting and then replaying a legitimate tapping rhythm from sniffed RFID signals. Comprehensive user experiments confirm the high security and usability of RF-Rhythm with false-positive and false-negative rates close to zero.
    Free, publicly-accessible full text available September 1, 2023
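The two-factor decision described in the RF-Rhythm abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the rhythm is represented as a list of inter-tap intervals in seconds, matching is done by comparing tempo-normalized intervals against a fixed tolerance, and the names `rhythm_matches`, `authenticate`, and `enrolled_db` are hypothetical. The abstract does not state how the actual system extracts or compares rhythms from backscattered phase data.

```python
from typing import Dict, Sequence


def rhythm_matches(enrolled: Sequence[float],
                   observed: Sequence[float],
                   tolerance: float = 0.05) -> bool:
    """Compare two tapping rhythms given as inter-tap intervals (seconds).

    Assumption: intervals are normalized by their sum so that the same
    rhythm tapped at a different tempo still matches.
    """
    if not enrolled or len(enrolled) != len(observed):
        return False
    total_enrolled = sum(enrolled)
    total_observed = sum(observed)
    if total_enrolled <= 0 or total_observed <= 0:
        return False
    return all(
        abs(e / total_enrolled - o / total_observed) <= tolerance
        for e, o in zip(enrolled, observed)
    )


def authenticate(card_id: str,
                 observed: Sequence[float],
                 enrolled_db: Dict[str, Sequence[float]],
                 tolerance: float = 0.05) -> bool:
    """Pass only if the card ID is known AND the rhythm matches."""
    enrolled = enrolled_db.get(card_id)
    if enrolled is None:          # factor 1 fails: unknown card
        return False
    return rhythm_matches(enrolled, observed, tolerance)  # factor 2


if __name__ == "__main__":
    db = {"card-1": [0.2, 0.4, 0.2, 0.8]}
    print(authenticate("card-1", [0.25, 0.5, 0.25, 1.0], db))
    print(authenticate("card-1", [0.8, 0.2, 0.4, 0.2], db))
```

In this sketch a cloned card alone is not enough, because the second check still requires the secret rhythm.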
  4. As parallel computers continue to grow to exascale, the amount of data that needs to be saved or transmitted is exploding. To this end, many previous works have studied using error-bounded lossy compressors to reduce the data size and improve the I/O performance. However, little work has been done on effectively offloading lossy compression onto FPGA-based SmartNICs to reduce the compression overhead. In this paper, we propose a hardware-algorithm co-design for an efficient and adaptive lossy compressor for scientific data on FPGAs (called CEAZ), which is the first lossy compressor that can achieve high compression ratios and throughputs simultaneously. Specifically, we propose an efficient Huffman coding approach that can adaptively update Huffman codewords online based on codewords generated offline from a variety of representative scientific datasets. Moreover, we derive a theoretical analysis to support precise control of the compression ratio under an error-bounded compression mode, enabling accurate offline Huffman codeword generation. This also helps us create a fixed-ratio compression mode for consistent throughput. In addition, we develop an efficient compression pipeline by adapting cuSZ's dual-quantization algorithm to our hardware use cases. Finally, we evaluate CEAZ on five real-world datasets with both a single FPGA board and 128 nodes (to accelerate parallel I/O). Experiments show that CEAZ outperforms the second-best FPGA-based lossy compressor by 2× in throughput and 9.6× in compression ratio. It also improves MPI_File_write and MPI_Gather throughputs by up to 28.1× and 36.9×, respectively.
    Free, publicly-accessible full text available June 27, 2023
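The idea of error-bounded quantization followed by prediction, which the CEAZ abstract refers to as dual-quantization, can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions, not the CEAZ hardware pipeline: it uses a previous-value predictor on a 1-D list, omits the Huffman coding stage, and the function names `compress` and `decompress` are hypothetical.

```python
from typing import List


def compress(data: List[float], error_bound: float) -> List[int]:
    """Error-bounded quantization followed by delta prediction.

    Step 1 (prequantization): map each value to an integer multiple of
    2 * error_bound, so the rounding error is at most error_bound.
    Step 2 (postquantization): store the difference between each integer
    and its predecessor; smooth data yields many small, repeated codes
    that an entropy coder such as Huffman could then compress well.
    """
    step = 2.0 * error_bound
    prequantized = [round(x / step) for x in data]
    deltas = []
    previous = 0
    for q in prequantized:
        deltas.append(q - previous)
        previous = q
    return deltas


def decompress(deltas: List[int], error_bound: float) -> List[float]:
    """Invert the delta prediction and rescale to floating point."""
    step = 2.0 * error_bound
    reconstructed = []
    accumulator = 0
    for d in deltas:
        accumulator += d
        reconstructed.append(accumulator * step)
    return reconstructed


if __name__ == "__main__":
    values = [1.000, 1.013, 1.049, 0.981, 3.507]
    bound = 0.01
    codes = compress(values, bound)
    restored = decompress(codes, bound)
    worst = max(abs(a - b) for a, b in zip(values, restored))
    print(codes, worst <= bound + 1e-9)
```

Because the prediction operates on already-quantized integers, each element can be processed without waiting for the reconstruction of its neighbors, which is what makes this style of scheme attractive for parallel hardware.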
  5. Free, publicly-accessible full text available August 1, 2023
  6. Continuous location authentication (CLA) seeks to continuously and automatically verify the physical presence of legitimate users in a protected indoor area. CLA can play an important role in contexts where access to electrical or physical resources must be limited to physically present legitimate users. In this paper, we present WearRF-CLA, a novel CLA scheme built upon increasingly popular wrist wearables and UHF RFID systems. WearRF-CLA exploits the observation that human daily routines in a protected indoor area comprise a sequence of human states (e.g., walking and sitting) that follow predictable state transitions. Each legitimate WearRF-CLA user registers his/her RFID tag and wrist wearable during system enrollment. After the user enters a protected area, WearRF-CLA continuously collects and processes the gyroscope data of the wrist wearable and the phase data of the RFID tag signals to verify three factors that determine the user's physical presence/absence without explicit user involvement: (1) the tag ID, as in a traditional RFID authentication system, (2) the validity of the human-state chain, and (3) the continuous coexistence of the paired wrist wearable and RFID tag with the user. The user passes CLA if and only if all three factors can be validated. Extensive user experiments on commodity smartwatches and UHF RFID devices confirm the very high security and low authentication latency of WearRF-CLA.
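The three-factor rule in the WearRF-CLA abstract can be sketched as a simple conjunction. The transition table, the state names, and the functions `chain_is_valid` and `passes_cla` below are hypothetical placeholders; the abstract does not specify which transitions the system treats as valid or how coexistence is measured from gyroscope and phase data.

```python
from typing import Dict, Iterable, Sequence, Set

# Assumed, illustrative transition table: which human state may follow which.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "walking": {"walking", "standing"},
    "standing": {"standing", "walking", "sitting"},
    "sitting": {"sitting", "standing"},
}


def chain_is_valid(states: Sequence[str]) -> bool:
    """Check that every consecutive pair of states is an allowed transition."""
    if not states:
        return False
    return all(
        nxt in ALLOWED_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )


def passes_cla(tag_id: str,
               registered_tags: Iterable[str],
               states: Sequence[str],
               coexistence_checks: Sequence[bool]) -> bool:
    """Pass only if all three factors hold.

    (1) the tag ID is registered,
    (2) the observed human-state chain is valid,
    (3) wearable and tag were observed together in every check window.
    """
    factor_id = tag_id in set(registered_tags)
    factor_chain = chain_is_valid(states)
    factor_coexist = bool(coexistence_checks) and all(coexistence_checks)
    return factor_id and factor_chain and factor_coexist


if __name__ == "__main__":
    print(passes_cla("tag-7", ["tag-7"],
                     ["walking", "standing", "sitting"], [True, True]))
    print(passes_cla("tag-7", ["tag-7"],
                     ["walking", "sitting"], [True, True]))
```

In this table a direct jump from walking to sitting is rejected, which stands in for the abstract's point that implausible state chains should fail authentication.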
  7. Free, publicly-accessible full text available June 14, 2023
  8. Epitranscriptomic RNA modifications can regulate fundamental biological processes, but we lack approaches to map modification sites and probe writer enzymes. Here we present a chemoproteomic strategy to characterize RNA 5-methylcytidine (m5C) dioxygenase enzymes in their native context based upon metabolic labeling and activity-based crosslinking with 5-ethynylcytidine (5-EC). We profile m5C dioxygenases in human cells including ALKBH1 and TET2 and show that ALKBH1 is the major hm5C- and f5C-forming enzyme in RNA. Further, we map ALKBH1 modification sites transcriptome-wide using 5-EC-iCLIP and ARP-based sequencing to identify ALKBH1-dependent m5C oxidation in a variety of tRNAs and mRNAs and analyze ALKBH1 substrate specificity in vitro. We also apply targeted pyridine borane-mediated sequencing to measure f5C sites on select tRNA. Finally, we show that f5C at the wobble position of tRNA-Leu-CAA plays a role in decoding Leu codons under stress. Our work provides powerful chemical approaches for studying RNA m5C dioxygenases and mapping oxidative m5C modifications and reveals the existence of novel epitranscriptomic pathways for regulating RNA function.

  9. In the past decade, Deep Neural Networks (DNNs), e.g., Convolutional Neural Networks, achieved human-level performance in vision tasks such as object classification and detection. However, DNNs are known to be computationally expensive and thus hard to deploy in real-time and edge applications. Many previous works have focused on DNN model compression to obtain smaller parameter sizes and, consequently, lower computational cost. Such methods, however, often introduce noticeable accuracy degradation. In this work, we optimize a state-of-the-art DNN-based video detection framework, Deep Feature Flow (DFF), from the cloud end using three proposed ideas. First, we propose Asynchronous DFF (ADFF) to asynchronously execute the neural networks. Second, we propose a Video-based Dynamic Scheduling (VDS) method that decides the detection frequency based on the magnitude of movement between video frames. Last, we propose Spatial Sparsity Inference (SSI), which performs the inference on only part of the video frame and thus reduces the computation cost. According to our experimental results, ADFF can reduce the bottleneck latency from 89 to 19 ms. VDS increases the detection accuracy by 0.6% mAP without increasing computation cost. SSI further saves 0.2 ms with a 0.6% mAP degradation of detection accuracy.
    Free, publicly-accessible full text available May 31, 2023
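The scheduling idea behind VDS, running full detection more often when frames change more, can be sketched as below. This is an illustration under assumptions, not the authors' method: motion is approximated by the mean absolute pixel difference between two grayscale frames given as flat lists, and the thresholds, interval range, and function names `motion_magnitude` and `keyframe_interval` are hypothetical.

```python
from typing import Sequence


def motion_magnitude(prev_frame: Sequence[float],
                     curr_frame: Sequence[float]) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if not prev_frame or len(prev_frame) != len(curr_frame):
        raise ValueError("frames must be non-empty and the same size")
    total = sum(abs(a - b) for a, b in zip(prev_frame, curr_frame))
    return total / len(prev_frame)


def keyframe_interval(motion: float,
                      low: float = 2.0,
                      high: float = 10.0,
                      min_interval: int = 1,
                      max_interval: int = 10) -> int:
    """Map motion to the number of frames between full detections.

    High motion -> short interval (detect often);
    low motion  -> long interval (reuse earlier results longer).
    Values between the thresholds are interpolated linearly.
    """
    if motion >= high:
        return min_interval
    if motion <= low:
        return max_interval
    fraction = (high - motion) / (high - low)
    return min_interval + round(fraction * (max_interval - min_interval))


if __name__ == "__main__":
    still = motion_magnitude([10, 10, 10, 10], [10, 11, 10, 10])
    moving = motion_magnitude([10, 10, 10, 10], [40, 35, 50, 45])
    print(keyframe_interval(still), keyframe_interval(moving))
```

A scheduler like this spends the same average compute budget but concentrates full detections on segments with large movement, which is consistent with the abstract's claim that accuracy improves without added computation cost.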