NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Advancing Adversarial Robustness in GNeRFs: The IL2-NeRF Attack

Meng, Nicole; Manicke, Caleb; Sahu, Ronak; Ding, Caiwen; Lao, Yingjie (June 2025, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR))

Free, publicly-accessible full text available June 14, 2026
HEDWIG: Homomorphic Encryption Accelerator Design Using BFV-HPS With HiGh-Speed Fixed-Point Approximation

https://doi.org/10.1145/3706628.3708853

Wang, Antian; Tan, Weihang; Xu, Zhenyu; Wei, Tao; Ding, Caiwen; Parhi, Keshab K; Lao, Yingjie (February 2025, ACM)

Free, publicly-accessible full text available February 27, 2026
Hardware Acceleration for Fully Homomorphic Encryption Scheme Switching from CKKS to FHEW

https://doi.org/10.1109/IEEECONF60004.2024.10942749

Zhang, Kaiyuan; Wang, Antian; Parhi, Keshab K; Lao, Yingjie (October 2024, IEEE)

Fully Homomorphic Encryption (FHE) presents a paradigm-shifting framework for performing computations on encrypted data, offering revolutionary implications for privacy-preserving technologies. This paper introduces a novel hardware implementation of scheme switching between two leading FHE schemes targeting different computational needs, i.e., arithmetic HE scheme CKKS, and Boolean HE scheme FHEW. The proposed architecture facilitates dynamic switching between the schemes with improved throughput and latency compared to the software baseline. The proposed architecture computation modules support scheme switching operations involving coefficient conversion, modular switching, and key switching. We also optimize the hardware designs for the pre-processing and post-processing blocks, involving key generation, encryption, and decryption. The effectiveness of our proposed design is verified on the Xilinx U280 Datacenter Acceleration FPGA. We demonstrate that the proposed scheme switching accelerator yields a 365× performance improvement over the software counterpart.
more » « less
Full Text Available
HERMES: Homomorphic Encryption over Residual Number System for Multi-level EvaluationS

https://doi.org/10.1145/3676536.3697124

Wang, Antian; Zhang, Kaiyuan; Parhi, Keshab K; Lao, Yingjie (October 2024, ACM)

Homomorphic encryption enables computations on the ciphertext to preserve data privacy. However, its practical deployment has been hindered by the significant computational overhead compared to the plaintext computations. In response to this challenge, we present HERMES, a novel hardware acceleration system designed to explore the computation flow of the CKKS homomorphic encryption bootstrapping process. Among the major contributions of our proposed architecture, we first analyze the properties of the CKKS computation data flow and propose a new scheduling strategy by partitioning the computation modules into general-purpose and special-purpose modular computation modules to allow smaller resource consumption and flexible scheduling. The computation modules are also reconfigurable to reduce the memory access overhead during the intermediate computation. We also optimize the CKKS computation dataflow to improve the regularity with reduced control overhead.
more » « less
Full Text Available
Area-Efficient Matrix-Vector Polynomial Multiplication Architecture for ML-KEM Using Interleaving and Folding Transformation

https://doi.org/10.1109/ISCAS58744.2024.10558584

Tan, Weihang; Lao, Yingjie; Parhi, Keshab K (May 2024, IEEE)

The ML-KEM post-quantum cryptography (PQC) scheme requires matrix-vector polynomial multiplication and polynomial arithmetic operations in the number theoretic transform (NTT) domain. Prior optimization approach KyberMat leverages the transposed-form fast filtering structure and sub-structure sharing technique, reducing the computational complexity. In this paper, a novel and area-efficient design builds upon the KyberMat framework, using the hierarchical interleaved folding algorithm to reduce hardware resources. Two design strategies are utilized in the proposed design. The proposed design initially scales down the NTT/inverse NTT processors via folding transformation, while utilizing a fixed number of DSPs and LUTs across different security levels of ML-KEM. This work further introduces a recursive summing unit along with the interleaving method to ensure continuous data processing and ultimately improve hardware utilization and throughput. The experimental result shows that our proposed area-efficient design achieves an average reduction of 71.55% in DSPs and 63.89% in LUTs among three different security levels, compared to the KyberMat framework.
more » « less
Full Text Available
Reliable Hardware Watermarks for Deep Learning Systems

https://doi.org/10.1109/TVLSI.2024.3360240

Clements, Joseph Franklin; Lao, Yingjie (April 2024, IEEE Transactions on Very Large Scale Integration (VLSI) Systems)

Full Text Available
Efficient Data Extraction Circuit for Posit Number System: LDD-Based Posit Decoder

https://doi.org/10.1109/TCAD.2023.3347295

Sun, Jianchi; Lao, Yingjie (December 2023, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems)

Full Text Available
Data-Driven Feature Selection Framework for Approximate Circuit Design

https://doi.org/10.1109/TCAD.2023.3260160

Zhao, Bingyin; Qiu, Ling; Lao, Yingjie (November 2023, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems)

Full Text Available
KyberMat: Efficient Accelerator for Matrix-Vector Polynomial Multiplication in CRYSTALS-Kyber Scheme via NTT and Polyphase Decomposition

https://doi.org/10.1109/ICCAD57390.2023.10323839

Tan, Weihang; Lao, Yingjie; Parhi, Keshab K. (October 2023, 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD))

CRYSTAL-Kyber (Kyber) is one of the post-quantum cryptography (PQC) key-encapsulation mechanism (KEM) schemes selected during the standardization process. This paper addresses optimization for Kyber architecture with respect to latency and throughput constraints. Specifically, matrix-vector multiplication and number theoretic transform (NTT)-based polynomial multiplication are critical operations and bottle-necks that require optimization. To address this challenge, we propose an algorithm and hardware co-design approach to systematically optimize matrix-vector multiplication and NTT-based polynomial multiplication by employing a novel sub-structure sharing technique in order to reduce computational complexity, i.e., the number of modular multiplications and modular additions/subtractions consumed. The sub-structure sharing approach is inspired by prior fast parallel approaches based on polyphase decomposition. The proposed efficient feed-forward architecture achieves high speed, low latency, and full utilization of all hardware components, which can significantly enhance the overall efficiency of the Kyber scheme. The FPGA implementation results show that our proposed design, using the fast two-parallel structure, leads to an approximate reduction of 90% in execution time (μs) , along with a 66× improvement in throughput performance.
more » « less
Full Text Available
Deep Neural Network Quantization Framework for Effective Defense against Membership Inference Attacks

https://doi.org/10.3390/s23187722

Famili, Azadeh; Lao, Yingjie (September 2023, Sensors)

Machine learning deployment on edge devices has faced challenges such as computational costs and privacy issues. Membership inference attack (MIA) refers to the attack where the adversary aims to infer whether a data sample belongs to the training set. In other words, user data privacy might be compromised by MIA from a well-trained model. Therefore, it is vital to have defense mechanisms in place to protect training data, especially in privacy-sensitive applications such as healthcare. This paper exploits the implications of quantization on privacy leakage and proposes a novel quantization method that enhances the resistance of a neural network against MIA. Recent studies have shown that model quantization leads to resistance against membership inference attacks. Existing quantization approaches primarily prioritize performance and energy efficiency; we propose a quantization framework with the main objective of boosting the resistance against membership inference attacks. Unlike conventional quantization methods whose primary objectives are compression or increased speed, our proposed quantization aims to provide defense against MIA. We evaluate the effectiveness of our methods on various popular benchmark datasets and model architectures. All popular evaluation metrics, including precision, recall, and F1-score, show improvement when compared to the full bitwidth model. For example, for ResNet on Cifar10, our experimental results show that our algorithm can reduce the attack accuracy of MIA by 14%, the true positive rate by 37%, and F1-score of members by 39% compared to the full bitwidth network. Here, reduction in true positive rate means the attacker will not be able to identify the training dataset members, which is the main goal of the MIA.
more » « less
Full Text Available

« Prev Next »

Search for: All records