NSF PAR Search | NSF Public Access Repository

From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models

Pan, Zhuoshi; Yao, Yuguang; Liu, Gaowen; Shen, Bingquan; Zhao, H Vicky; Kompella, Ramana Rao; Liu, Sijia (December 2024, NeurIPS)

Full Text Available
Rethinking machine unlearning for large language models

https://doi.org/10.1038/s42256-025-00985-0

Liu, Sijia; Yao, Yuanshun; Jia, Jinghan; Casper, Stephen; Baracaldo, Nathalie; Hase, Peter; Yao, Yuguang; Liu, Chris Yuhao; Xu, Xiaojun; Li, Hang; et al (February 2025, Nature Machine Intelligence)

Free, publicly-accessible full text available February 1, 2026
Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Pal, Soumyadeep; Yao, Yuguang; Wang, Ren; Shen, Bingquan; Liu, Sijia (May 2024, The Twelfth International Conference on Learning Representations)

Modern machine learning (ML) systems demand substantial training data, often resorting to external sources. Nevertheless, this practice renders them vulnerable to backdoor poisoning attacks. Prior backdoor defense strategies have primarily focused on the identification of backdoored models or poisoned data characteristics, typically operating under the assumption of access to clean data. In this work, we delve into a relatively underexplored challenge: the automatic identification of backdoor data within a poisoned dataset, all under realistic conditions, i.e., without the need for additional clean data and without manually defining a threshold for backdoor detection. We draw inspiration from the scaled prediction consistency (SPC) technique, which exploits the prediction invariance of poisoned data to an input scaling factor. Based on this, we pose the backdoor data identification problem as a hierarchical data splitting optimization problem, leveraging a novel SPC-based loss function as the primary optimization objective. Our innovation unfolds in several key aspects. First, we revisit the vanilla SPC method, unveiling its limitations in addressing the proposed backdoor identification problem. Subsequently, we develop a bi-level optimization-based approach to precisely identify backdoor data by minimizing the advanced SPC loss. Finally, we demonstrate the efficacy of our proposal against a spectrum of backdoor attacks, encompassing basic label-corrupted attacks as well as more sophisticated clean-label attacks, evaluated across various benchmark datasets. Experimental results show that our approach often surpasses the performance of current baselines in identifying backdoor data points, resulting in about 4%-36% improvement in average AUROC.
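To make the SPC idea in this abstract concrete, here is a minimal sketch of one plausible reading of a vanilla SPC score: the fraction of scaled copies of an input whose predicted label matches the unscaled prediction. The scale set, the clipping to [0, 1], and the toy classifier `toy_predict` are assumptions made for illustration; this is not the paper's bi-level method or its SPC-based loss.

```python
from typing import Callable, List, Sequence

Image = Sequence[float]  # flattened pixel values, assumed to lie in [0, 1]


def spc_score(predict: Callable[[Image], int], x: Image,
              scales: Sequence[int] = (2, 3, 4, 5)) -> float:
    """Fraction of scaled copies of x that keep the original predicted label."""
    base_label = predict(x)
    agree = 0
    for n in scales:
        # Multiply every pixel by n and clip back into the valid range.
        scaled: List[float] = [min(1.0, n * p) for p in x]
        if predict(scaled) == base_label:
            agree += 1
    return agree / len(scales)


def toy_predict(x: Image) -> int:
    """Hypothetical backdoored classifier, used only for this illustration."""
    if x[0] >= 0.9:  # a saturated first pixel plays the role of the trigger
        return 1     # backdoor target label
    # Otherwise classify by mean brightness: dark -> 0, bright -> 2.
    return 0 if sum(x) / len(x) < 0.5 else 2


clean = [0.2, 0.4, 0.4, 0.4]     # no trigger; the label changes once brightened
poisoned = [1.0, 0.3, 0.3, 0.3]  # trigger pixel survives scaling and clipping

clean_score = spc_score(toy_predict, clean)
poisoned_score = spc_score(toy_predict, poisoned)
print(clean_score, poisoned_score)
```

In this toy setting the poisoned input keeps its label under every scale while the clean input does not, which is the invariance the abstract refers to. Turning such scores into a decision still needs a threshold, which is the manual step the abstract says the proposed approach avoids.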
Full Text Available
An Introduction to Bilevel Optimization: Foundations and applications in signal processing and machine learning

https://doi.org/10.1109/MSP.2024.3358284

Zhang, Yihua; Khanduri, Prashant; Tsaknakis, Ioannis; Yao, Yuguang; Hong, Mingyi; Liu, Sijia (April 2024, IEEE Signal Processing Magazine)

Full Text Available
Model Sparsity Can Simplify Machine Unlearning

Jia, Jinghan; Liu, Jiancheng; Ram, Parikshit; Yao, Yuguang; Liu, Gaowen; Liu, Yang; Sharma, Pranay; Liu, Sijia (December 2023, The Thirty-seventh Annual Conference on Neural Information Processing Systems)

In response to recent data regulation requirements, machine unlearning (MU) has emerged as a critical process to remove the influence of specific examples from a given model. Although exact unlearning can be achieved through complete model retraining using the remaining dataset, the associated computational costs have driven the development of efficient, approximate unlearning techniques. Moving beyond data-centric MU approaches, our study introduces a novel model-based perspective: model sparsification via weight pruning, which is capable of reducing the gap between exact unlearning and approximate unlearning. We show in both theory and practice that model sparsity can boost the multi-criteria unlearning performance of an approximate unlearner, closing the approximation gap, while continuing to be efficient. This leads to a new MU paradigm, termed "prune first, then unlearn", which infuses a sparse model prior into the unlearning process. Building on this insight, we also develop a sparsity-aware unlearning method that utilizes sparsity regularization to enhance the training process of approximate unlearning. Extensive experiments show that our proposals consistently benefit MU in various unlearning scenarios. A notable highlight is the 77% unlearning efficacy gain of fine-tuning (one of the simplest unlearning methods) when using sparsity-aware unlearning. Furthermore, we demonstrate the practical impact of our proposed MU methods in addressing other machine learning challenges, such as defending against backdoor attacks and enhancing transfer learning. Codes are available at this https URL.
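The "prune first, then unlearn" recipe in this abstract can be sketched on a toy linear model. One-shot magnitude pruning stands in for the pruning step and plain fine-tuning on the retained data stands in for the approximate unlearner. Both choices, the function names, and all numbers below are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (features, target)


def magnitude_mask(weights: Sequence[float], sparsity: float) -> List[float]:
    """Return a 0/1 mask zeroing the `sparsity` fraction of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else 1.0 for i in range(len(weights))]


def finetune_on_retain(weights: Sequence[float], mask: Sequence[float],
                       retain: Sequence[Example],
                       lr: float = 0.1, steps: int = 200) -> List[float]:
    """Fine-tune y = w.x with squared loss on the retain set; pruned weights stay zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in retain:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(retain)
        # Gradient step, then re-apply the mask so the sparsity pattern is preserved.
        w = [(wi - lr * gj) * mj for wi, gj, mj in zip(w, grad, mask)]
    return w


# Hypothetical pretrained weights and a retain set (the forget set is simply left out).
pretrained = [0.9, -0.05, 0.4, 0.01]
retain_set = [([1.0, 0.0, 0.0, 0.0], 1.0), ([0.0, 0.0, 1.0, 0.0], 0.5)]

mask = magnitude_mask(pretrained, sparsity=0.5)                 # step 1: prune
unlearned = finetune_on_retain(pretrained, mask, retain_set)    # step 2: unlearn
print(mask, unlearned)
```

The mask is fixed before unlearning starts, so the fine-tuning step only updates the surviving weights. The sparsity-aware variant mentioned in the abstract instead adds a sparsity regularizer during unlearning, which this sketch does not attempt.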
Full Text Available
Advancing Model Pruning via Bi-level Optimization

Zhang, Yihua; Yao, Yuguang; Ram, Parikshit; Zhao, Pu; Chen, Tianlong; Hong, Mingyi; Wang, Yanzhi; Liu, Sijia (December 2022, 36th Conference on Neural Information Processing Systems (NeurIPS 2022))
Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

https://doi.org/10.24963/ijcai.2022/239

Zhao, Pu; Ram, Parikshit; Lu, Songtao; Yao, Yuguang; Bouneffouf, Djallel; Lin, Xue; Liu, Sijia (July 2022, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22))

Adversarial perturbations are critical for certifying the robustness of deep learning models. A "universal adversarial perturbation" (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating an image-wise attack algorithm. However, the existing UAP generator is underdeveloped when images are drawn from different image sources (e.g., with different image resolutions). Towards an authentic universality across image sources, we take a novel view of UAP generation as a customized instance of "few-shot learning", which leverages bilevel optimization and learning-to-optimize (L2O) techniques for UAP generation with improved attack success rate (ASR). We begin by considering the popular model-agnostic meta-learning (MAML) framework to meta-learn a UAP generator. However, we see that the MAML framework does not directly offer the universal attack across image sources, requiring us to integrate it with another meta-learning framework of L2O. The resulting scheme for meta-learning a UAP generator (i) has better performance (50% higher ASR) than baselines such as Projected Gradient Descent, (ii) has better performance (37% faster) than the vanilla L2O and MAML frameworks (when applicable), and (iii) is able to simultaneously handle UAP generation for different victim models and data sources.
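For readers unfamiliar with UAPs, the following sketch shows the Projected Gradient Descent baseline idea on a toy logistic-regression victim: a single perturbation `delta`, shared by all images, is updated with sign-gradient ascent on the average loss and projected back into an L-infinity ball. The victim model, step sizes, and data are hypothetical, and this is not the meta-learned generator the abstract proposes. Clipping the perturbed pixels to a valid range is omitted for brevity.

```python
import math
from typing import List, Sequence


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear logit of the toy victim model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def universal_perturbation(w: Sequence[float], b: float,
                           images: Sequence[Sequence[float]],
                           labels: Sequence[int],
                           eps: float = 0.3, alpha: float = 0.05,
                           steps: int = 20) -> List[float]:
    """Sign-gradient ascent on the mean logistic loss over one shared delta."""
    delta = [0.0] * len(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(images, labels):
            shifted = [xi + di for xi, di in zip(x, delta)]
            # d(logistic loss)/d(logit) = sigmoid(logit) - y; the chain rule adds w.
            coeff = 1.0 / (1.0 + math.exp(-score(w, b, shifted))) - y
            for j, wj in enumerate(w):
                grad[j] += coeff * wj / len(images)
        # Ascent step followed by projection onto the L-infinity ball of radius eps.
        delta = [max(-eps, min(eps, dj + alpha * ((gj > 0) - (gj < 0))))
                 for dj, gj in zip(delta, grad)]
    return delta


def attack_success_rate(w, b, images, labels, delta) -> float:
    """Fraction of images whose predicted label no longer equals the true label."""
    flipped = 0
    for x, y in zip(images, labels):
        shifted = [xi + di for xi, di in zip(x, delta)]
        flipped += int((score(w, b, shifted) > 0) != bool(y))
    return flipped / len(images)


# Hypothetical victim and three correctly classified class-1 images.
w, b = [1.0, -1.0], 0.0
images = [[0.6, 0.4], [0.7, 0.5], [0.9, 0.2]]
labels = [1, 1, 1]

delta = universal_perturbation(w, b, images, labels)
asr = attack_success_rate(w, b, images, labels, delta)
print(delta, asr)
```

Because one `delta` must serve every image, the image with the largest margin survives the attack in this toy run. That trade-off between universality and per-image strength is what motivates learning a better generator.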
Full Text Available
DeepLoRa: Learning Accurate Path Loss Model for Long Distance Links in LPWAN

https://doi.org/10.1109/INFOCOM42981.2021.9488784

Liu, Li; Yao, Yuguang; Cao, Zhichao; Zhang, Mi (May 2021, INFOCOM)
Full Text Available
Wi-Fi See It All: Generative Adversarial Network-augmented Versatile Wi-Fi Imaging

https://doi.org/10.1145/3384419.3430725

Li, Chenning; Liu, Zheng; Yao, Yuguang; Cao, Zhichao; Zhang, Mi; Liu, Yunhao (January 2020, ACM Conference on Embedded Networked Sensor Systems (SenSys'20))

Full Text Available
