NSF PAR Search | NSF Public Access Repository

It is Hard to Unlearn Dogged Backdoor Samples in Diffusion Models

Huang, An; Xiong, Zuobin; Ye, Muchao; Son, Junggab (October 2025, Curran Associates, Inc.)

Free, publicly-accessible full text available October 23, 2026
Shadow-Activated Backdoor Attacks on Multimodal Large Language Models

https://doi.org/10.18653/v1/2025.findings-acl.248

Yin, Ziyi; Ye, Muchao; Cao, Yuanpu; Wang, Jiaqi; Chang, Aofei; Liu, Han; Chen, Jinghui; Wang, Ting; Ma, Fenglong (January 2025, Association for Computational Linguistics)

Full Text Available
UniT: A Unified Look at Certified Robust Training against Text Adversarial Perturbation

Ye, Muchao; Yin, Ziyi; Zhang, Tianrong; Du, Tianyu; Chen, Jinghui; Wang, Ting; Ma, Fenglong (September 2024, Annual Conference on Neural Information Processing Systems)

Full Text Available
Recent Advances in Predictive Modeling with Electronic Health Records

https://doi.org/10.24963/ijcai.2024/914

Wang, Jiaqi; Luo, Junyu; Ye, Muchao; Wang, Xiaochen; Zhong, Yuan; Chang, Aofei; Huang, Guanjie; Yin, Ziyi; Xiao, Cao; Sun, Jimeng; et al. (August 2024, International Joint Conferences on Artificial Intelligence Organization)

The development of electronic health records (EHR) systems has enabled the collection of a vast amount of digitized patient data. However, utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. With the advancements in machine learning techniques, deep learning has demonstrated its superiority in various applications, including healthcare. This survey systematically reviews recent advances in deep learning-based predictive models using EHR data. Specifically, we introduce the background of EHR data and provide a mathematical definition of the predictive modeling task. We then categorize and summarize predictive deep models from multiple perspectives. Furthermore, we present benchmarks and toolkits relevant to predictive modeling in healthcare. Finally, we conclude this survey by discussing open challenges and suggesting promising directions for future research.
Full Text Available
VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

https://doi.org/10.1609/aaai.v38i7.28499

Yin, Ziyi; Ye, Muchao; Zhang, Tianrong; Wang, Jiaqi; Liu, Han; Chen, Jinghui; Wang, Ting; Ma, Fenglong (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the “pre-training & fine-tuning” learning paradigm significantly improves VQA performance, the adversarial robustness of this paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create adversarial image-text pairs and then transferring them to attack target VQA models. Correspondingly, we propose a novel VQATTACK model, which iteratively generates both image and text perturbations with two designed modules: the large language model (LLM)-enhanced image attack module and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes the latent representation-based loss to generate feature-level image perturbations. It then incorporates an LLM to further enhance the image perturbations by optimizing the designed masked answer anti-recovery loss. The cross-modal joint attack module is triggered at a specific iteration and updates the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the “pre-training & fine-tuning” paradigm on VQA tasks. The source code is available at https://github.com/ericyinyzy/VQAttack.
Full Text Available
PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text

https://doi.org/10.1145/3580305.3599461

Ye, Muchao; Chen, Jinghui; Miao, Chenglin; Liu, Han; Wang, Ting; Ma, Fenglong (January 2023, Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
UniT: A Unified Look at Certified Robust Training against Text Adversarial Perturbation

Ye, Muchao; Yin, Ziyi; Zhang, Tianrong; Du, Tianyu; Chen, Jinghui; Wang, Ting; Ma, Fenglong (January 2023, Annual Conference on Neural Information Processing Systems (NeurIPS’23))

Full Text Available
MedSkim: Denoised Health Risk Prediction via Skimming Medical Claims Data

https://doi.org/10.1109/ICDM54844.2022.00018

Cui, Suhan; Luo, Junyu; Ye, Muchao; Wang, Jiaqi; Wang, Ting; Ma, Fenglong (November 2022, IEEE International Conference on Data Mining (ICDM))

Full Text Available
TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text

https://doi.org/10.1609/aaai.v36i4.20303

Ye, Muchao; Miao, Chenglin; Wang, Ting; Ma, Fenglong (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

This paper focuses on a new and challenging setting in hard-label adversarial attacks on text data by taking the budget information into account. Although existing approaches can successfully generate adversarial examples in the hard-label setting, they follow an ideal assumption that the victim model does not restrict the number of queries. However, in real-world applications the query budget is usually tight or limited. Moreover, existing hard-label adversarial attack techniques use the genetic algorithm to optimize discrete text data by maintaining a number of adversarial candidates during optimization, which can lead to low-quality adversarial examples in the tight-budget setting. To solve this problem, we propose a new method named TextHoaxer, which formulates the budgeted hard-label adversarial attack task on text data as a gradient-based optimization problem over a perturbation matrix in the continuous word embedding space. Compared with genetic algorithm-based optimization, our solution uses only a single initialized adversarial example as the adversarial candidate for optimization, which significantly reduces the number of queries. The optimization is guided by a new objective function consisting of three terms, i.e., a semantic similarity term, a pair-wise perturbation constraint, and a sparsity constraint. The semantic similarity term and the pair-wise perturbation constraint ensure high semantic similarity of adversarial examples at both the comprehensive text level and the individual word level, while the sparsity constraint explicitly restricts the number of perturbed words, which also helps enhance the quality of the generated text. We conduct extensive experiments on eight text datasets against three representative natural language models, and experimental results show that TextHoaxer can generate high-quality adversarial examples with higher semantic similarity and lower perturbation rate under the tight-budget setting.
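The three-term objective described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the function names, the mean-pooled cosine similarity used for the text-level term, the per-word cosine distance used for the pair-wise term, the sum of per-word perturbation norms used for the sparsity term, and the two weights. It is not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(rows):
    """Mean-pool a list of word vectors into one text-level vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def objective(embeddings, perturbation, pair_weight=0.1, sparsity_weight=0.1):
    """Hypothetical three-term objective (lower is better).

    embeddings:   one vector per word of the original text
    perturbation: same shape; row i is the change applied to word i
    """
    perturbed = [[e + p for e, p in zip(e_row, p_row)]
                 for e_row, p_row in zip(embeddings, perturbation)]
    # Term 1 (assumed form): text-level semantic similarity, to be maximized.
    text_similarity = cosine(mean_vector(embeddings), mean_vector(perturbed))
    # Term 2 (assumed form): word-level penalty, average cosine distance
    # between each original word vector and its perturbed counterpart.
    pair_penalty = sum(1.0 - cosine(e, q)
                       for e, q in zip(embeddings, perturbed)) / len(embeddings)
    # Term 3 (assumed form): sparsity penalty, sum of per-word perturbation
    # norms, which pushes whole rows (words) toward zero change.
    sparsity_penalty = sum(math.sqrt(sum(p * p for p in row))
                           for row in perturbation)
    return (-text_similarity
            + pair_weight * pair_penalty
            + sparsity_weight * sparsity_penalty)

# Toy usage: two words in a 2-dimensional embedding space.
original = [[1.0, 0.0], [0.0, 1.0]]
no_change = [[0.0, 0.0], [0.0, 0.0]]
one_word_changed = [[0.0, 0.0], [0.5, 0.0]]
baseline = objective(original, no_change)
changed = objective(original, one_word_changed)
```

In this toy setup, leaving the text unperturbed gives the lowest value, and perturbing a word raises it through all three terms. A real attack would additionally need the perturbed text to stay adversarial under hard-label queries, which this sketch does not model.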
Full Text Available
LeapAttack: Hard-Label Adversarial Attack on Text via Gradient-Based Optimization

https://doi.org/10.1145/3534678.3539357

Ye, Muchao; Chen, Jinghui; Miao, Chenglin; Wang, Ting; Ma, Fenglong (August 2022, The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available