

Search for: All records

Creators/Authors contains: "Wang, T"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning large language models (LLMs). Yet its reliance on a single reward model often overlooks the diversity of human preferences. Recent approaches address this limitation by leveraging multi-dimensional feedback to fine-tune corresponding reward models and train LLMs using reinforcement learning. However, the process is costly and unstable, especially given the competing and heterogeneous nature of human preferences. In this paper, we propose Mixing Preference Optimization (MPO), a post-processing framework for aggregating single-objective policies as an alternative to both multi-objective RLHF (MORLHF) and MaxMin-RLHF. MPO avoids alignment from scratch. Instead, it log-linearly combines existing policies into a unified one, with the weight of each policy computed via batch stochastic mirror descent. Empirical results demonstrate that MPO achieves balanced performance across diverse preferences, outperforming or matching existing models with significantly reduced computational costs. (An illustrative policy-mixing sketch appears after this results list.)
    Free, publicly-accessible full text available July 17, 2026
  2. Large language models (LLMs) have achieved impressive performance but face high computational costs and latency, limiting their deployment in resource-constrained settings. In contrast, small-scale LLMs (SLMs) are more efficient yet struggle to capture evolving real-world knowledge. Retrieval-augmented generation (RAG) helps by integrating external knowledge, but imperfect retrieval can introduce distracting noise that misleads SLMs. We propose a robust RAG framework for SLMs based on Margin-aware Preference Optimization. The framework employs multi-turn prompting for detailed reasoning, rejection sampling for high-quality explanations, and contrastive preference selection to refine responses by maximizing the likelihood gap between preferred and non-preferred outputs. (An illustrative margin-based preference-loss sketch appears after this results list.)
    Free, publicly-accessible full text available July 17, 2026
  3. Free, publicly-accessible full text available April 24, 2026
  4. Free, publicly-accessible full text available April 14, 2026
  5. Artificial Intelligence (AI) has demonstrated significant potential in healthcare, particularly in disease diagnosis and treatment planning. Recent progress in Medical Large Vision-Language Models (Med-LVLMs) has opened up new possibilities for interactive diagnostic tools. However, these models often suffer from factual hallucination, which can lead to incorrect diagnoses. Fine-tuning and retrieval-augmented generation (RAG) have emerged as methods to address these issues. However, the limited amount of high-quality data and distribution shifts between training and deployment data restrict the applicability of fine-tuning methods. Although RAG is lightweight and effective, existing RAG-based approaches are not sufficiently general across different medical domains and can potentially cause misalignment issues, both between modalities and between the model and the ground truth. In this paper, we propose a versatile multimodal RAG system, MMed-RAG, designed to enhance the factuality of Med-LVLMs. Our approach introduces a domain-aware retrieval mechanism, adaptive retrieved-context selection, and a provable RAG-based preference fine-tuning strategy. These innovations make the RAG process sufficiently general and reliable, significantly improving alignment when introducing retrieved contexts. Experimental results across five medical datasets (spanning radiology, ophthalmology, and pathology) on medical VQA and report generation demonstrate that MMed-RAG achieves an average improvement of 43.8% in the factual accuracy of Med-LVLMs. (An illustrative retrieval-routing sketch appears after this results list.)
    Free, publicly-accessible full text available April 24, 2026
  6. Black soldier fly larvae (BSFL) have demonstrated cold tolerance that suggests the presence of cryoprotective molecules. The objective of this research was to investigate whether the proteins present in BSFL have ice recrystallization inhibition (IRI) activity and how different environmental factors affect that activity. Osborne fractionation of the defatted BSFL was performed to separate the proteins based on solubility, then preparative size exclusion chromatography was used to fractionate the albumin fraction by molecular size to isolate IRI or ice-binding proteins. The major proteins in the active fractions were identified by mass spectrometry, and molecular dynamics simulations were performed with two of the identified proteins to investigate their behavior in an ice-water system. The main finding is the strong IRI activity of the water-soluble BSFL albumin fraction and of column-fractionated fraction 1. This fraction reduced ice crystal size by 40.4-79.9% at a 1% concentration, across a wide pH range (3-9) and salt concentration range (10-200 mM NaCl). Pure proteins recovered were sequenced and identified as cuticle proteins by mass spectrometry. One cuticle protein demonstrated strong H-bonding and structural flexibility in the molecular dynamics simulations, explaining its IRI and ice-binding activity. This is the first report of BSFL proteins possessing IRI activity, and the protein extract can be obtained more feasibly than other naturally occurring antifreeze proteins.
    Free, publicly-accessible full text available February 19, 2026
  7. Free, publicly-accessible full text available January 1, 2026
  8. Sparsity is a central aspect of interpretability in machine learning. Typically, sparsity is measured in terms of the size of a model globally, such as the number of variables it uses. However, this notion of sparsity is not particularly relevant for decision making; someone subjected to a decision does not care about variables that do not contribute to the decision. In this work, we dramatically expand a notion of decision sparsity called the Sparse Explanation Value (SEV) so that its explanations are more meaningful. SEV considers movement along a hypercube towards a reference point. By allowing flexibility in that reference and by considering how distances along the hypercube translate to distances in feature space, we can derive sparser and more meaningful explanations for various types of function classes. We present cluster-based SEV and its tree-based variant, introduce a method that improves the credibility of explanations, and propose algorithms that optimize decision sparsity in machine learning models. (An illustrative decision-sparsity sketch appears after this results list.)
    Free, publicly-accessible full text available December 1, 2025
  9. Free, publicly-accessible full text available January 1, 2026
  10. Free, publicly-accessible full text available January 1, 2026
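Illustrative sketch for record 1: a minimal, hypothetical rendering of log-linear policy mixing with simplex weights updated by exponentiated-gradient (mirror descent) steps. All names, the toy policies, and the gradient used here are assumptions for illustration; the actual MPO algorithm in the paper may differ.

```python
# Hypothetical sketch: log-linear mixing of single-objective policies, with the
# mixture weights kept on the probability simplex by mirror-descent (exponentiated
# gradient) updates. Toy objective and gradient are illustrative only.
import numpy as np

def mix_logits(per_policy_logprobs, weights):
    """Log-linear combination: log pi_mix is proportional to sum_k w_k * log pi_k."""
    combined = np.tensordot(weights, per_policy_logprobs, axes=1)  # shape (V,)
    return combined - np.logaddexp.reduce(combined)                # renormalize over the vocabulary

def mirror_descent_step(weights, grad, lr=0.1):
    """Exponentiated-gradient update that keeps the weights on the simplex."""
    w = weights * np.exp(-lr * grad)
    return w / w.sum()

# Toy usage: two single-objective policies over a 4-token vocabulary.
rng = np.random.default_rng(0)
logps = np.log(rng.dirichlet(np.ones(4), size=2))   # (K=2 policies, V=4 tokens)
w = np.full(2, 0.5)
for _ in range(50):
    mixed = mix_logits(logps, w)
    # Illustrative gradient: each policy's expected log-probability under the current mixture.
    grad = np.array([float(np.exp(mixed) @ lp) for lp in logps])
    w = mirror_descent_step(w, grad)
print("mixture weights:", w)
```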
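Illustrative sketch for record 2: a minimal, hypothetical margin-based preference loss that penalizes pairs whose likelihood gap between preferred and non-preferred responses falls below a target margin, in the spirit of "maximizing the likelihood gap" described in the abstract. The function names, the scaling factor beta, and the hinge form are assumptions, not the paper's API.

```python
# Hypothetical sketch of a margin-aware preference loss over response pairs.
import torch
import torch.nn.functional as F

def margin_preference_loss(logp_preferred, logp_rejected, margin=1.0, beta=0.1):
    """Penalize pairs whose scaled log-likelihood gap falls below the target margin."""
    gap = beta * (logp_preferred - logp_rejected)   # per-pair likelihood gap
    return F.relu(margin - gap).mean()              # hinge on the margin

# Toy usage with sequence log-likelihoods computed elsewhere (e.g. summed token log-probs).
logp_pref = torch.tensor([-12.3, -8.7, -15.0])
logp_rej  = torch.tensor([-14.1, -9.0, -13.5])
print(margin_preference_loss(logp_pref, logp_rej))
```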
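Illustrative sketch for record 5: a hypothetical rendering of the two retrieval-side ideas named in the abstract, routing a query to a domain-specific index (domain-aware retrieval) and keeping only contexts whose similarity clears an adaptive cutoff (adaptive retrieved-context selection). The data structures, the ratio threshold, and the toy indexes are assumptions, not MMed-RAG's implementation.

```python
# Hypothetical sketch: domain routing plus adaptive pruning of retrieved contexts.
from dataclasses import dataclass

@dataclass
class Retrieved:
    text: str
    score: float  # query-context similarity

def adaptive_select(candidates, ratio=0.8):
    """Keep only contexts whose score is within a fraction of the best score (adaptive cutoff)."""
    if not candidates:
        return []
    cutoff = ratio * max(c.score for c in candidates)
    return [c for c in candidates if c.score >= cutoff]

# Toy usage: route by predicted domain ("domain-aware"), then prune weak contexts adaptively.
indexes = {
    "radiology": [Retrieved("chest X-ray report ...", 0.91), Retrieved("CT protocol note ...", 0.55)],
    "pathology": [Retrieved("H&E slide finding ...", 0.83)],
}
domain = "radiology"  # assumed output of an upstream domain classifier
candidates = sorted(indexes[domain], key=lambda r: r.score, reverse=True)
print([c.text for c in adaptive_select(candidates)])
```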
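Illustrative sketch for record 8: a hypothetical brute-force count of decision sparsity in the spirit of SEV, i.e. the fewest features that must be moved to a reference point's values before the model's decision flips. The exhaustive search and the toy linear model are assumptions for illustration; the paper's algorithms and its cluster- and tree-based variants are more refined.

```python
# Hypothetical sketch: smallest number of features replaced by reference values
# that flips the model's decision (exhaustive search, illustration only).
from itertools import combinations
import numpy as np

def sparse_explanation_value(predict, x, reference):
    """Return the fewest features set to reference values that change predict(x)."""
    d = len(x)
    base = predict(x)
    for k in range(1, d + 1):
        for idx in combinations(range(d), k):
            x_mod = x.copy()
            x_mod[list(idx)] = reference[list(idx)]
            if predict(x_mod) != base:
                return k
    return d  # the decision never flips within the hypercube

# Toy usage with a linear threshold model.
predict = lambda v: int(v @ np.array([2.0, -1.0, 0.5]) > 0.0)
x, ref = np.array([1.0, 0.2, 0.4]), np.zeros(3)
print(sparse_explanation_value(predict, x, ref))
```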