NSF PAR Search | NSF Public Access Repository

Aligning large language models (LLMs) to human preferences is a crucial step in building helpful and safe AI tools, and it usually involves training on supervised datasets. Popular algorithms such as Direct Preference Optimization (DPO) rely on pairs of AI-generated responses ranked according to human annotation. The response-pair annotation process can introduce human bias, and building a correct preference dataset is the costly part of the alignment pipeline. To improve annotation efficiency and quality in LLM alignment, we propose REAL: Response Embedding-based Alignment for LLMs, a strategy for constructing a high-quality training dataset that focuses on acquiring the less ambiguous preference pairs for labeling out of a set of response candidates. Our selection process is based on the similarity of response embeddings, independent of prompts, which keeps the selection process off-policy and avoids adaptively measuring similarity during training. Experimental results on the real-world dataset SHP2 and on synthetic HH-RLHF benchmarks indicate that choosing dissimilar response pairs enhances the direct alignment of LLMs while reducing inherited labeling errors. The model aligned with dissimilar response pairs obtained a better margin and win rate on the dialogue task. Our findings suggest that focusing on distinct pairs can reduce label error and improve LLM alignment efficiency, saving up to 65% of annotators' work. The code is available at https://github.com/honggen-zhang/REAL-Alignment.
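The selection idea in this abstract, picking response pairs whose embeddings are dissimilar, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `least_similar_pair`, the use of cosine similarity, and the toy vectors standing in for response embeddings are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def least_similar_pair(embeddings):
    """Return (i, j, cosine) for the candidate pair with the lowest cosine similarity.

    `embeddings` is an (n, d) array-like with one row per candidate response.
    How the rows are produced (which embedding model) is left open here.
    """
    x = np.asarray(embeddings, dtype=float)
    # Normalize rows so that the dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T
    # Scan all unordered pairs and keep the most dissimilar one.
    i, j = min(combinations(range(len(x)), 2), key=lambda p: sims[p])
    return i, j, float(sims[i, j])


# Toy example: three hypothetical response embeddings for one prompt.
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
i, j, cosine = least_similar_pair(candidates)
print(f"send responses {i} and {j} for labeling (cosine={cosine:.3f})")
```

Because the similarity is computed from the responses alone, this kind of selection can run once before training rather than being recomputed as the policy changes.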

Enhancing Contrastive Representation Learning through Data

Zhang, Honggen (May 2025, ProQuest Dissertation)

Contrastive learning learns input representations by pulling similar data together and pushing dissimilar data apart, supported by data augmentation and pretext task construction. It enhances the training of large models because it can use large amounts of unlabeled data. It has been successfully applied to large language models, pre-trained image models, and multimodal models. In addition, contrastive learning learns a representation by modeling the explainable structure of the latent space, which has broad applications in scientific discovery and interpretable Artificial Intelligence (AI). The primary focus of this thesis is to explore contrastive learning from a data construction perspective in real-world problems, to fill the gap between the principle of contrastive learning and its application. Challenges such as sampling bias and data quality largely affect the representations learned by contrastive learning. This thesis analyzes the data construction challenges and limitations in 1) the negative sampling of knowledge graph embedding (KGE), 2) high-quality preference data labeling for Large Language Model (LLM) alignment, 3) data augmentation in nonlinear dynamical system modeling, and 4) data properties in the functions of messenger RNA (mRNA) sequences. To address challenge 1), a hardness- and structure-based objective function is proposed that accounts for sampling bias in hard negative sampling. For challenge 2), the similarity of response embeddings is used to evaluate the quality of preference pairs, mitigating the labeling errors humans make when they face an ambiguous response pair. Challenge 3) is addressed by systematically combining the physical system with contrastive learning: a data augmentation strategy that partitions the full sequence is used for learning the transition matrix in the latent linear space. Challenge 4) is common in the biological domain because of the high cost of lab experiments.
A pre-trained model benefits supervised learning on limited datasets by learning general features from domain knowledge. A contrastive learning-based teacher-student framework is proposed for mRNA sequence learning by contrasting the unmasked sequence with the hard-masked sequence. With careful data construction or data sampling, contrastive learning becomes better able to solve real-world tasks. For KGE, the novel contrastive loss function learns the boundary between negative and positive samples to improve link prediction in the knowledge graph. For LLM alignment, at the same labeling cost, the selected dissimilar responses improve on vanilla direct preference optimization (DPO) alignment. Data augmentation with a contrastive loss plays a crucial role in learning a more accurate dynamical system, which is explained by the learned continuous eigenfunctions. With the teacher-student framework and the hard-masked strategy, the pre-trained model achieves state-of-the-art results when fine-tuned on limited downstream task data. Overall, this thesis provides a broad data-driven contrastive learning methodology to enhance representation learning in different domains. The methodology consists of an improved objective function in the face of data bias, better data selection that reduces labeling error, and proper data augmentation for a particular application domain. This methodology improves learning results compared with traditional methods.
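For readers unfamiliar with the basic mechanism, the following is a minimal sketch of a standard contrastive objective (InfoNCE). The abstract does not give the thesis's own loss functions, so this is a generic stand-in: the function name `info_nce`, the temperature value, and the toy vectors are assumptions for illustration only.

```python
import numpy as np


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive, and a set of negatives.

    The loss is low when the anchor is closer (in cosine similarity) to the
    positive than to every negative, and high otherwise.
    """
    a = np.asarray(anchor, dtype=float)
    p = np.asarray(positive, dtype=float)
    n = np.asarray(negatives, dtype=float)
    a = a / np.linalg.norm(a)
    p = p / np.linalg.norm(p)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)
    # Index 0 holds the positive logit; the rest are negative logits.
    logits = np.concatenate(([a @ p], n @ a)) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    # Negative log-softmax of the positive entry.
    return float(-logits[0] + np.log(np.exp(logits).sum()))


# Toy example: a well-aligned positive versus a misleading one.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(good < bad)
```

How the positives and negatives are constructed (negative sampling, pair selection, augmentation, masking) is the data-side question the thesis studies; the sketch fixes them by hand.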

Free, publicly-accessible full text available May 2, 2026
