Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, and data privacy, to emerging problems like truthfulness and social norms. We critically analyze existing research aimed at understanding, examining, and mitigating these ethical risks. Our survey underscores integrating ethical standards and societal values into the development of LLMs, thereby guiding the development of responsible and ethically aligned language models.more » « less
-
Conformal prediction (CP), a distribution-free uncertainty quantification (UQ) framework, reliably provides valid predictive inference for black-box models. CP constructs prediction sets or intervals that contain the true output with a specified probability. However, modern data science’s diverse modalities, along with increasing data and model complexity, challenge traditional CP methods. These developments have spurred novel approaches to address evolving scenarios. This survey reviews the foundational concepts of CP and recent advancements from a data-centric perspective, including applications to structured, unstructured, and dynamic data. We also discuss the challenges and opportunities CP faces in large-scale data and models.more » « less
-
The rise of self-supervised learning(SSL), which operates without the need for labeled data, has garnered significant interest within the graph learning community. This enthusiasm has led to the development of numerous self-supervised learning methods, such as Graph Contrastive Learning (GCL) techniques, all aiming to create a versatile graph encoder that leverages the wealth of unlabeled data for various downstream tasks. However, the current evaluation standards for GCL approaches are flawed due to the need for extensive hyper-parameter tuning during pre-training and the reliance on a single downstream task for assessment. These flaws can skew the evaluation away from the intended goals, potentially leading to misleading conclusions. In our paper, we thoroughly examine these shortcomings and o!er fresh perspectives on how GCL methods are a!ected by hyperparameter choices and the choice of downstream tasks for their evaluation. Additionally, we introduce an enhanced evaluation framework designed to more accurately gauge the e!ectiveness, consistency, and overall capability of GCL methods. Our code implementation is available to ease reproducibility on https://github.com/GraphTL-Bench/BGPM.more » « less
An official website of the United States government

Full Text Available