Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract The new wave of ‘foundation models’—general-purpose generative AI models, for production of text (e.g., ChatGPT) or images (e.g., MidJourney)—represent a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliabledetection mechanismfor the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required.more » « less
-
Continual learning (CL) aims to enable models to incrementally learn from a sequence of tasks without forgetting previously acquired knowledge. While most prior work focuses on closed-world settings, where all test instances are assumed from the set of learned classes, real-world applications require models to handle both CL and out-of-distribution (OOD) samples. A key insight from recent studies on deep neural networks is the phenomenon of Neural Collapse (NC), which occurs in the terminal phase of training when the loss approaches zero. Under NC, class features collapse to their means, and classifier weights align with these means, enabling effective prototype-based strategies such as nearest class mean, for both classification and OOD detection. However, in CL, catastrophic forgetting (CF) prevents the model from naturally reaching this desirable regime. In this paper, we propose a novel method called Analytic Neural Collapse (AnaNC) that analytically creates the NC properties in the feature space of a frozen pre-trained model with no training, overcoming CF. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in continual OOD detection and learning, highlighting the effectiveness of our method in this challenging scenario.more » « less
-
This paper studies the problem of class-incremental learning (CIL), a core setting within continual learning where a model learns a sequence of tasks, each containing a distinct set of classes. Traditional CIL methods, which do not leverage pretrained models (PTMs), suffer from catastrophic forgetting (CF) due to the need to incrementally learn both feature representations and the classifier. The integration of PTMs into CIL has recently led to efficient approaches that treat the PTM as a fixed feature extractor combined with analytic classifiers, achieving state-ofthe-art performance. However, they still face a major limitation: the inability to continually adapt feature representations to best suit the CIL tasks, leading to suboptimal performance. To address this, we propose AnaCP (Analytic Contrastive Projection), a novel method that preserves the efficiency of analytic classifiers while enabling incremental feature adaptation without gradient-based training, thereby eliminating the CF caused by gradient updates. Our experiments show that AnaCP not only outperforms existing baselines but also achieves the accuracy level of joint training, which is regarded as the upper bound of CIL.more » « less
-
The success of modern multimodal representation learning relies on internet-scale datasets. Due to the low quality of a large fraction of raw web data, data curation has become a critical step in the training pipeline. Filtering using a trained model (i.e., teacher-based filtering) has emerged as a successful solution, leveraging a pre-trained model to compute quality scores. To explain the empirical success of teacher-based filtering, we characterize the performance of filtered contrastive learning under the standard bimodal data generation model. Denoting as the fraction of data with correctly matched modalities among paired samples, we utilize a linear contrastive learning setup to show a provable benefit of data filtering: the error without filtering is upper and lower bounded by , and the error with teacher-based filtering is upper bounded by in the large regime, and by in the small regime.more » « less
-
Cipher transformations have been studied historically in cryptography, but little work has explored how large language models (LLMs) represent and process them. We evaluate the ability of three models: Llama 3.1, Gemma 2, and Qwen 3 on performing translation and dictionary tasks across ten cipher systems from a variety of families, and compare it against a commercially available model, GPT-5. Beyond task performance, we analyze embedding spaces of Llama variants to explore whether ciphers are internalized similarly to languages. Our findings suggest that cipher embeddings cluster together and, in some cases, overlap with lowerresource or less frequently represented languages. Steering-vector experiments further reveal that adjusting cipher-related directions in latent space can shift outputs toward these languages, suggesting shared representational structures. This study provides an initial framework for understanding how LLMs encode ciphers, bridging interpretability, and security. By framing ciphers in a similar way to languages, we highlight new directions for model analysis and for designing defenses against cipher-based jailbreaking attacks. We share our source code and data for reproduction research on GitHub.more » « less
-
Change point detection aims to identify abrupt shifts occurring at multiple points within a data sequence. This task becomes particularly challenging in the online setting, where different types of changes can occur, including shifts in both the marginal and joint distributions of the data. In this paper, we address these challenges by tracking the Riemannian geometry of correlation matrices, allowing Riemannian metrics to compute the geodesic distance as an accurate measure of correlation dynamics. We introduce Rio-CPD, a non-parametric, correlation-aware online change point detection framework that integrates the Riemannian geometry of the manifold of symmetric positive definite matrices with the cumulative sum (CUSUM) statistic for detecting change points. Rio-CPD employs a novel CUSUM design by computing the geodesic distance between current observations and the Fréchet mean of prior observations. With appropriate choices of Riemannian metrics, Rio-CPD offers a simple yet effective and computationally efficient algorithm. Experimental results on both synthetic and real-world datasets demonstrate that Rio-CPD outperforms existing methods on detection accuracy, average detection delay and efficiency.more » « less
-
Large language models (LLMs) can underpin AI assistants that help users with everyday tasks, such as by making recommendations or performing basic computation. Despite AI assistants’ promise, little is known about the implicit values these assistants display while completing subjective everyday tasks. Humans may consider values like environmentalism, charity, and diversity. To what extent do LLMs exhibit these values in completing everyday tasks? How do they compare with humans? We answer these questions by auditing how six popular LLMs complete 30 everyday tasks, comparing LLMs to each other and to 100 human crowdworkers from the US. We find LLMs often do not align with humans, nor with other LLMs, in the implicit values exhibited.more » « less
-
Graph neural networks (GNNs) find applications in various domains such as computational biology, natural language processing, and computer security. Owing to their popularity, there is an increasing need to explain GNN predictions since GNNs are black-box machine learning models. One way to address this issue involves usingcounterfactualreasoning where the objective is to alter the GNN prediction by minimal changes in the input graph. Existing methods for counterfactual explanation of GNNs are limited to instance-specificlocalreasoning. This approach has two major limitations of not being able to offer global recourse policies and overloading human cognitive ability with too much information. In this work, we study theglobalexplainability of GNNs through global counterfactual reasoning. Specifically, we want to find asmallset of representative counterfactual graphs that explainsallinput graphs. Toward this goal, we proposeGCFExplainer, a novel algorithm powered byvertex-reinforced random walkson anedit mapof graphs with agreedy summary. Extensive experiments on real graph datasets show that the global explanation fromGCFExplainerprovides important high-level insights of the model behavior and achieves a46.9%gain in recourse coverage, a9.5%reduction in recourse cost compared to the state-of-the-art local counterfactual explainers. We also demonstrate thatGCFExplainergenerates explanations that are more consistent with input dataset characteristics, and is robust under adversarial attacks. In addition,K-GCFExplainer, which incorporates a graph clustering component intoGCFExplainer, is introduced as a more competitive extension for datasets with a clustering structure, leading to superior performance in three out of four datasets in the experiments and better scalability.more » « less
-
In classic supervised learning, once a model is deployed in an application, it is fixed. No updates will be made to it during the application. This is inappropriate for many dynamic and open environments, where unexpected samples from unseen classes may appear. In such an environment, the model should be able to detect these novel samples from unseen classes and learn them after they are labeled. We call this paradigm Autonomous Learning after Model Deployment (ALMD). The learning here is continuous and involves no human engineers. Labeling in this scenario is performed by human co-workers or other knowledgeable agents, which is similar to what humans do when they encounter an unfamiliar object and ask another person for its name. In ALMD, the detection of novel samples is dynamic and differs from traditional out-of-distribution (OOD) detection in that the set of in-distribution (ID) classes expands as new classes are learned during application, whereas ID classes is fixed in traditional OOD detection. Learning is also different from classic supervised learning because in ALMD, we learn the encountered new classes immediately and incrementally. It is difficult to retrain the model from scratch using all the past data from the ID classes and the novel samples from newly discovered classes, as this would be resource- and time-consuming. Apart from these two challenges, ALMD faces the data scarcity issue because instances of new classes often appear sporadically in real-life applications. To address these issues, we propose a novel method, PLDA, which performs dynamic OOD detection and incremental learning of new classes on the fly. Empirical evaluations will demonstrate the effectiveness of PLDA.more » « less
-
Deep learning models are widely used in decision-making and recommendation systems, where they typically rely on the assumption of a static data distribution between training and deployment. However, real-world deployment environments often violate this assumption. Users who receive negative outcomes may adapt their features to meet model criteria, i.e., recourse action. These adaptive behaviors create shifts in the data distribution and when models are retrained on this shifted data, a feedback loop emerges: user behavior influences the model, and the updated model in turn reshapes future user behavior. Despite its importance, this bidirectional interaction between users and models has received limited attention. In this work, we develop a general framework to model user strategic behaviors and their interactions with decision-making systems under resource constraints and competitive dynamics. Both the theoretical and empirical analyses show that user recourse behavior tends to push logistic and MLP models toward increasingly higher decision standards, resulting in higher recourse costs and less reliable recourse actions over time. To mitigate these challenges, we propose two methods—Fair-top-k and Dynamic Continual Learning (DCL)—which significantly reduce recourse cost and improve model robustness. Our findings draw connections to economic theories, highlighting how algorithmic decision-making can unintentionally reinforce a higher standard and generate endogenous barriers to entry.more » « less
An official website of the United States government

Full Text Available