-
The rapid advancement of large language model (LLM) agents has raised new concerns regarding their safety and security, which cannot be addressed by traditional textual-harm-focused LLM guardrails. We propose GuardAgent, the first guardrail agent to protect other agents by checking whether their actions satisfy safety guard requests. Specifically, GuardAgent first analyzes the safety guard requests to generate a task plan, and then converts this plan into guardrail code for execution. In both steps, an LLM serves as the reasoning component, supplemented by in-context demonstrations retrieved from a memory module that stores information from previous tasks. GuardAgent can understand diverse safety guard requests and provide reliable code-based guardrails with high flexibility and low operational overhead. In addition, we propose two novel benchmarks: EICU-AC, which assesses access control for healthcare agents, and Mind2Web-SC, which evaluates safety regulations for web agents. We show that GuardAgent effectively moderates violating actions of these two types of agents, achieving guardrail accuracies of over 98% and 83% on the two benchmarks, respectively.
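The plan-then-code loop described in this abstract can be pictured with a short sketch. The Python below is an illustrative reconstruction only, assuming a hypothetical `call_llm` client and a simple in-memory demonstration store; it is not the authors' implementation of GuardAgent.

```python
# Illustrative sketch of a plan-then-code guardrail loop (not the authors' code).
# `call_llm`, `retrieve_demonstrations`, and `Memory` are hypothetical stand-ins.
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Stores (request, plan, code) records from previously handled guard tasks."""
    records: list = field(default_factory=list)

    def retrieve_demonstrations(self, request: str, k: int = 2) -> list:
        # A real system would retrieve by similarity; here we just return the newest k.
        return self.records[-k:]


def call_llm(prompt: str) -> str:
    raise NotImplementedError("Plug in an LLM client of your choice.")


def guard(request: str, agent_action: dict, memory: Memory) -> bool:
    demos = memory.retrieve_demonstrations(request)
    # Step 1: turn the natural-language safety guard request into a task plan.
    plan = call_llm(f"Demonstrations: {demos}\nRequest: {request}\nWrite a step-by-step guard plan.")
    # Step 2: convert the plan into executable guardrail code.
    code = call_llm(f"Plan: {plan}\nWrite a Python function `check(action) -> bool` implementing it.")
    namespace: dict = {}
    exec(code, namespace)              # the generated guardrail is executed, not just read
    allowed = namespace["check"](agent_action)
    memory.records.append((request, plan, code))
    return allowed                     # False means the target agent's action is blocked
```

The point of the sketch is the separation of concerns: the LLM reasons over the request and the retrieved demonstrations, while the final allow/deny decision is made by deterministic generated code rather than by free-form text.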
-
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. However, this often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. Driven by these concerns, machine unlearning has become crucial to effectively purge undesirable knowledge from models. While existing literature has studied various unlearning techniques, these often suffer from either poor unlearning quality or degradation in text-image alignment after unlearning, due to the competitive nature of these objectives. To address these challenges, we propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives. We further derive the characterization of such an update. In addition, we design procedures to strategically diversify the unlearning and remaining datasets to boost performance improvement. Our evaluation demonstrates that our method effectively removes target classes from recent diffusion-based generative models and concepts from stable diffusion models while maintaining close alignment with the models' original trained states, thus outperforming state-of-the-art baselines.
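To make the "monotonic improvement on both objectives" idea concrete, the sketch below uses a minimum-norm (MGDA-style) combination of the unlearning and retention gradients, which to first order does not increase either loss. This is a standard two-objective construction used purely for illustration; the paper derives its own characterization of the optimal update.

```python
# Sketch of a two-objective update step that avoids degrading either objective.
# Minimum-norm (MGDA-style) combination of two gradients; not the paper's exact update.
import torch


def combined_update_direction(g_unlearn: torch.Tensor, g_retain: torch.Tensor) -> torch.Tensor:
    """Return d = alpha * g_unlearn + (1 - alpha) * g_retain with minimum norm.

    For two gradients alpha has a closed form; the resulting d has a non-negative
    inner product with both gradients, so a small step along -d does not increase
    either loss to first order.
    """
    diff = g_unlearn - g_retain
    denom = diff.dot(diff).clamp_min(1e-12)
    alpha = ((g_retain - g_unlearn).dot(g_retain) / denom).clamp(0.0, 1.0)
    return alpha * g_unlearn + (1.0 - alpha) * g_retain


# Usage inside an unlearning loop (model, losses, and data are assumed):
#   g_u = flattened gradient of the unlearning loss
#   g_r = flattened gradient of the retention / alignment loss
#   d = combined_update_direction(g_u, g_r)
#   params_vector -= lr * d
```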
-
Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis. Their profound capabilities in processing and interpreting complex language data, however, bring to light pressing concerns regarding data privacy, especially the risk of unintentional training data leakage. Despite the critical nature of this issue, no existing literature offers a comprehensive assessment of data privacy risks in LLMs. Addressing this gap, our paper introduces LLM-PBE, a toolkit crafted specifically for the systematic evaluation of data privacy risks in LLMs. LLM-PBE is designed to analyze privacy across the entire lifecycle of LLMs, incorporating diverse attack and defense strategies and handling various data types and metrics. Through detailed experimentation with multiple LLMs, LLM-PBE facilitates an in-depth exploration of data privacy concerns, shedding light on influential factors such as model size, data characteristics, and evolving temporal dimensions. This study not only enriches the understanding of privacy issues in LLMs but also serves as a vital resource for future research in the field. To broaden knowledge in this area, the findings, resources, and our full technical report are made available at https://llm-pbe.github.io/, providing an open platform for academic and practical advancements in LLM privacy assessment.
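As one concrete example of the attack families such a toolkit covers, the sketch below runs a simple perplexity-gap (loss-threshold) membership-inference probe on a causal language model. The model name and helper are placeholders and this is not the LLM-PBE API; see the linked site for the actual toolkit.

```python
# Illustrative loss-threshold membership-inference probe. The model and strings
# below are placeholders; this does not reproduce the LLM-PBE interface.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def sequence_perplexity(model, tokenizer, text: str) -> float:
    """Lower perplexity on a candidate string suggests it may have been in training data."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return math.exp(loss.item())


if __name__ == "__main__":
    name = "gpt2"  # stand-in model; any causal LM works
    model = AutoModelForCausalLM.from_pretrained(name).eval()
    tokenizer = AutoTokenizer.from_pretrained(name)
    candidate = "Example sentence suspected to appear in the training corpus."
    reference = "A freshly written control sentence of similar length and style."
    # A large perplexity gap between candidate and control is weak evidence of memorization.
    print(sequence_perplexity(model, tokenizer, candidate),
          sequence_perplexity(model, tokenizer, reference))
```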
-
Federated learning (FL) enables distributed resource-constrained devices to jointly train shared models while keeping the training data local for privacy purposes. Vertical FL (VFL), in which each client collects a partial set of features, has attracted intensive research efforts recently. We identify a main challenge facing existing VFL frameworks: the server must communicate gradients with the clients at every training step, incurring high communication cost and rapidly consuming the privacy budget. To address this challenge, we introduce a VFL framework with multiple heads (VIM), which takes the separate contribution of each client into account and enables an efficient decomposition of the VFL optimization objective into sub-objectives that can be iteratively tackled by the server and the clients on their own. In particular, we propose an Alternating Direction Method of Multipliers (ADMM)-based method to solve our optimization problem, which allows clients to conduct multiple local updates before communication, thus reducing communication cost and yielding better performance under differential privacy (DP). We provide a client-level DP mechanism for our framework to protect user privacy. Moreover, we show that a byproduct of VIM is that the weights of the learned heads reflect the importance of local clients. We conduct extensive evaluations on four vertical FL datasets and show that VIM achieves significantly higher performance and faster convergence than the state of the art. We also explicitly evaluate the importance of local clients and show that VIM enables functionalities such as client-level explanation and client denoising. We hope this work sheds light on a new way of effective VFL training and understanding.
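A minimal sketch of the multiple-heads structure may help: each client encodes its own feature partition, and the server keeps one head per client so that per-client contributions remain separable (the head weights are what VIM later reads as client importance). The ADMM splitting, local update schedule, and DP noise are omitted, and the class names below are illustrative assumptions rather than the paper's code.

```python
# Simplified multi-head vertical-FL forward pass: each client holds a feature
# partition and a local encoder; the server keeps one linear head per client.
import torch
import torch.nn as nn


class ClientEncoder(nn.Module):
    def __init__(self, in_dim: int, emb_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU())

    def forward(self, x):                   # x: this client's partial features
        return self.net(x)


class MultiHeadServer(nn.Module):
    def __init__(self, emb_dim: int, num_classes: int, num_clients: int):
        super().__init__()
        # One head per client; head weight norms can later be read as client importance.
        self.heads = nn.ModuleList([nn.Linear(emb_dim, num_classes) for _ in range(num_clients)])

    def forward(self, client_embeddings):   # list of per-client embeddings
        return sum(head(z) for head, z in zip(self.heads, client_embeddings))


# Toy round: two clients with 3 and 5 features each, batch of 8 samples, 4 classes.
encoders = [ClientEncoder(3, 16), ClientEncoder(5, 16)]
server = MultiHeadServer(16, 4, num_clients=2)
xs = [torch.randn(8, 3), torch.randn(8, 5)]
logits = server([enc(x) for enc, x in zip(encoders, xs)])
print(logits.shape)  # torch.Size([8, 4])
```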
-
Abstract. The interactions between aerosols and ice clouds represent one of the largest uncertainties in global radiative forcing from pre-industrial time to the present. In particular, the impact of aerosols on ice crystal effective radius (Rei), which is a key parameter determining ice clouds' net radiative effect, is highly uncertain due to limited and conflicting observational evidence. Here we investigate the effects of aerosols on Rei under different meteorological conditions using 9-year satellite observations. We find that the responses of Rei to aerosol loadings are modulated by water vapor amount in conjunction with several other meteorological parameters. While there is a significant negative correlation between Rei and aerosol loading in moist conditions, consistent with the "Twomey effect" for liquid clouds, a strong positive correlation between the two occurs in dry conditions. Simulations based on a cloud parcel model suggest that water vapor modulates the relative importance of different ice nucleation modes, leading to the opposite aerosol impacts between moist and dry conditions. When ice clouds are decomposed into those generated from deep convection and those formed in situ, the water vapor modulation remains in effect for both ice cloud types, although the sensitivities of Rei to aerosols differ noticeably between them due to distinct formation mechanisms. The water vapor modulation can largely explain the difference in the responses of Rei to aerosol loadings in various seasons. A proper representation of the water vapor modulation is essential for an accurate estimate of aerosol–cloud radiative forcing produced by ice clouds.
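The stratified correlation analysis described above can be sketched in a few lines: bin observations by column water vapor and correlate Rei with aerosol loading within each regime. The arrays below are synthetic placeholders standing in for the satellite retrievals, so the numbers carry no scientific meaning; only the procedure is illustrated.

```python
# Sketch of the moist/dry stratification: correlate Rei with aerosol loading
# separately in high and low water-vapor regimes. All data here are synthetic.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 5000
aod = rng.lognormal(mean=-2.0, sigma=0.8, size=n)          # aerosol optical depth (placeholder)
water_vapor = rng.uniform(0.0, 60.0, size=n)               # column water vapor, mm (placeholder)
rei = 30 + 0.1 * water_vapor + rng.normal(0, 3, size=n)    # ice crystal effective radius, micron

moist = water_vapor > np.percentile(water_vapor, 75)
dry = water_vapor < np.percentile(water_vapor, 25)

for label, mask in [("moist", moist), ("dry", dry)]:
    r, p = pearsonr(np.log(aod[mask]), rei[mask])
    print(f"{label}: r = {r:+.2f} (p = {p:.1e})")
```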