This content will become publicly available on May 29, 2026

Title: AI/ML curation of AI/ML training datasets
Award ID(s):
2030234
PAR ID:
10598775
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Sklivanitis, George; Markopoulos, Panagiotis; Ouyang, Bing
Publisher / Repository:
SPIE
Date Published:
ISBN:
9781510687097
Page Range / eLocation ID:
1
Format(s):
Medium: X
Location:
Orlando, United States
Sponsoring Org:
National Science Foundation
More Like this
  1. The Open Radio Access Network (RAN) paradigm is transforming cellular networks into a system of disaggregated, virtualized, and software-based components. These self-optimize the network through programmable, closed-loop control, leveraging Artificial Intelligence (AI) and Machine Learning (ML) routines. In this context, Deep Reinforcement Learning (DRL) has shown great potential in addressing complex resource allocation problems. However, DRL-based solutions are inherently hard to explain, which hinders their deployment and use in practice. In this paper, we propose EXPLORA, a framework that provides explainability of DRL-based control solutions for the Open RAN ecosystem. EXPLORA synthesizes network-oriented explanations based on an attributed graph that produces a link between the actions taken by a DRL agent (i.e., the nodes of the graph) and the input state space (i.e., the attributes of each node). This novel approach allows EXPLORA to explain models by providing information on the wireless context in which the DRL agent operates. EXPLORA is also designed to be lightweight for real-time operation. We prototype EXPLORA and test it experimentally on an O-RAN-compliant near-real-time RIC deployed on the Colosseum wireless network emulator. We evaluate EXPLORA for agents trained for different purposes and showcase how it generates clear network-oriented explanations. We also show how explanations can be used to perform informative and targeted intent-based action steering and achieve median transmission bitrate improvements of 4% and tail improvements of 10%. 
  2. With the growing adoption of privacy-preserving machine learning algorithms, such as Differentially Private Stochastic Gradient Descent (DP-SGD), training or fine-tuning models on private datasets has become increasingly prevalent. This shift has led to the need for models offering varying privacy guarantees and utility levels to satisfy diverse user requirements. Managing numerous versions of large models introduces significant operational challenges, including increased inference latency, higher resource consumption, and elevated costs. Model deduplication is a technique widely used by many model serving and database systems to support high-performance and low-cost inference queries and model diagnosis queries. However, none of the existing model deduplication works has considered privacy, leading to unbounded aggregation of privacy costs for certain deduplicated models and inefficiencies when applied to deduplicate DP-trained models. We formalize the problem of deduplicating DP-trained models for the first time and propose a novel privacy- and accuracy-aware deduplication mechanism to address the problem. We developed a greedy strategy to select and assign base models to target models to minimize storage and privacy costs. When deduplicating a target model, we dynamically schedule accuracy validations and apply the Sparse Vector Technique to reduce the privacy costs associated with private validation data. Compared to baselines, our approach improved the compression ratio by up to 35× for individual models (including large language models and vision transformers). We also observed up to 43× inference speedup due to the reduction of I/O operations. 