NSF PAR Search | NSF Public Access Repository

Anytime Fairness Guarantees in Stochastic Combinatorial MABs: A Novel Learning Framework

Pokhriyal, Subham; Jain, Shweta; Ghalme, Ganesh; Aggarwal, Vaneet (May 2025, AAMAS 2025)

Free, publicly-accessible full text available May 22, 2026
Constrained Reinforcement Learning for Fair and Environmentally Efficient Traffic Signal Controllers

https://doi.org/10.1145/3676169

Haydari, Ammar; Aggarwal, Vaneet; Zhang, Michael; Chuah, Chen-Nee (March 2025, ACM Journal on Autonomous Transportation Systems)

Traffic signal controllers (TSCs) play a crucial role in managing traffic flow in urban areas. Recently, reinforcement learning (RL) models have received great attention for TSC, with promising results. However, these RL-TSC models still need to be improved for real-world deployment due to limited exploration of performance metrics such as fair traffic scheduling and air quality impact. In this work, we introduce a constrained multi-objective RL model that minimizes multiple constrained objectives while achieving a higher expected reward. Furthermore, our proposed RL strategy integrates the peak and average constraint models into the RL problem formulation with maximum entropy off-policy models. We applied this strategy to a single TSC and a network of TSCs. As part of this constrained RL-TSC formulation, we discuss fairness and air quality parameters as constraints for the closed-loop control system optimization model at TSCs, called FAirLight. Our experimental analysis shows that the proposed FAirLight achieves good traffic flow performance in terms of average waiting time while being fair and environmentally friendly. Our method outperforms the baseline models and allows a more comprehensive view of RL-TSC regarding its applicability to the real world.
Free, publicly-accessible full text available March 31, 2026
Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment

https://doi.org/10.1609/aaai.v39i26.34979

Trivedi, Prashant; Chakraborty, Souradip; Reddy, Avinash; Aggarwal, Vaneet; Bedi, Amrit Singh; Atia, George K (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

The alignment of large language models (LLMs) with human values is critical as these models become increasingly integrated into various societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters, but these approaches are often computationally expensive and impractical when models are frozen or inaccessible for parameter modification. In contrast, prompt optimization is a viable alternative to RLHF for LLM alignment. While the existing literature has shown the empirical promise of prompt optimization, its theoretical underpinning remains under-explored. We address this gap by formulating prompt optimization as an optimization problem and providing theoretical insights into the optimality of such a framework. To analyze the performance of prompt optimization, we study theoretical suboptimality bounds and provide insights into how prompt optimization depends upon the given prompter and target model. We also provide empirical validation through experiments on various datasets, demonstrating that prompt optimization can effectively align LLMs, even when parameter fine-tuning is not feasible.
Free, publicly-accessible full text available April 11, 2026
From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization

Pedramfar, Mohammad; Aggarwal, Vaneet (December 2024, NeurIPS 2024)

Free, publicly-accessible full text available December 14, 2025
Gradient Methods for Online DR-Submodular Maximization with Stochastic Long-Term Constraints

Nie, Guanyu; Aggarwal, Vaneet; Quinn, Christopher (December 2024, NeurIPS 2024)

Free, publicly-accessible full text available December 12, 2025
Gradient Methods for Online DR-Submodular Maximization with Stochastic Long-Term Constraints

Nie, Guanyu; Aggarwal, Vaneet; Quinn, Christopher John (December 2024, The Thirty-Eighth Annual Conference on Neural Information Processing Systems)

In this paper, we consider the problem of online monotone DR-submodular maximization subject to long-term stochastic constraints. Specifically, at each round $$t\in [T]$$, after committing to an action $$\mathbf{x}_t$$, a random reward $$f_t(\mathbf{x}_t)$$ and an unbiased gradient estimate at that point, $$\widetilde{\nabla}f_t(\mathbf{x}_t)$$ (semi-bandit feedback), are revealed. Meanwhile, a budget of $$g_t(\mathbf{x}_t)$$, which is linear and stochastic, is consumed from the total allotted budget $$B_T$$. We propose a gradient ascent based algorithm that achieves $$\frac{1}{2}$$-regret of $$\mathcal{O}(\sqrt{T})$$ with $$\mathcal{O}(T^{3/4})$$ constraint violation with high probability. Moreover, when first-order full-information feedback is available, we propose an algorithm that achieves $$(1-1/e)$$-regret of $$\mathcal{O}(\sqrt{T})$$ with $$\mathcal{O}(T^{3/4})$$ constraint violation. These algorithms significantly improve over the state-of-the-art in terms of query complexity.
Free, publicly-accessible full text available December 16, 2025
Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization

Pedramfar, Mohammad; Nadew, Yididiya Y; Quinn, Christopher John; Aggarwal, Vaneet (May 2024, The Twelfth International Conference on Learning Representations)

Full Text Available
Reinforced Sequential Decision-Making for Sepsis Treatment: The PosNegDM Framework With Mortality Classifier and Transformer

https://doi.org/10.1109/JBHI.2024.3377214

Tamboli, Dipesh; Chen, Jiayu; Jotheeswaran, Kiran Pranesh; Yu, Denny; Aggarwal, Vaneet (May 2024, IEEE Journal of Biomedical and Health Informatics)

Full Text Available
Combinatorial Stochastic-Greedy Bandit

https://doi.org/10.1609/aaai.v38i11.29093

Fourati, Fares; Quinn, Christopher John; Alouini, Mohamed-Slim; Aggarwal, Vaneet (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

We propose a novel combinatorial stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of n arms at each time step t in [T] is observed. SGB adopts an optimized stochastic-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms. Unlike existing methods that explore the entire set of unselected base arms during each selection step, our SGB algorithm samples only an optimized proportion of unselected arms and selects actions from this subset. We prove that our algorithm achieves a (1-1/e)-regret bound of O(n^(1/3) k^(2/3) T^(2/3) log(T)^(2/3)) for monotone stochastic submodular rewards, which outperforms the state-of-the-art in terms of the cardinality constraint k. Furthermore, we empirically evaluate the performance of our algorithm in the context of online constrained social influence maximization. Our results demonstrate that our proposed approach consistently outperforms the other algorithms, increasing the performance gap as k grows.
Full Text Available
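
The subset-sampling idea described in the Combinatorial Stochastic-Greedy Bandit abstract above can be illustrated with a minimal offline sketch. This is not the authors' SGB algorithm: it assumes an exact value oracle instead of noisy bandit feedback, it omits the explore-then-commit phase, and the sample size `(n / k) * log(1 / eps)` is borrowed from the well-known offline stochastic-greedy heuristic rather than the "optimized proportion" the paper derives. The function names, the `eps` parameter, and the toy coverage reward are all hypothetical.

```python
import math
import random


def stochastic_greedy(n, k, value, eps=0.1, seed=0):
    """Pick k of n arms; each step scans only a random sample of unselected arms."""
    rng = random.Random(seed)
    # Assumed sample size (offline stochastic-greedy rule), not the paper's choice.
    sample_size = max(1, math.ceil((n / k) * math.log(1.0 / eps)))
    selected = []
    remaining = list(range(n))
    for _ in range(k):
        # Sample a subset of the unselected arms instead of scanning all of them.
        candidates = rng.sample(remaining, min(sample_size, len(remaining)))
        current = value(selected)
        # Greedy step restricted to the sampled subset: largest marginal gain.
        best = max(candidates, key=lambda a: value(selected + [a]) - current)
        selected.append(best)
        remaining.remove(best)
    return selected


def coverage(arms):
    """Toy monotone submodular reward: number of distinct items covered."""
    covers = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 2, 3}, 4: {5, 6}, 5: {6}}
    covered = set()
    for a in arms:
        covered |= covers[a]
    return len(covered)


chosen = stochastic_greedy(n=6, k=2, value=coverage, eps=0.01)
print(chosen, coverage(chosen))
```

With a small `eps` the sample covers every unselected arm and the sketch reduces to plain greedy; larger `eps` values shrink the sample and trade some reward for fewer oracle calls per step.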
A Unified Approach for Maximizing Continuous DR-submodular Functions

Pedramfar, Mohammad; Quinn, Christopher; Aggarwal, Vaneet (December 2023, Advances in Neural Information Processing Systems 36 (NeurIPS 2023))

Full Text Available
