NSF PAR Search | NSF Public Access Repository

Evaluation and Incident Prevention in an Enterprise AI Assistant

https://doi.org/10.1609/aaai.v39i28.35161

Maharaj, Akash V; Arbour, David; Lee, Daniel; Bhattacharya, Uttaran; Rao, Anup; Zane, Austin; Feller, Avi; Qian, Kun; Li, Yunyao (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Enterprise AI Assistants are increasingly deployed in domains where accuracy is paramount, making each erroneous output a potentially significant incident. This paper presents a comprehensive framework for monitoring, benchmarking, and continuously improving such complex, multi-component systems under active development by multiple teams. Our approach encompasses three key elements: (1) a hierarchical "severity" framework for incident detection that identifies and categorizes errors while attributing component-specific error rates, facilitating targeted improvements; (2) a scalable and principled methodology for benchmark construction, evaluation, and deployment, designed to accommodate multiple development teams, mitigate overfitting risks, and assess the downstream impact of system modifications; and (3) a continual improvement strategy leveraging multidimensional evaluation, enabling the identification and implementation of diverse enhancement opportunities. By adopting this holistic framework, organizations can systematically enhance the reliability and performance of their AI Assistants, ensuring their efficacy in critical enterprise environments. We conclude by discussing how this multifaceted evaluation approach opens avenues for various classes of enhancements, paving the way for more robust and trustworthy AI systems.
Free, publicly-accessible full text available April 11, 2026
XOR Lemmas for Communication via Marginal Information

https://doi.org/10.1145/3618260.3649726

Iyer, Siddharth; Rao, Anup (June 2024, ACM)

Full Text Available
RECON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories

https://doi.org/10.1007/978-3-031-73202-7_17

Lu, Chen-Yi; Agarwal, Shubham; Tanjim, Md Mehrab; Mahadik, Kanak; Rao, Anup; Mitra, Subrata; Saini, Shiv Kumar; Bagchi, Saurabh; Chaterji, Somali (November 2024, Springer Nature Switzerland)

Full Text Available
An XOR Lemma for Deterministic Communication Complexity

https://doi.org/10.1109/FOCS61266.2024.00034

Iyer, Siddharth; Rao, Anup (October 2024, IEEE)

Full Text Available
A criterion for decoding on the binary symmetric channel

https://doi.org/10.3934/amc.2024007

Rao, Anup; Sprumont, Oscar (January 2024, Advances in Mathematics of Communications)

Full Text Available
Finite Population Regression Adjustment and Non-asymptotic Guarantees for Treatment Effect Estimation

Ghadiri, Mehrdad; Arbour, David; Mai, Tung; Musco, Cameron; Rao, Anup B (December 2023, Conference on Neural Information Processing Systems (NeurIPS) 2023)

Full Text Available
Optimal Sketching Bounds for Sparse Linear Regression

Mai, Tung; Munteanu, Alexander; Musco, Cameron; Rao, Anup B.; Schwiegelshohn, Chris; Woodruff, David P. (January 2023, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Coresets for Classification - Simplified and Strengthened

Mai, Tung; Musco, Cameron; Rao, Anup (December 2021, Advances in neural information processing systems)

We give relative error coresets for training linear classifiers with a broad class of loss functions, including the logistic loss and hinge loss. Our construction achieves $(1\pm \epsilon)$ relative error with $\tilde O(d \cdot \mu_y(X)^2/\epsilon^2)$ points, where $\mu_y(X)$ is a natural complexity measure of the data matrix $X \in \mathbb{R}^{n \times d}$ and label vector $y \in \{-1,1\}^n$, introduced by Munteanu et al. (2018). Our result is based on subsampling data points with probabilities proportional to their $\ell_1$ Lewis weights. It significantly improves on existing theoretical bounds and performs well in practice, outperforming uniform subsampling along with other importance sampling methods. Our sampling distribution does not depend on the labels, so it can be used for active learning. It also does not depend on the specific loss function, so a single coreset can be used in multiple training scenarios.
Full Text Available
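
The abstract of "Coresets for Classification - Simplified and Strengthened" describes subsampling points with probabilities proportional to their $\ell_1$ Lewis weights. A minimal sketch of that idea follows; it is not the authors' implementation. It assumes the standard fixed-point iteration for approximating $\ell_1$ Lewis weights, and the function names `l1_lewis_weights` and `sample_coreset`, the iteration count, and the inverse-probability reweighting are illustrative choices not taken from the record.

```python
import numpy as np


def l1_lewis_weights(X, n_iter=30):
    """Approximate the l1 Lewis weights of the rows of X.

    Assumed scheme: the fixed-point iteration
    w_i <- sqrt(x_i^T (X^T W^{-1} X)^{-1} x_i), with W = diag(w).
    """
    n, d = X.shape
    w = np.full(n, d / n)  # uniform start; Lewis weights sum to d for full-rank X
    for _ in range(n_iter):
        # d x d Gram matrix of the rows, each scaled by 1 / w_i
        M = X.T @ (X / w[:, None])
        # Quadratic form x_i^T M^+ x_i for every row i
        q = np.einsum("ij,ij->i", X @ np.linalg.pinv(M), X)
        # Small floor keeps the next division well defined for all-zero rows
        w = np.sqrt(np.maximum(q, 1e-12))
    return w


def sample_coreset(X, m, rng):
    """Draw m row indices with probability proportional to the Lewis weights.

    The labels are never used, matching the abstract's remark that the
    sampling distribution is label-independent.
    """
    w = l1_lewis_weights(X)
    p = w / w.sum()
    idx = rng.choice(X.shape[0], size=m, replace=True, p=p)
    # Inverse-probability weights so a weighted sum over the sample is an
    # unbiased estimate of the same sum over all n points
    weights = 1.0 / (m * p[idx])
    return idx, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 10))
    idx, weights = sample_coreset(X, m=100, rng=rng)
    print(idx[:5], weights[:5])
```

Because the sampled indices and weights depend only on `X`, the same subset could in principle be reused to fit a logistic or a hinge loss model by passing `weights` as per-sample weights to the training routine.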
Sample Constrained Treatment Effect Estimation

Addanki, Raghavendra; Arbour, David; Mai, Tung; Musco, Cameron; Rao, Anup B. (January 2022, Conference on Neural Information Processing Systems (NeurIPS))

Full Text Available
Fundamental Tradeoffs in Distributionally Adversarial Training

Mehrabi, Mohammad; Javanmard, Adel; Rossi, Ryan A.; Rao, Anup; Mai, Tung (January 2021, International Conference on Machine Learning, PMLR)

Full Text Available
