

Search for: All records

Creators/Authors contains: "Puli, Aahlad"


  1. Feature attributions attempt to highlight which inputs drive predictive power. Good attributions or explanations are thus those whose selected inputs retain this predictive power; accordingly, evaluations of explanations score how well the explained inputs predict. However, for a class of explanations called encoding explanations, evaluations produce scores better than what appears possible from the values in the explanation alone. Probing for encoding remains a challenge because there is no general characterization of what provides the extra predictive power. We develop a definition of encoding that identifies this extra predictive power via conditional dependence and show that the definition fits existing examples of encoding. This definition implies that, in contrast to encoding explanations, non-encoding explanations contain all the informative inputs used to produce the explanation, giving them a "what you see is what you get" property that makes them transparent and simple to use. Next, we prove that existing scores (ROAR, FRESH, EVAL-X) do not rank non-encoding explanations above encoding ones, and we develop STRIPE-X, which ranks them correctly. After empirically demonstrating the theoretical insights, we use STRIPE-X to show that, despite prompting an LLM to produce non-encoding explanations for a sentiment analysis task, the LLM-generated explanations encode.
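To make the encoding phenomenon concrete, here is a toy sketch (the data-generating process and all names are hypothetical, not from the paper): an explanation that reveals a different pure-noise feature depending on the label leaks the label through the selection pattern itself, so an evaluation classifier scores the masked input far above what the feature values alone could support.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)      # binary label
X = rng.normal(size=(n, 3))         # three features, all pure noise w.r.t. y

# An "encoding" explanation: reveal feature 1 when y == 1, feature 2 when y == 0.
# The choice of which feature to show encodes the label.
mask = np.zeros_like(X)
mask[y == 1, 1] = 1.0
mask[y == 0, 2] = 1.0
shown = X * mask                    # masked input seen by the evaluator

clf = DecisionTreeClassifier(random_state=0)
acc_masked = clf.fit(shown[:1000], y[:1000]).score(shown[1000:], y[1000:])
acc_full = DecisionTreeClassifier(random_state=0).fit(X[:1000], y[:1000]).score(X[1000:], y[1000:])
print(acc_masked, acc_full)  # masked input predicts y well; the raw noise does not
```

The masked accuracy is high even though every feature is independent of the label, which is exactly the extra predictive power a conditional-dependence definition of encoding is meant to flag.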
  2. Contrastive learning methods, such as CLIP, leverage naturally paired data (for example, images and their corresponding text captions) to learn general representations that transfer efficiently to downstream tasks. While such approaches are generally applied to two modalities, domains such as robotics, healthcare, and video need to support many types of data at once. We show that the pairwise application of CLIP fails to capture joint information between modalities, thereby limiting the quality of the learned representations. To address this issue, we present Symile, a simple contrastive learning approach that captures higher-order information between any number of modalities. Symile provides a flexible, architecture-agnostic objective for learning modality-specific representations. To develop Symile's objective, we derive a lower bound on total correlation and show that Symile representations for any set of modalities form a sufficient statistic for predicting the remaining modalities. Symile outperforms pairwise CLIP, even with modalities missing in the data, on cross-modal classification and retrieval across several experiments, including on an original multilingual dataset of 33M image, text, and audio samples and a clinical dataset of chest X-rays, electrocardiograms, and laboratory measurements. All datasets and code used in this work are publicly available.
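A toy illustration of why pairwise objectives can miss joint structure (a simplified sketch in the spirit of Symile's higher-order scoring, not the released implementation): when a third modality is an elementwise product c = a ⊙ b, every pairwise similarity is uninformative, yet a trilinear score based on the multilinear inner product ⟨a, b, c⟩ = Σ_i a_i b_i c_i identifies matched triples.

```python
import numpy as np

def contrastive_ce(logits):
    # cross-entropy with the matched (diagonal) item as the positive
    logits = logits - logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))

def mip_logits(anchor, x, y):
    # logits[i, j] = <anchor_i, x_j, y_j> = sum_d anchor[i,d] * x[j,d] * y[j,d]
    return anchor @ (x * y).T

def trilinear_loss(a, b, c):
    # contrast matched triples against in-batch negatives, each modality as anchor
    return np.mean([contrastive_ce(mip_logits(a, b, c)),
                    contrastive_ce(mip_logits(b, a, c)),
                    contrastive_ce(mip_logits(c, a, b))])

def pairwise_clip_loss(a, b, c):
    # CLIP-style contrastive loss applied to each pair separately, averaged
    return np.mean([contrastive_ce(a @ b.T),
                    contrastive_ce(a @ c.T),
                    contrastive_ce(b @ c.T)])

rng = np.random.default_rng(0)
batch, dim = 8, 32
a = rng.choice([-1.0, 1.0], size=(batch, dim))
b = rng.choice([-1.0, 1.0], size=(batch, dim))
c = a * b   # c depends on a and b only jointly (XOR-like structure)

loss_tri = trilinear_loss(a, b, c)
loss_pair = pairwise_clip_loss(a, b, c)
print(loss_tri, loss_pair)  # trilinear loss is near zero; pairwise loss stays high
```

The pairwise scores see no dependence between any two of the three modalities, so the pairwise loss stays near log(batch), while the trilinear score cleanly separates matched triples from in-batch negatives.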
  3. Abstract Anomaly, or out-of-distribution, detection is a promising tool for aiding discoveries of new particles or processes in particle physics. In this work, we identify and address two overlooked opportunities to improve anomaly detection (AD) for high-energy physics. First, rather than train a generative model on the single most dominant background process, we build detection algorithms using representation learning from multiple background types, thus taking advantage of more information to improve estimation of what is relevant for detection. Second, we generalize decorrelation to the multi-background setting, thus directly enforcing a more complete definition of robustness for AD. We demonstrate the benefit of the proposed robust multi-background AD algorithms on a high-dimensional dataset of particle decays at the Large Hadron Collider. 
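The multi-background idea can be caricatured with a generic baseline (an illustration of the general principle only, not the authors' algorithm): a classifier trained to separate several background processes yields an anomaly score, here the negative maximum predicted probability, that flags events unlike any background.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# two background processes as 2-D Gaussian blobs (stand-ins for real event features)
bg1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(500, 2))
bg2 = rng.normal(loc=[4.0, 0.0], scale=0.5, size=(500, 2))
X_bg = np.vstack([bg1, bg2])
y_bg = np.repeat([0, 1], 500)

clf = LogisticRegression().fit(X_bg, y_bg)

def anomaly_score(x):
    # confident assignment to some background -> low score; confusion -> high score
    return -clf.predict_proba(x).max(axis=1)

signal = rng.normal(loc=[2.0, 3.0], scale=0.3, size=(50, 2))  # unlike either background
print(anomaly_score(X_bg).mean(), anomaly_score(signal).mean())
```

The paper's decorrelation across backgrounds is omitted here; this only illustrates why representations informed by multiple background types carry more information than a model of the single dominant one.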
  4. Abstract Aims: Myocardial infarction and heart failure are major cardiovascular diseases that affect millions of people in the USA, with morbidity and mortality highest among patients who develop cardiogenic shock. Early recognition of cardiogenic shock allows prompt implementation of treatment measures. Our objective is to develop a new dynamic risk score, called CShock, to improve early detection of cardiogenic shock in the cardiac intensive care unit (ICU). Methods and results: We developed and externally validated a deep learning-based risk stratification tool, called CShock, for patients admitted to the cardiac ICU with acute decompensated heart failure and/or myocardial infarction to predict the onset of cardiogenic shock. We prepared a cardiac ICU dataset using the Medical Information Mart for Intensive Care-III database, annotating it with physician-adjudicated outcomes. This dataset, which consisted of 1500 patients, 204 of whom had cardiogenic/mixed shock, was then used to train CShock. The features used to train the model included patient demographics, cardiac ICU admission diagnoses, routinely measured laboratory values and vital signs, and relevant features manually extracted from echocardiogram and left heart catheterization reports. We externally validated the risk model on the New York University (NYU) Langone Health cardiac ICU database, which was also annotated with physician-adjudicated outcomes. The external validation cohort consisted of 131 patients, 25 of whom experienced cardiogenic/mixed shock. CShock achieved an area under the receiver operating characteristic curve (AUROC) of 0.821 (95% CI 0.792–0.850). CShock was externally validated in the more contemporary NYU cohort and achieved an AUROC of 0.800 (95% CI 0.717–0.884), demonstrating its generalizability to other cardiac ICUs. Based on Shapley values, an elevated heart rate is the most predictive feature of cardiogenic shock development. The other predictors in the top 10 are an admission diagnosis of myocardial infarction with ST-segment elevation, an admission diagnosis of acute decompensated heart failure, the Braden Scale, the Glasgow Coma Scale, blood urea nitrogen, systolic blood pressure, serum chloride, serum sodium, and arterial blood pH. Conclusion: The novel CShock score has the potential to provide automated detection and early warning of cardiogenic shock and to improve outcomes for millions of patients who suffer from myocardial infarction and heart failure.
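The reported metric, an AUROC with a 95% CI, can be reproduced generically on synthetic scores (the numbers below are stand-ins, not the CShock data); a percentile bootstrap is one common way to obtain such an interval.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# synthetic cohort mirroring the external validation size: 131 patients, 25 events
y = np.zeros(131, dtype=int)
y[:25] = 1
scores = np.where(y == 1, rng.normal(1.2, 1.0, y.size), rng.normal(0.0, 1.0, y.size))

auroc = roc_auc_score(y, scores)

boot = []
for _ in range(2000):
    idx = rng.integers(0, y.size, y.size)
    if y[idx].min() == y[idx].max():     # a resample must contain both classes
        continue
    boot.append(roc_auc_score(y[idx], scores[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUROC {auroc:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

With only 25 events the interval is wide, which is consistent with the broader CI reported for the 131-patient external cohort than for the larger training cohort.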