NSF PAR Search | NSF Public Access Repository

The Value of Out-of-distribution Data

De_Silva, Ashwin; Ramesh, Rahul; Priebe, Carey E; Chaudhari, Pratik; Vogelstein, Joshua T (December 2025, NeurIPS)

Full Text Available
Simple Lifelong Learning Machines

https://doi.org/10.1109/TPAMI.2025.3595364

Vogelstein, Joshua T; Dey, Jayanta; Helm, Hayden S; LeVine, Will; Mehta, Ronak D; Tomita, Tyler M; Xu, Haoyin; Geisa, Ali; Wang, Qingyang; van_de_Ven, Gido M; et al (November 2025, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Full Text Available
Minimizing and quantifying uncertainty in AI-informed decisions: Applications in medicine

https://doi.org/10.1073/pnas.2424203122

Curtis, Samuel D; Panda, Sambit; Li, Adam; Xu, Haoyin; Bai, Yuxin; Ogihara, Itsuki; O’Reilly, Eliza; Wang, Yuxuan; Dobbyn, Lisa; Popoli, Maria; et al (August 2025, Proceedings of the National Academy of Sciences)

AI is now a cornerstone of modern dataset analysis. In many real-world applications, practitioners are concerned with controlling specific kinds of errors, rather than minimizing the overall number of errors. For example, biomedical screening assays may primarily be concerned with limiting the number of false positives rather than false negatives. Quantifying uncertainty in AI-based predictions, and in particular those controlling specific kinds of errors, remains theoretically and practically challenging. We develop a strategy called multidimensional informed generalized hypothesis testing (MIGHT), which we prove accurately quantifies uncertainty and confidence given sufficient data, and concomitantly controls for particular error types. Our key insight was that it is possible to integrate canonical cross-validation and parametric calibration procedures within a nonparametric ensemble method. Simulations demonstrate that, whereas typical AI-based approaches cannot be trusted to obtain the truth, MIGHT can. We apply MIGHT to answer an open question in liquid biopsies using circulating cell-free DNA (ccfDNA) in individuals with or without cancer: Which biomarkers, or combinations thereof, can we trust? Performance estimates produced by MIGHT on ccfDNA data have coefficients of variation that are often orders of magnitude lower than those of other state-of-the-art algorithms such as support vector machines, random forests, and Transformers, while often also achieving higher sensitivity. We find that combinations of variable sets often decrease rather than increase sensitivity over the optimal single variable set because some variable sets add more noise than signal. This work demonstrates the importance of quantifying uncertainty and confidence, with theoretical guarantees, for the interpretation of real-world data.
Full Text Available
Fragmentation signatures in cancer patients resemble those of patients with vascular or autoimmune diseases

https://doi.org/10.1073/pnas.2426890122

Curtis, Samuel D; Liu, Tingshan; Bai, Yuxin; Wang, Yuxuan; Panda, Sambit; Li, Adam; Xu, Haoyin; O’Reilly, Eliza; Dobbyn, Lisa; Popoli, Maria; et al (August 2025, Proceedings of the National Academy of Sciences)

Multiple case-controlled studies have shown that analyzing fragmentation patterns in plasma cell-free DNA (cfDNA) can distinguish individuals with cancer from healthy controls. However, there have been few studies that investigate various types of cfDNA fragmentomics patterns in individuals with other diseases. We therefore developed a comprehensive statistic, called fragmentation signatures, that integrates the distributions of fragment positioning, fragment length, and fragment end-motifs in cfDNA. We found that individuals with venous thromboembolism, systemic lupus erythematosus, dermatomyositis, or scleroderma have cfDNA fragmentation signatures that closely resemble those found in individuals with advanced cancers. Furthermore, these signatures were highly correlated with increases in inflammatory markers in the blood. We demonstrate that these similarities in fragmentation signatures lead to high rates of false positives in individuals with autoimmune or vascular disease when evaluated using conventional binary classification approaches for multicancer earlier detection (MCED). To address this issue, we introduced a multiclass approach for MCED that integrates fragmentation signatures with protein biomarkers and achieves improved specificity in individuals with autoimmune or vascular disease while maintaining high sensitivity. Though these data put substantial limitations on the specificity of fragmentomics-based tests for cancer diagnostics, they also offer ways to improve the interpretability of such tests. Moreover, we expect these results will lead to a better understanding of the process—most likely inflammatory—from which abnormal fragmentation signatures are derived.
Full Text Available
Minimizing and quantifying uncertainty in AI-informed decisions: Applications in medicine

Curtis, Samuel D; Panda, Sambit; Li, Adam; Xu, Haoyin; Bai, Yuxin; Ogihara, Itsuki; O’Reilly, Eliza; Wang, Yuxuan; Dobbyn, Lisa; Popoli, Maria; et al (August 2025, PNAS nexus)

Full Text Available
Fragmentation signatures in cancer patients resemble those of patients with vascular or autoimmune diseases

Curtis, Samuel D; Liu, Tingshan; Bai, Yuxin; Wang, Yuxuan; Panda, Sambit; Li, Adam; Xu, Haoyin; O’Reilly, Eliza; Dobbyn, Lisa; Popoli, Maria; et al (August 2025, PNAS nexus)

Full Text Available
Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures

https://doi.org/10.1007/978-3-032-00800-8_32

Yu, Siyu; Qin, Zihan; Liu, Tingshan; Xu, Beiya; Vogelstein, R Jacob; Brown, Jason; Vogelstein, Joshua T (August 2025, Springer Nature Switzerland)

Full Text Available
ginjax: E(d)-Equivariant CNN for Tensor Images

https://doi.org/10.21105/joss.08129

Gregory, Wilson G; Wong, Kaze_W K; Hogg, David W; Villar, Soledad (August 2025, Journal of Open Source Software)

Full Text Available
Simple Lifelong Learning Machines

Dey, Jayanta; Vogelstein, Joshua T; Helm, Hayden S; LeVine, Will; Mehta, Ronak D; Tomita, Tyler M; Xu, Haoyin; Geisa, Ali; Wang, Qingyang; van_de_Ven, Gido M; et al (July 2025, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Full Text Available
Prospective Learning in Retrospect

Bai, Yuxin; Shuai, Cecelia; De_Silva, Ashwin; Yu, Siyu; Chaudhari, Pratik; Vogelstein, Joshua T (July 2025, arxiv.org)

Full Text Available
