Search for: All records

Award ID contains: 2031883

Total Resources: 46

Resource Type
Conference Paper: 18
Conference Proceeding: 0
Dataset: 0
Journal Article: 28
Workshop Report: 0

Availability
Full Text / Resource Available: 42
Citation Only: 4

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Global eigenvalue fluctuations of random biregular bipartite graphs

https://doi.org/10.1142/S2010326323500041

Dumitriu, Ioana ; Zhu, Yizhe ( July 2023 , Random Matrices: Theory and Applications)

We compute the eigenvalue fluctuations of uniformly distributed random biregular bipartite graphs with fixed and growing degrees for a large class of analytic functions. As a key step in the proof, we obtain a total variation distance bound for the Poisson approximation of the number of cycles and cyclically non-backtracking walks in random biregular bipartite graphs, which might be of independent interest. We also prove a semicircle law for random [Formula: see text]-biregular bipartite graphs when [Formula: see text]. As an application, we translate the results to adjacency matrices of uniformly distributed random regular hypergraphs.
Free, publicly-accessible full text available July 1, 2024
Fast Interpretable Greedy-Tree Sums (FIGS)

Tan, Yan Shuo ; Singh, Chandan ; Nasseri, Keyan ; Agarwal, Abhineet ; Duncan, James ; Ronen, Omer ; Epland, Matthew ; Kornblith, Aaron ; Yu, Bin ( July 2023 , arXiv.org)

Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in high-stakes domains such as medicine. In such settings, practitioners often use highly interpretable decision tree models, but these suffer from inductive bias against additive structure. To overcome this bias, we propose Fast Interpretable Greedy-Tree Sums (FIGS), which generalizes the CART algorithm to simultaneously grow a flexible number of trees in summation. By combining logical rules with addition, FIGS is able to adapt to additive structure while remaining highly interpretable. Extensive experiments on real-world datasets show that FIGS achieves state-of-the-art prediction performance. To demonstrate the usefulness of FIGS in high-stakes domains, we adapt FIGS to learn clinical decision instruments (CDIs), which are tools for guiding clinical decision-making. Specifically, we introduce a variant of FIGS known as G-FIGS that accounts for the heterogeneity in medical data. G-FIGS derives CDIs that reflect domain knowledge and enjoy improved specificity (by up to 20% over CART) without sacrificing sensitivity or interpretability. To provide further insight into FIGS, we prove that FIGS learns components of additive models, a property we refer to as disentanglement. Further, we show (under oracle conditions) that unconstrained tree-sum models leverage disentanglement to generalize more efficiently than single decision tree models when fitted to additive regression functions. Finally, to avoid overfitting with an unconstrained number of splits, we develop Bagging-FIGS, an ensemble version of FIGS that borrows the variance reduction techniques of random forests. Bagging-FIGS enjoys competitive performance with random forests and XGBoost on real-world datasets.
Free, publicly-accessible full text available July 1, 2024
The quarks of attention: Structure and capacity of neural attention building blocks

https://doi.org/10.1016/j.artint.2023.103901

Baldi, Pierre ; Vershynin, Roman ( June 2023 , Artificial Intelligence)

Free, publicly-accessible full text available June 1, 2024
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data

Frei, Spencer ; Vardi, Gal ; Bartlett, Peter L. ; Srebro, Nathan ; Hu, Wei ( May 2023 , Proceedings of ICLR 2023)

Free, publicly-accessible full text available May 1, 2024
AVIDA: An alternating method for visualizing and integrating data

https://doi.org/10.1016/j.jocs.2023.101998

Dover, Kathryn ; Cang, Zixuan ; Ma, Anna ; Nie, Qing ; Vershynin, Roman ( April 2023 , Journal of Computational Science)

Full Text Available
AVIDA: Alternating method for Visualizing and Integrating Data

Dover, Kathryn ; Cang, Zixuan ; Ma, Anna ; Nie, Qing ; Vershynin, Roman ( April 2023 , Journal of Computational Science)

High-dimensional multimodal data arises in many scientific fields. The integration of multimodal data becomes challenging when there is no known correspondence between the samples and the features of different datasets. To tackle this challenge, we introduce AVIDA, a framework for simultaneously performing data alignment and dimension reduction. In the numerical experiments, Gromov-Wasserstein optimal transport and t-distributed stochastic neighbor embedding are used as the alignment and dimension reduction modules, respectively. We show that AVIDA correctly aligns high-dimensional datasets without common features on four synthesized datasets and two real multimodal single-cell datasets. Compared to several existing methods, we demonstrate that AVIDA better preserves the structures of individual datasets, especially distinct local structures in the joint low-dimensional visualization, while achieving comparable alignment performance. Such a property is important in multimodal single-cell data analysis, as some biological processes are uniquely captured by one of the datasets. In general applications, other methods can be used for the alignment and dimension reduction modules.
Full Text Available
Sparse recovery properties of discrete random matrices

Ferber, Asaf ; Sah, Ashwin ; Sawhney, Mehtaab ; Zhu, Yizhe ( January 2023 , Combinatorics, Probability and Computing)

Full Text Available
MDI+: A Flexible Random Forest-Based Feature Importance Framework

Agarwal, Abhineet ; Kenney, Ana M. ; Tan, Yan Shuo ; Tang, Tiffany M. ; Yu, Bin ( January 2023 , arXiv.org)

Full Text Available
A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors

Ghosh, Nikhil ; Belkin, Mikhail ( January 2023 , arXiv.org)

Full Text Available
Sparse random hypergraphs: Non-backtracking spectra and community detection

Stephan, Ludovic ; Zhu, Yizhe ( July 2022 , arXiv.org)

Full Text Available
