skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Award ID contains: 2007481

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract Current complex prediction models are the result of fitting deep neural networks, graph convolutional networks or transducers to a set of training data. A key challenge with these models is that they are highly parameterized, which makes describing and interpreting the prediction strategies difficult. We use topological data analysis to transform these complex prediction models into a simplified topological view of the prediction landscape. The result is a map of the predictions that enables inspection of the model results with more specificity than dimensionality-reduction methods such as tSNE and UMAP. The methods scale up to large datasets across different domains. We present a case study of a transformer-based model previously designed to predict expression levels of a piece of DNA in thousands of genomic tracks. When the model is used to study mutations in theBRCA1gene, our topological analysis shows that it is sensitive to the location of a mutation and the exon structure ofBRCA1in ways that cannot be found with tools based on dimensionality reduction. Moreover, the topological framework offers multiple ways to inspect results, including an error estimate that is more accurate than model uncertainty. Further studies show how these ideas produce useful results in graph-based learning and image classification. 
    more » « less
  2. A 2D tandem mass spectrum of a peptide mixture is organized into a graphical structure. A graph partitioning algorithm extracts the fragmentation trees of individual peptides, and ade novosequencing algorithm identifies the peptide sequences. 
    more » « less
    Free, publicly-accessible full text available October 8, 2026
  3. Saxena, Akrati (Ed.)
    We introduce a spatial graph and hypergraph model that smoothly interpolates between a graph with purely pairwise edges and a graph where all connections are in large hyperedges. The key component is a spatial clustering resolution parameter that varies between assigning all the vertices in a spatial region to individual clusters, resulting in the pairwise case, to assigning all the vertices in a spatial region to a single cluster, which results in the large hyperedge case. A key component of this model is that the spatial structure is invariant to the choice of hyperedges. Consequently, this model enables us to study clustering coefficients, graph diffusion, and epidemic spread and how their behavior changes as a function of the higher-order structure in the network with a fixed spatial substrate. We hope that our model will find future uses to distill or explain other behaviors in higher-order networks. 
    more » « less
    Free, publicly-accessible full text available September 12, 2026
  4. We demonstrate a spatial hypergraph model that allows us to vary the amount of higher-order structure in the generated hypergraph. Specifically, we can vary from a model that is a pure pairwise graph into a model that is almost a pure hypergraph. We use this spatial hypergraph model to study higher-order effects in epidemic spread. We use a susceptible-infected-recovered-susceptible (SIRS) epidemic model designed to mimic the spread of an airborne pathogen. We study three types of airborne effects that emulate airborne dilution effects. For the scenario of linear dilution, which roughly corresponds to constant ventilation per person as required in many building codes, we see essentially no impact from introducing small hyperedges up to size 15 whereas we do see effects when the hyperedge set is dominated by large hyperedges. Specifically, we track the mean infections after the SIRS epidemic has run for a while so it is in a ?steady state? and find the mean is higher in the large hyperedge regime whereas it is unchanged from pairwise to small hyperedge regime. 
    more » « less