Free, publicly-accessible full text available May 13, 2025
We provide a processed JSON version of the 3,234-page PDF document of Anthony Fauci's emails, released in 2021, to support a better understanding of the United States government response to the COVID-19 pandemic. The main JSON file contains 1289 email threads comprising 2761 emails, of which 101 are duplicates. For each email, we provide the sender, recipients, CC list, subject, body text, and time stamp (when available). We also provide a number of derived datasets stored in individual JSON files: 5 different types of derived email networks, 1 email hypergraph, 1 temporal graph, and 3 tensors. Details of the data conversion process, the construction of the derived datasets, and subsequent analyses can be found in an online technical report at https://arxiv.org/abs/2108.01239. Updated code for processing and analyzing the data can be found at https://github.com/nveldt/fauci-email.
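As a rough illustration of how a thread-structured email file could be turned into one of the derived networks, the sketch below counts threads and emails and builds weighted sender-to-recipient edges. The schema is an assumption made for this example only: a list of threads, each a list of email records with hypothetical keys `sender`, `recipients`, and `cc`. The actual layout of the released JSON files is documented in the technical report and repository linked above and may differ.

```python
import json
from collections import Counter

# ASSUMED schema (hypothetical, not taken from the released files):
# a list of threads, each thread a list of email dicts.
SAMPLE = json.loads("""
[
  [
    {"sender": "A", "recipients": ["B", "C"], "cc": ["D"]},
    {"sender": "B", "recipients": ["A"], "cc": []}
  ],
  [
    {"sender": "A", "recipients": ["B"], "cc": []}
  ]
]
""")


def summarize(threads):
    """Return (number of threads, total number of emails)."""
    return len(threads), sum(len(thread) for thread in threads)


def sender_recipient_counts(threads, include_cc=False):
    """Count how often each (sender, recipient) pair occurs.

    The result can be read as a weighted directed email network.
    """
    edges = Counter()
    for thread in threads:
        for email in thread:
            targets = list(email.get("recipients", []))
            if include_cc:
                targets += email.get("cc", [])
            for recipient in targets:
                edges[(email["sender"], recipient)] += 1
    return edges


if __name__ == "__main__":
    print(summarize(SAMPLE))
    print(sender_recipient_counts(SAMPLE, include_cc=True))
```

Whether CC recipients count as edges is a modeling choice; exposing it as a flag is one way several differently defined networks can be derived from the same threads.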
Research additionally supported by ARO Award W911NF-19-1-0057, ARO MURI, and NSF CAREER Award IIS-2045555, as well as NSF awards CCF-1909528, IIS-2007481, and the Sloan Foundation.
Modern graph or network datasets often contain rich structure that goes beyond simple pairwise connections between nodes. This calls for complex representations that can capture, for instance, edges of different types as well as so-called “higher-order interactions” that involve more than two nodes at a time. However, there are fewer rigorous methods for extracting insight from such representations. Here, we develop a computational framework for the problem of clustering hypergraphs with categorical edge labels (that is, different interaction types), where clusters correspond to groups of nodes that frequently participate in the same type of interaction. Our methodology is based on a combinatorial objective function that is related to correlation clustering on graphs but enables the design of much more efficient algorithms that also seamlessly generalize to hypergraphs. When there are only two label types, our objective can be optimized in polynomial time, using an algorithm based on minimum cuts. Minimizing our objective becomes NP-hard with more than two label types, but we develop fast approximation algorithms based on linear programming relaxations that have theoretical cluster quality guarantees. We demonstrate the efficacy of our algorithms and the scope of the model through problems in edge-label community detection, clustering with temporal data, and exploratory data analysis.
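To make the kind of objective described above concrete, the sketch below scores a node labeling of a hypergraph with categorical edge labels. It assumes one plausible reading of the objective: every node is assigned a single label, and a hyperedge counts as a mistake unless all of its nodes carry that hyperedge's label. The function names and the majority-vote baseline are illustrative assumptions. The baseline is a simple heuristic, not the minimum-cut or linear-programming algorithms mentioned in the abstract.

```python
from collections import Counter, defaultdict


def count_mistakes(hyperedges, assignment):
    """Count hyperedges whose nodes are not all assigned the edge's label.

    hyperedges: list of (nodes, label) pairs; assignment: dict node -> label.
    """
    return sum(
        1
        for nodes, label in hyperedges
        if any(assignment[v] != label for v in nodes)
    )


def majority_vote(hyperedges):
    """Heuristic baseline: give each node its most frequent incident label.

    Ties are broken by taking the smallest label, so results are deterministic.
    """
    votes = defaultdict(Counter)
    for nodes, label in hyperedges:
        for v in nodes:
            votes[v][label] += 1
    return {
        v: min(counts, key=lambda lab: (-counts[lab], lab))
        for v, counts in votes.items()
    }


if __name__ == "__main__":
    # Toy hypergraph with two interaction types.
    edges = [
        (("a", "b", "c"), "red"),
        (("a", "b"), "red"),
        (("c", "d"), "blue"),
        (("b", "c", "d"), "blue"),
    ]
    labels = majority_vote(edges)
    print(labels, count_mistakes(edges, labels))
```

Under this reading, a cluster is the set of nodes sharing a label, and a lower mistake count means more hyperedges lie entirely inside the cluster that matches their interaction type.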