NSF PAR Search | NSF Public Access Repository

Subcritical Connectivity and Some Exact Tail Exponents in High Dimensional Percolation

https://doi.org/10.1007/s00220-023-04759-w

Chatterjee, Shirshendu; Hanson, Jack; Sosoe, Philippe (August 2023, Communications in Mathematical Physics)

Full Text Available
On the Estimation of the Number of Communities for Sparse Networks

https://doi.org/10.1080/01621459.2023.2223793

Hwang, Neil; Xu, Jiarui; Chatterjee, Shirshendu; Bhattacharyya, Sharmodeep (August 2023, Journal of the American Statistical Association)

Full Text Available
Detection of Temporal Shifts in Semantics Using Local Graph Clustering

https://doi.org/10.3390/make5010008

Hwang, Neil; Chatterjee, Shirshendu; Di, Yanming; Bhattacharyya, Sharmodeep (March 2023, Machine Learning and Knowledge Extraction)

Many changes in our digital corpus have been brought about by the interplay between rapid advances in digital communication and the current environment characterized by pandemics, political polarization, and social unrest. One such change is the pace with which new words enter the mass vocabulary and the frequency at which meanings, perceptions, and interpretations of existing expressions change. The current state-of-the-art algorithms do not allow for an intuitive and rigorous detection of these changes in word meanings over time. We propose a dynamic graph-theoretic approach to inferring the semantics of words and phrases (“terms”) and detecting temporal shifts. Our approach represents each term as a stochastic time-evolving set of contextual words and is a count-based distributional semantic model in nature. We use local clustering techniques to assess the structural changes in a given word’s contextual words. We demonstrate the efficacy of our method by investigating the changes in the semantics of the phrase “Chinavirus”. We conclude that the term took on a much more pejorative meaning when the White House used the term in the second half of March 2020, although the effect appears to have been temporary. We make both the dataset and the code used to generate this paper’s results available.
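The count-based idea in this abstract, where a term is represented by the set of words that appear around it in each time period, can be illustrated with a small sketch. This is not the authors' method: the paper uses local graph clustering, and a plain Jaccard overlap between context sets is swapped in here as a simpler stand-in. The toy sentences, the window size, and the function names `context_counts` and `jaccard` are all hypothetical.

```python
from collections import Counter

def context_counts(docs, term, window=2):
    """Count the words appearing within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo = max(0, i - window)
                # Words before and after the term, excluding the term itself.
                counts.update(tokens[lo:i] + tokens[i + 1:i + 1 + window])
    return counts

def jaccard(a, b):
    """Jaccard overlap of two collections of context words (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Toy corpora for two time periods (invented for illustration only).
early = ["the virus spread in the city", "new virus cases reported today"]
late = ["people blame the virus on politics", "angry virus rhetoric online"]

early_ctx = context_counts(early, "virus")
late_ctx = context_counts(late, "virus")

# A low overlap suggests the term's surrounding vocabulary has shifted.
overlap = jaccard(early_ctx, late_ctx)
print(f"context overlap between periods: {overlap:.3f}")
```

A real analysis would need tokenization, stop-word handling, and a statistical notion of change; the sketch only shows how a time-indexed set of contextual words can be built and compared.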
Full Text Available
Getting Local and Personal: Toward Building a Predictive Model for COVID in Three United States Cities.

Edwards, April; Metcalf, Leigh; Casey, William A; Chatterjee, Shirshendu; Janwa, Heeralal; Battifarano, Ernest (May 2023, Springer)
Kacprzyk, Janusz; Pal, Nikhil R; Perez, Rafael B; Corchado, Emilio S; Hagras, Hani; Kóczy, László T; Kreinovich, Vladik; Lin, Chin-Teng; Lu, Jie; Melin, Patricia (Ed.)
The COVID-19 pandemic was lived in real time on social media. In the current project, we use machine learning to explore the relationship between COVID-19 cases and social media activity on Twitter. We were particularly interested in determining whether Twitter activity can be used to predict COVID-19 surges. We were also interested in exploring features of social media, such as replies, to determine their promise for understanding the views of individual users. With the prevalence of mis/disinformation on social media, it is critical to develop a deeper and richer understanding of the relationship between social media and real-world events in order to detect and prevent future influence operations. In the current work, we explore the relationship between COVID-19 cases and social media activity (on Twitter) in three major United States cities with different geographical and political landscapes. We find that Twitter activity showed a statistically significant relationship with the number of COVID-19 cases under the Granger causality test, with a lag of one week, in all three cities. Similarly, the use of replies, which appear more likely to be generated by individual users than by bots or public relations operations, was also strongly correlated with the number of COVID-19 cases under the Granger causality test. Furthermore, we were able to build promising predictive models for the number of future COVID-19 cases using correlation data to select features for input to our models. In contrast, significant correlations were not identified when comparing the number of COVID-19 cases with mainstream media sources or with a sample of all US COVID-related tweets. We conclude that, even for an international event such as COVID-19, social media tracks closely with local conditions. We also suggest that replies can be a valuable feature within a machine learning task that is attempting to gauge the reactions of individual users.
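As a rough illustration of the one-week-lag Granger causality test mentioned in this abstract, the sketch below computes the F statistic for a single lag in pure Python. It compares a model that predicts this week's cases from last week's cases alone against one that also uses last week's tweet volume. The abstract does not describe the authors' actual pipeline, so the data here is synthetic and every name (`ols_rss`, `granger_f_lag1`, `tweets`, `cases`) is an assumption made for illustration.

```python
import random

def ols_rss(rows, y):
    """Residual sum of squares from an ordinary least squares fit of y on rows."""
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))

def granger_f_lag1(x, y):
    """F statistic for the hypothesis that x Granger-causes y, using one lag."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - 3  # observations minus parameters in the full model
    return (rss_r - rss_u) / (rss_u / dof)

# Synthetic weekly series: cases depend on the previous week's tweet volume.
rng = random.Random(0)
tweets = [rng.random() for _ in range(60)]
cases = [0.0]
for t in range(1, 60):
    noise = (rng.random() - 0.5) * 0.2
    cases.append(0.5 * cases[t - 1] + 0.8 * tweets[t - 1] + noise)

f_forward = granger_f_lag1(tweets, cases)
f_reverse = granger_f_lag1(cases, tweets)
print(f"tweets -> cases F = {f_forward:.1f}; cases -> tweets F = {f_reverse:.1f}")
```

A large F statistic in one direction and a small one in the other is the pattern the test looks for. In practice the statistic would be compared with an F distribution to obtain a p-value, and a library routine such as the one in statsmodels would normally be used instead of hand-rolled least squares.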
Full Text Available
Observational Study of the Effect of the Juvenile Stay-At-Home Order on SARS-CoV-2 Infection Spread in Saline County, Arkansas

https://doi.org/10.1080/2330443X.2022.2050326

Hwang, Neil; Chatterjee, Shirshendu; Di, Yanming; Bhattacharyya, Sharmodeep (December 2022, Statistics and Public Policy)

Full Text Available
Changes over Time in Association Patterns between Estimated COVID-19 Case Fatality Rates and Demographic, Socioeconomic and Health Factors in the US States of Florida and New York

https://doi.org/10.3390/covid2100102

Joshi, Mansi; Di, Yanming; Bhattacharyya, Sharmodeep; Chatterjee, Shirshendu (October 2022, COVID)

The United States struggled exceptionally during the COVID-19 pandemic. For researchers and policymakers, it is of great interest to understand the risk factors associated with COVID-19 when examining data aggregated at a regional level. We examined the county-level association between the reported COVID-19 case fatality rate (CFR) and various demographic, socioeconomic and health factors in two hard-hit US states: New York and Florida. In particular, we examined the changes over time in the association patterns. For each state, we divided the data into three seasonal phases based on observed waves of the COVID-19 outbreak. For each phase, we used tests of correlations to explore the marginal association between each potential covariate and the reported CFR. We used graphical models to further clarify direct or indirect associations in a multivariate setting. We found that during the early phase of the pandemic, the association patterns were complex: the reported CFRs were high, with great variation among counties. As the pandemic progressed, especially during the winter phase, socioeconomic factors such as median household income and health-related factors such as the prevalence of adult smokers and mortality rate of respiratory diseases became more significantly associated with the CFR. It is remarkable that common risk factors were identified for both states.
Full Text Available
The effect of avoiding known infected neighbors on the persistence of a recurring infection process

https://doi.org/10.1214/22-EJP836

Chatterjee, Shirshendu; Sivakoff, David; Wascher, Matthew (January 2022, Electronic Journal of Probability)

Full Text Available
A General Framework for Spatio-Temporal Modeling of Epidemics With Multiple Epicenters: Application to an Aerially Dispersed Plant Pathogen

https://doi.org/10.3389/fams.2021.721352

Ojwang', Awino M.; Ruiz, Trevor; Bhattacharyya, Sharmodeep; Chatterjee, Shirshendu; Ojiambo, Peter S.; Gent, David H. (November 2021, Frontiers in Applied Mathematics and Statistics)

The spread dynamics of long-distance-dispersed pathogens are influenced by the dispersal characteristics of a pathogen, anisotropy due to multiple factors, and the presence of multiple sources of inoculum. In this research, we developed a flexible class of phenomenological spatio-temporal models that extend a modeling framework used in plant pathology applications to account for the presence of multiple sources and anisotropy of biological species that can govern disease gradients and spatial spread in time. We use the cucurbit downy mildew pathosystem (caused by Pseudoperonospora cubensis) to formulate a data-driven procedure based on the 2008 to 2010 historical occurrence of the disease in the U.S. available from standardized sentinel plots deployed as part of the Cucurbit Downy Mildew ipmPIPE program. This pathosystem is characterized by annual recolonization and extinction cycles, generating annual disease invasions at the continental scale. This data-driven procedure is amenable to fitting models of disease spread from one or multiple sources of primary inoculum and can be specified to provide estimates of the parameters by regression methods conditional on a function that can accommodate anisotropy in disease occurrence data. Applying this modeling framework to the cucurbit downy mildew data sets, we found a small but consistent reduction in temporal prediction errors by incorporating anisotropy in disease spread. Further, we did not find evidence of an annually occurring, alternative source of P. cubensis in northern latitudes. However, we found a signal indicating an alternative inoculum source on the western edge of the Gulf of Mexico. This modeling framework is tractable for estimating the generalized location and velocity of a disease front from sparsely sampled data with minimal data acquisition costs. These attributes make this framework applicable and useful for a broad range of ecological data sets where multiple sources of disease may exist and whose subsequent spread is directional.
Full Text Available
Restricted Percolation Critical Exponents in High Dimensions

https://doi.org/10.1002/cpa.21938

Chatterjee, Shirshendu; Hanson, Jack (November 2020, Communications on Pure and Applied Mathematics)
Full Text Available
