
Search for: All records

Award ID contains: 1830547


  1. Abstract

    Recently, there has been considerable interest in monitoring and identifying changes in dynamic networks, which has led to the development of a variety of monitoring methods. New methods are often designed for a specialized use case and are rarely compared to competing methods in a systematic fashion. In light of this, the use of simulation is proposed to compare the performance of network monitoring methods over a variety of dynamic network changes. Using a family of simulated dynamic networks, the performance of several state-of-the-art social network monitoring methods from the literature is compared over several types of change; both increases in communication levels and changes in community structure are considered. It is shown that no one method is uniformly superior to the others; the best method depends on the context and the type of change one wishes to detect. As such, it is concluded that a variety of methods are needed for network monitoring and that it is important to understand in which scenarios a given method is appropriate.

  2. Abstract

    Variability in the El Niño-Southern Oscillation (ENSO) has global impacts on seasonal temperatures and rainfall. Current detection methods for extreme phases, which occur with irregular periodicity, rely upon sea surface temperature anomalies within a strictly defined geographic region of the Pacific Ocean. However, under changing climate conditions and ocean warming, these historically motivated indicators may not remain reliable in the future. In this work, we demonstrate the power of data clustering as a robust, automatic way to detect anomalies in climate patterns. Ocean temperature profiles from Argo floats are partitioned into similar groups using unsupervised machine learning methods. The automatically identified groups of measurements represent spatially coherent, large-scale water masses in the Pacific, even though no geospatial information is included in the clustering task. Further, the spatiotemporal dynamics of the clusters are strongly indicative of El Niño events, the east Pacific warming phase of ENSO. Fitting a cluster model to a collection of ocean profiles identifies changes in the vertical structure of the temperature profiles through reassignment to a different group, concisely capturing physical changes to the water column during an El Niño event, such as thermocline tilting. Clustering proves to be an effective tool for analyzing the irregularly sampled (in space and time) data from Argo floats and, because it requires no thresholding decisions, may serve as a novel approach for detecting anomalies. Unsupervised machine learning could be particularly valuable due to its ability to identify patterns in data sets without user-imposed expectations, facilitating further discovery of anomaly indicators.

  3. Abstract

    In many applications, it is of interest to identify anomalous behavior within a dynamic interacting system. Such anomalous interactions are reflected by structural changes in the network representation of the system. We propose and investigate the use of the degree-corrected stochastic block model (DCSBM) to model and monitor dynamic networks that undergo a significant structural change. We apply statistical process monitoring techniques to the estimated parameters of the DCSBM to identify these changes, and we apply the resulting surveillance strategy to a dynamic US Senate co-voting network. We detect significant changes in the political network that reflect both times of cohesion and times of polarization among Republican and Democratic party members. Our analysis demonstrates that the DCSBM monitoring procedure effectively detects local and global structural changes in complex networks, providing useful insights into the modeled system. The DCSBM approach is an example of a general framework that combines parametric random graph models and statistical process monitoring techniques for network surveillance.

  4. Abstract

    Population analyses of functional connectivity have provided a rich understanding of how brain function differs across time, individual, and cognitive task. An important but challenging task in such population analyses is the identification of reliable features that describe the function of the brain, while accounting for individual heterogeneity. Our work is motivated by two particularly important challenges in this area: first, how can one analyze functional connectivity data over populations of individuals, and second, how can one use these analyses to infer group similarities and differences. Motivated by these challenges, we model population connectivity data as a multilayer network and develop the multi-node2vec algorithm, an efficient and scalable embedding method that automatically learns continuous node feature representations from multilayer networks. We use multi-node2vec to analyze resting state fMRI scans over a group of 74 healthy individuals and 60 patients with schizophrenia. We demonstrate how multilayer network embeddings can be used to visualize, cluster, and classify functional regions of the brain for these individuals. We furthermore compare the multilayer network embeddings of the two groups. We identify significant differences between the groups in the default mode network and salience network, findings that are supported by the triple network model theory of cognitive organization. Our findings reveal that multi-node2vec is a powerful and reliable method for analyzing multilayer networks. Data and publicly available code are available at .
  5. Abstract

    Across the social sciences, scholars regularly pool effects over substantial periods of time, a practice that produces faulty inferences if the underlying data generating process is dynamic. To help researchers better perform principled analyses of time-varying processes, we develop a two-stage procedure based upon techniques for permutation testing and statistical process monitoring. Given time series cross-sectional data, we break the role of time through permutation inference and produce a null distribution that reflects a time-invariant data generating process. The null distribution then serves as a stable reference point, enabling the detection of effect changepoints. In Monte Carlo simulations, our randomization technique outperforms alternatives for changepoint analysis. A particular benefit of our method is that, by establishing the bounds for time-invariant effects before interacting with actual estimates, it is able to differentiate stochastic fluctuations from genuine changes. We demonstrate the method’s utility by applying it to a popular study on the relationship between alliances and the initiation of militarized interstate disputes. The example illustrates how the technique can help researchers make inferences about where changes occur in dynamic relationships and ask important questions about such changes.
  7. Abstract

    In many application settings involving networks, such as messages between users of an online social network or transactions between traders in financial markets, the observed data consist of timestamped relational events, which form a continuous-time network. We propose the Community Hawkes Independent Pairs (CHIP) generative model for such networks. We show that applying spectral clustering to an aggregated adjacency matrix constructed from the CHIP model provides consistent community detection as the number of nodes and the time duration grow. We also develop consistent and computationally efficient estimators for the model parameters. We demonstrate that our proposed CHIP model and estimation procedure scale to large networks with tens of thousands of nodes and provide better fits than existing continuous-time network models on several real networks.
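
The community detection step described in the last abstract, spectral clustering on a matrix aggregated from timestamped events, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`aggregate_events`, `farthest_point_kmeans`, `spectral_communities`), the choice to embed nodes with concatenated left and right singular vectors, and the small k-means routine are all hypothetical choices made for the example.

```python
import numpy as np


def aggregate_events(events, n_nodes):
    """Build a count matrix: entry (i, j) is the number of events sent from i to j.

    `events` is an iterable of (sender, receiver, timestamp) tuples (assumed format).
    Timestamps are ignored here because only the aggregate counts are used.
    """
    counts = np.zeros((n_nodes, n_nodes))
    for sender, receiver, _timestamp in events:
        counts[sender, receiver] += 1
    return counts


def farthest_point_kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        nearest = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels


def spectral_communities(events, n_nodes, k):
    """Cluster nodes using the top-k singular vectors of the aggregated count matrix."""
    counts = aggregate_events(events, n_nodes)
    u, _s, vt = np.linalg.svd(counts)
    # Assumption: use both sending (left) and receiving (right) singular vectors.
    embedding = np.hstack([u[:, :k], vt[:k, :].T])
    return farthest_point_kmeans(embedding, k)
```

Calling `spectral_communities(events, n_nodes, k)` returns one integer label per node. Label values are arbitrary, so only the grouping is meaningful, not which community is called 0 or 1.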