skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Adaptive Shrinkage Estimation for Streaming Graphs
Networks are a natural representation of complex systems across the sciences, and higher-order dependencies are central to the understanding and modeling of these systems. However, in many practical applications such as online social networks, networks are massive, dynamic, and naturally streaming, where pairwise interactions among vertices become available one at a time in some arbitrary order. The massive size and streaming nature of these networks allow only partial observation, since it is infeasible to analyze the entire network. Under such scenarios, it is challenging to study the higher-order structural and connectivity patterns of streaming networks. In this work, we consider the fundamental problem of estimating the higher-order dependencies using adaptive sampling. We propose a novel adaptive, single-pass sampling framework and unbiased estimators for higher-order network analysis of large streaming networks. Our algorithms exploit adaptive techniques to identify edges that are highly informative for efficiently estimating the higher-order structure of streaming networks from small sample data. We also introduce a novel James-Stein shrinkage estimator to reduce the estimation error. Our approach is fully analytic, computationally efficient, and can be incrementally updated in a streaming setting. Numerical experiments on large networks show that our approach is superior to baseline methods.  more » « less
Award ID(s):
1934904 1848596 1839816
PAR ID:
10209361
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Advances in neural information processing systems
Volume:
33
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Although Software-Defined Wide Area Networks (SD-WANs) are now widely deployed in several production networks, they are largely restricted to traffic engineering ap- proaches based on layer 4 (L4) of the network protocol stack. Such approaches result in improved Quality-of-Service (QoS) of the network overall without necessarily focussing on the requirements of a specific application. However, the emergence of application protocols such as QUIC and HTTP/2 needs an in- vestigation of layer 5-based (L5) approaches in order to improve users’ Quality-of-Experience (QoE). In this paper, we leverage the capabilities of flexible, P4-based switches that incorporate protocol-independent packet processing in order to intelligently route traffic based on application headers. We use Adaptive Bit Rate (ABR) video streaming as an example to show how such an approach can not only provide flexible traffic management but also improve application QoE. Our evaluation consists of an actual deployment in a research testbed, Chameleon, where we leverage the benefits of fast paths in order to retransmit video segments in higher qualities. Further, we analyze real-world ABR streaming sessions from a large-scale CDN and show that our approach can successfully maximize QoE for all users in the dataset. 
    more » « less
  2. Complex systems, represented as dynamic networks, comprise of components that influence each other via direct and/or indirect interactions. Recent research has shown the importance of using Higher-Order Networks (HONs) for modeling and analyzing such complex systems, as the typical Markovian assumption in developing the First Order Network (FON) can be limiting. This higher-order network representation not only creates a more accurate representation of the underlying complex system, but also leads to more accurate network analysis. In this paper, we first present a scalable and accurate model, BuildHON+, for higher-order network representation of data derived from a complex system with various orders of dependencies. Then, we show that this higher-order network representation modeled by BuildHON+ is significantly more accurate in identifying anomalies than FON, demonstrating a need for the higher-order network representatio 
    more » « less
  3. Abstract Plants and their insect herbivores have been a dominant component of the terrestrial ecological landscape for the past 410 million years and feature intricate evolutionary patterns and co‐dependencies. A complex systems perspective allows for both detailed resolution of these evolutionary relationships as well as comparison and synthesis across systems. Using proxy data of insect herbivore damage (denoted by the damage type or DT) preserved on fossil leaves, functional bipartite network representations provide insights into how plant–insect associations depend on geological time, paleogeographical space, and environmental variables such as temperature and precipitation. However, the metrics measured from such networks are prone to sampling bias. Such sensitivity is of special concern for plant–DT association networks in paleontological settings where sampling effort is often severely limited. Here, we explore the sensitivity of functional bipartite network metrics to sampling intensity and identify sampling thresholds above which metrics appear robust to sampling effort. Across a broad range of sampling efforts, we find network metrics to be less affected by sampling bias and/or sample size than richness metrics, which are routinely used in studies of fossil plant–DT interactions. These results provide reassurance that cross‐comparisons of plant–DT networks offer insights into network structure and function and support their widespread use in paleoecology. Moreover, these findings suggest novel opportunities for using plant–DT networks in neontological terrestrial ecology to understand functional aspects of insect herbivory across geological time, environmental perturbations, and geographic space. 
    more » « less
  4. Networked systems that occur in various domains, such as electric networks, the brain, and opinion networks, are known to obey conservation laws. For instance, electric networks obey Kirchoff’s laws, and social networks obey opinion consensus. Conservation laws are often modeled as balance equations that relate appropriate injected flows and potentials at the nodes of the networks. A recent line of work considers the problem of estimating the unknown structure of such networked systems from observations of node potentials (and only the knowledge of the statistics of injected flows). Given the dynamic nature of the systems under consideration, an equally important task is estimating the change in the structure of the network from data – the so called differential network analysis problem. That is, given two sets of node potential observations, the goal is to estimate the structural differences between the underlying networks. We formulate this novel differential network analysis problem for systems obeying conservation laws and devise a convex estimator to learn the edge changes directly from node potentials. We derive conditions under which the estimate is unique in the high-dimensional regime and devise an efficient ADMM-based approach to perform the estimation. Finally, we demonstrate the performance of our approach on synthetic and benchmark power network data. 
    more » « less
  5. The monitoring of data streams with a network structure have drawn increasing attention due to its wide applications in modern process control. In these applications, high-dimensional sensor nodes are interconnected with an underlying network topology. In such a case, abnormalities occurring to any node may propagate dynamically across the network and cause changes of other nodes over time. Furthermore, high dimensionality of such data significantly increased the cost of resources for data transmission and computation, such that only partial observations can be transmitted or processed in practice. Overall, how to quickly detect abnormalities in such large networks with resource constraints remains a challenge, especially due to the sampling uncertainty under the dynamic anomaly occurrences and network-based patterns. In this paper, we incorporate network structure information into the monitoring and adaptive sampling methodologies for quick anomaly detection in large networks where only partial observations are available. We develop a general monitoring and adaptive sampling method and further extend it to the case with memory constraints, both of which exploit network distance and centrality information for better process monitoring and identification of abnormalities. Theoretical investigations of the proposed methods demonstrate their sampling efficiency on balancing between exploration and exploitation, as well as the detection performance guarantee. Numerical simulations and a case study on power network have demonstrated the superiority of the proposed methods in detecting various types of shifts. Note to Practitioners —Continuous monitoring of networks for anomalous events is critical for a large number of applications involving power networks, computer networks, epidemiological surveillance, social networks, etc. This paper aims at addressing the challenges in monitoring large networks in cases where monitoring resources are limited such that only a subset of nodes in the network is observable. Specifically, we integrate network structure information of nodes for constructing sequential detection methods via effective data augmentation, and for designing adaptive sampling algorithms to observe suspicious nodes that are likely to be abnormal. Then, the method is further generalized to the case that the memory of the computation is also constrained due to the network size. The developed method is greatly beneficial and effective for various anomaly patterns, especially when the initial anomaly randomly occurs to nodes in the network. The proposed methods are demonstrated to be capable of quickly detecting changes in the network and dynamically changes the sampling priority based on online observations in various cases, as shown in the theoretical investigation, simulations and case studies. 
    more » « less