Existing causal models for link prediction assume an underlying set of inherent node factors—an innate characteristic defined at the node’s birth—that governs the causal evolution of links in the graph. In some causal tasks, however, link formation is
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

pathdependent : the outcome of link interventions depends on existing links. Unfortunately, these existing causal methods are not designed for pathdependent link formation, as the cascading functional dependencies between links (arising frompath dependence ) are either unidentifiable or require an impractical number of control variables. To overcome this, we develop the first causal model capable of dealing with path dependencies in link prediction. In this work, we introduce the concept of causal lifting, an invariance in causal models of independent interest that, on graphs, allows the identification of causal link prediction queries using limited interventional data. Further, we show how structural pairwise embeddings exhibit lower bias and correctly represent the task’s causal structure, as opposed to existing node embeddings, e.g. graph neural network node embeddings and matrix factorization. Finally, we validate our theoretical findings on three scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation and consumerproduct recommendations.Free, publiclyaccessible full text available August 1, 2024 
This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs)  such as Graph Neural Networks (GNNs)  to perform inductive outofdistribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove nonasymptotic bounds showing that link predictors based on permutationequivariant (structural) node embeddings obtained by gMPNNs can converge to a random guess as test graphs get larger. We then propose a theoreticallysound gMPNN that outputs structural pairwise (2node) embeddings and prove nonasymptotic bounds showing that, as test graphs grow, these embeddings converge to embeddings of a continuous function that retains its ability to predict links OOD. Empirical results on random graphs show agreement with our theoretical results.more » « less

Deep learning models tend not to be outofdistribution robust primarily due to their reliance on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactualinvariant to spurious features, a requirement for outofdistribution (OOD) robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactualinvariance if the augmentation is performed by a contextguessing machine, an abstract machine that guesses the mostlikely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a contextguessing machine does not lead to robust OOD classifiers.more » « less

Rolltoroll printing has significantly shortened the time from design to production of sensors and IoT devices, while being costeffective for mass production. But due to less manufacturing tolerance controls available, properties such as sensor thickness, composition, roughness, etc., cannot be precisely controlled. Since these properties likely affect the sensor behavior, rolltoroll printed sensors require validation testing before they can be deployed in the field. In this work, we improve the testing of Nitrate sensors that need to be calibrated in a solution of known Nitrate concentration for around 1–2 days. To accelerate this process, we observe the initial behavior of the sensors for a few hours, and use a physicsinformed machine learning method to predict their measurements 24 hours in the future, thus saving valuable time and testing resources. Due to the variability in rolltoroll printing, this prediction task requires models that are robust to changes in properties of the new test sensors. We show that existing methods fail at this task and describe a physicsinformed machine learning method that improves the prediction robustness to different testing conditions (≈ 1.7× lower in realworld data and ≈ 5× lower in synthetic data when compared with the current stateoftheart physicsinformed machine learning method).more » « less

The performance of Adaptive Bitrate (ABR) algorithms for video streaming depends on accurately predicting the download time of video chunks. Existing prediction approaches (i) assume chunk download times are dominated by network throughput; and (ii) apriori cluster sessions (e.g., based on ISP and CDN) and only learn from sessions in the same cluster. We make three contributions. First, through analysis of data from realworld video streaming sessions, we show (i) apriori clustering prevents learning from related clusters; and (ii) factors such as the Time to First Byte (TTFB) are key components of chunk download times but not easily incorporated into existing prediction approaches. Second, we propose Xatu, a new prediction approach that jointly learns a neural network sequence model with an interpretable automatic session clustering method. Xatu learns clustering rules across all sessions it deems relevant, and models sequences with multiple chunkdependent features (e.g., TTFB) rather than just throughput. Third, evaluations using the above datasets and emulation experiments show that Xatu significantly improves prediction accuracies by 23.8% relative to CS2P (a stateoftheart predictor). We show Xatu provides substantial performance benefits when integrated with multiple ABR algorithms including MPC (a well studied ABR algorithm), and FuguABR (a recent algorithm using stochastic control) relative to their default predictors (CS2P and a fully connected neural network respectively). Further, Xatu combined with MPC outperforms Pensieve, an ABR based on deep reinforcement learning.more » « less

This work considers the general task of estimating the sum of a bounded function over the edges of a graph, given neighborhood query access and where access to the entire network is prohibitively expensive. To estimate this sum, prior work proposes Markov chain Monte Carlo (MCMC) methods that use random walks started at some seed vertex and whose equilibrium distribution is the uniform distribution over all edges, eliminating the need to iterate over all edges. Unfortunately, these existing estimators are not scalable to massive realworld graphs. In this paper, we introduce Ripple, an MCMCbased estimator that achieves unprecedented scalability by stratifying the Markov chain state space into ordered strata with a new technique that we denote sequential stratified regenerations. We show that the Ripple estimator is consistent, highly parallelizable, and scales well. We empirically evaluate our method by applying Ripple to the task of estimating connected, induced subgraph counts given some input graph. Therein, we demonstrate that Ripple is accurate and can estimate counts of up to 12node subgraphs, which is a task at a scale that has been considered unreachable, not only by prior MCMCbased methods but also by other sampling approaches. For instance, in this target application, we present results in which the Markov chain state space is as large as 1043, for which Ripple computes estimates in less than 4 h, on average.more » « less

Generalizing from observed to new related environments (outofdistribution) is central to the reliability of classifiers. However, most classifiers fail to predict label from input when the change in environment is due a (stochastic) input transformation not observed in training, as in training we observe , where is a hidden variable. This work argues that when the transformations in train and test are (arbitrary) symmetry transformations induced by a collection of known equivalence relations, the task of finding a robust OOD classifier can be defined as finding the simplest causal model that defines a causal connection between the target labels and the symmetry transformations that are associated with label changes. We then propose a new learning paradigm, asymmetry learning, that identifies which symmetries the classifier must break in order to correctly predict in both train and test. Asymmetry learning performs a causal model search that, under certain identifiability conditions, finds classifiers that perform equally well indistribution and outofdistribution. Finally, we show how to learn counterfactuallyinvariant representations with asymmetry learning in two physics tasks.more » « less

Generalizing from observed to new related environments (outofdistribution) is central to the reliability of classifiers. However, most classifiers fail to predict label from input when the change in environment is due a (stochastic) input transformation not observed in training, as in training we observe , where is a hidden variable. This work argues that when the transformations in train and test are (arbitrary) symmetry transformations induced by a collection of known equivalence relations, the task of finding a robust OOD classifier can be defined as finding the simplest causal model that defines a causal connection between the target labels and the symmetry transformations that are associated with label changes. We then propose a new learning paradigm, asymmetry learning, that identifies which symmetries the classifier must break in order to correctly predict in both train and test. Asymmetry learning performs a causal model search that, under certain identifiability conditions, finds classifiers that perform equally well indistribution and outofdistribution. Finally, we show how to learn counterfactuallyinvariant representations with asymmetry learning in two physics tasks.more » « less