skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Causal inference in genetic trio studies
We introduce a method to draw causal inferences—inferences immune to all possible confounding—from genetic data that include parents and offspring. Causal conclusions are possible with these data because the natural randomness in meiosis can be viewed as a high-dimensional randomized experiment. We make this observation actionable by developing a conditional independence test that identifies regions of the genome containing distinct causal variants. The proposed digital twin test compares an observed offspring to carefully constructed synthetic offspring from the same parents to determine statistical significance, and it can leverage any black-box multivariate model and additional nontrio genetic data to increase power. Crucially, our inferences are based only on a well-established mathematical model of recombination and make no assumptions about the relationship between the genotypes and phenotypes. We compare our method to the widely used transmission disequilibrium test and demonstrate enhanced power and localization.  more » « less
Award ID(s):
1712800
PAR ID:
10193080
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Proceedings of the National Academy of Sciences
Date Published:
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
117
Issue:
39
ISSN:
0027-8424
Page Range / eLocation ID:
p. 24117-24126
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Analysts often make visual causal inferences about possible data-generating models. However, visual analytics (VA) software tends to leave these models implicit in the mind of the analyst, which casts doubt on the statistical validity of informal visual “insights”. We formally evaluate the quality of causal inferences from visualizations by adopting causal support—a Bayesian cognition model that learns the probability of alternative causal explanations given some data—as a normative benchmark for causal inferences. We contribute two experiments assessing how well crowdworkers can detect (1) a treatment effect and (2) a confounding relationship. We find that chart users’ causal inferences tend to be insensitive to sample size such that they deviate from our normative benchmark. While interactively cross-filtering data in visualizations can improve sensitivity, on average users do not perform reliably better with common visualizations than they do with textual contingency tables. These experiments demonstrate the utility of causal support as an evaluation framework for inferences in VA and point to opportunities to make analysts’ mental models more explicit in VA software. 
    more » « less
  2. Fitch, T.; Lamm, C.; Leder, H.; Teßmar-Raible, K. (Ed.)
    Listening to music activates representations of movement and social agents. Why? We ask whether high-level causal reasoning about how music was generated can lead people to link musical sounds with animate agents. To test this, we asked whether people (N=60) make flexible inferences about whether an agent caused musical sounds, integrating information from the sounds’ timing and from the visual context in which it was produced. Using a 2x2 within-subject design, we found evidence of causal reasoning: In a context where producing a musical sequence would require self-propelled movement, people inferred that an agent had been present causing the sounds. When the context provided an alternative possible explanation, this ‘explained away’ the agent, reducing the tendency to infer an agent was present for the same acoustic stimuli. People can use causal reasoning to infer whether an agent produced musical sounds, suggesting that high-level cognition can link music with social concepts. 
    more » « less
  3. Smiseth, Per (Ed.)
    Abstract Asymmetries in power (the ability to influence the outcome of conflict) are ubiquitous in social interactions because interacting individuals are rarely identical. It is well documented that asymmetries in power influence the outcome of reproductive conflict in social groups. Yet power asymmetries have received little attention in the context of negotiations between caring parents, which is surprising given that parents are often markedly different in size. Here we built on an existing negotiation model to examine how power and punishment influence negotiations over care. We incorporated power asymmetry by allowing the more-powerful parent, rank 1, to inflict punishment on the less-powerful parent, rank 2. We then determined when punishment will be favored by selection and how it would affect the negotiated behavioral response of each parent. We found that with power and punishment, a reduction in one parent’s effort results in partial compensation by the other parent. However, the degree of compensation is asymmetric: the rank 2 compensates more than the rank 1. As a result, the fitness of rank 1 increases and the fitness of rank 2 decreases, relative to the original negotiation model. Furthermore, because power and punishment enable one parent to extract greater effort from the other, offspring can do better, that is, receive more total effort, when there is power and punishment involved in negotiations over care. These results reveal how power and punishment alter the outcome of conflict between parents and affect offspring, providing insights into the evolutionary consequences of exerting power in negotiations. 
    more » « less
  4. Identifying cause-effect relations among variables is a key step in the decision-making process. Whereas causal inference requires randomized experiments, researchers and policy makers are increasingly using observational studies to test causal hypotheses due to the wide availability of data and the infeasibility of experiments. The matching method is the most used technique to make causal inference from observational data. However, the pair assignment process in one-to-one matching creates uncertainty in the inference because of different choices made by the experimenter. Recently, discrete optimization models have been proposed to tackle such uncertainty; however, they produce 0-1 nonlinear problems and lack scalability. In this work, we investigate this emerging data science problem and develop a unique computational framework to solve the robust causal inference test instances from observational data with continuous outcomes. In the proposed framework, we first reformulate the nonlinear binary optimization problems as feasibility problems. By leveraging the structure of the feasibility formulation, we develop greedy schemes that are efficient in solving robust test problems. In many cases, the proposed algorithms achieve a globally optimal solution. We perform experiments on real-world data sets to demonstrate the effectiveness of the proposed algorithms and compare our results with the state-of-the-art solver. Our experiments show that the proposed algorithms significantly outperform the exact method in terms of computation time while achieving the same conclusion for causal tests. Both numerical experiments and complexity analysis demonstrate that the proposed algorithms ensure the scalability required for harnessing the power of big data in the decision-making process. Finally, the proposed framework not only facilitates robust decision making through big-data causal inference, but it can also be utilized in developing efficient algorithms for other nonlinear optimization problems such as quadratic assignment problems. History: Accepted by Ram Ramesh, Area Editor for Data Science and Machine Learning. Funding: This work was supported by the Division of Civil, Mechanical and Manufacturing Innovation of the National Science Foundation [Grant 2047094]. Supplemental Material: The online supplements are available at https://doi.org/10.1287/ijoc.2022.1226 . 
    more » « less
  5. Abstract In the statistical analysis of genome-wide association data, it is challenging to precisely localize the variants that affect complex traits, due to linkage disequilibrium, and to maximize power while limiting spurious findings. Here we report onKnockoffZoom: a flexible method that localizes causal variants at multiple resolutions by testing the conditional associations of genetic segments of decreasing width, while provably controlling the false discovery rate. Our method utilizes artificial genotypes as negative controls and is equally valid for quantitative and binary phenotypes, without requiring any assumptions about their genetic architectures. Instead, we rely on well-established genetic models of linkage disequilibrium. We demonstrate that our method can detect more associations than mixed effects models and achieve fine-mapping precision, at comparable computational cost. Lastly, we applyKnockoffZoomto data from 350k subjects in the UK Biobank and report many new findings. 
    more » « less