skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Complementary Information and Learning Traps*
Abstract We develop a model of social learning from complementary information: short-lived agents sequentially choose from a large set of flexibly correlated information sources for prediction of an unknown state, and information is passed down across periods. Will the community collectively acquire the best kinds of information? Long-run outcomes fall into one of two cases: (i) efficient information aggregation, where the community eventually learns as fast as possible; (ii) “learning traps,” where the community gets stuck observing suboptimal sources and information aggregation is inefficient. Our main results identify a simple property of the underlying informational complementarities that determines which occurs. In both regimes, we characterize which sources are observed in the long run and how often.  more » « less
Award ID(s):
1851629
PAR ID:
10204277
Author(s) / Creator(s):
;
Date Published:
Journal Name:
The Quarterly Journal of Economics
Volume:
135
Issue:
1
ISSN:
0033-5533
Page Range / eLocation ID:
389 to 448
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We exhibit a natural environment, social learning among heterogeneous agents, where even slight misperceptions can have a large negative impact on long‐run learning outcomes. We consider a population of agents who obtain information about the state of the world both from initial private signals and by observing a random sample of other agents' actions over time, where agents' actions depend not only on their beliefs about the state but also on their idiosyncratic types (e.g., tastes or risk attitudes). When agents are correct about the type distribution in the population, they learn the true state in the long run. By contrast, we show, first, that even arbitrarily small amounts of misperception about the type distribution can generate extreme breakdowns of information aggregation, where in the long run all agents incorrectly assign probability 1 to some fixed state of the world, regardless of the true underlying state. Second, any misperception of the type distribution leads long‐run beliefs and behavior to vary only coarsely with the state, and we provide systematic predictions for how the nature of misperception shapes these coarse long‐run outcomes. Third, we show that how fragile information aggregation is against misperception depends on the richness of agents' payoff‐relevant uncertainty; a design implication is that information aggregation can be improved by simplifying agents' learning environment. The key feature behind our findings is that agents' belief‐updating becomes “decoupled” from the true state over time. We point to other environments where this feature is present and leads to similar fragility results. 
    more » « less
  2. Graph representation learning is crucial for many real-world ap- plications (e.g. social relation analysis). A fundamental problem for graph representation learning is how to effectively learn rep- resentations without human labeling, which is usually costly and time-consuming. Graph contrastive learning (GCL) addresses this problem by pulling the positive node pairs (or similar nodes) closer while pushing the negative node pairs (or dissimilar nodes) apart in the representation space. Despite the success of the existing GCL methods, they primarily sample node pairs based on the node- level proximity yet the community structures have rarely been taken into consideration. As a result, two nodes from the same community might be sampled as a negative pair. We argue that the community information should be considered to identify node pairs in the same communities, where the nodes insides are seman- tically similar. To address this issue, we propose a novel Graph Communal Contrastive Learning (𝑔𝐶𝑜𝑜𝐿) framework to jointly learn the community partition and learn node representations in an end-to-end fashion. Specifically, the proposed 𝑔𝐶𝑜𝑜𝐿 consists of two components: a Dense Community Aggregation (𝐷𝑒𝐶𝐴) algo- rithm for community detection and a Reweighted Self-supervised Cross-contrastive (𝑅𝑒𝑆𝐶) training scheme to utilize the community information. Additionally, the real-world graphs are complex and often consist of multiple views. In this paper, we demonstrate that the proposed 𝑔𝐶𝑜𝑜𝐿 can also be naturally adapted to multiplex graphs. Finally, we comprehensively evaluate the proposed 𝑔𝐶𝑜𝑜𝐿 on a variety of real-world graphs. The experimental results show that the 𝑔𝐶𝑜𝑜𝐿 outperforms the state-of-the-art methods. 
    more » « less
  3. Secure aggregation, which is a core component of federated learning, aggregates locally trained models from distributed users at a central server. The “secure” nature of such aggregation consists of the fact that no information about the local users’ data must be leaked to the server except the aggregated local models. In order to guarantee security, some keys may be shared among the users (this is referred to as the key sharing phase). After the key sharing phase, each user masks its trained model which is then sent to the server (this is referred to as the model aggregation phase). This paper follows the information theoretic secure aggregation problem originally formulated by Zhao and Sun, with the objective to characterize the minimum communication cost from the K users in the model aggregation phase. Due to user dropouts, which are common in real systems, the server may not receive all messages from the users. A secure aggregation scheme should tolerate the dropouts of at most K – U users, where U is a system parameter. The optimal communication cost is characterized by Zhao and Sun, but with the assumption that the keys stored by the users could be any random variables with arbitrary dependency. On the motivation that uncoded groupwise keys are more convenient to be shared and could be used in large range of applications besides federated learning, in this paper we add one constraint into the above problem, namely, that the key variables are mutually independent and each key is shared by a group of S users, where S is another system parameter. To the best of our knowledge, all existing secure aggregation schemes (with information theoretic security or computational security) assign coded keys to the users. We show that if S > K–U, a new secure aggregation scheme with uncoded groupwise keys can achieve the same optimal communication cost as the best scheme with coded keys; if S ≤ K – U, uncoded groupwise key sharing is strictly sub-optimal. Finally, we also implement our proposed secure aggregation scheme into Amazon EC2, which are then compared with the existing secure aggregation schemes with offline key sharing. 
    more » « less
  4. Secure aggregation, which is a core component of federated learning, aggregates locally trained models from distributed users at a central server, without revealing any other information about the local users' data. This paper follows a recent information theoretic secure aggregation problem with user dropouts, where the objective is to characterize the minimum communication cost from the K users to the server during the model aggregation. All existing secure aggregation protocols let the users share and store coded keys to guarantee security. On the motivation that uncoded groupwise keys are more convenient to be shared and could be used in large range of practical applications, this paper is the first to consider uncoded groupwise keys, where the keys are mutually independent and each key is shared by a group of S users. We show that if S is beyond a threshold, a new secure aggregation protocol with uncoded groupwise keys, referred to as GroupSecAgg, can achieve the same optimal communication cost as the best protocol with coded keys. The experiments on Amazon EC2 show the considerable improvements on the key sharing and model aggregation times compared to the state-of-the art. 
    more » « less
  5. This paper considers the secure aggregation problem for federated learning under an information theoretic cryptographic formulation, where distributed training nodes (referred to as users) train models based on their own local data and a curious-but-honest server aggregates the trained models without retrieving other information about users’ local data. Secure aggregation generally contains two phases, namely key sharing phase and model aggregation phase. Due to the common effect of user dropouts in federated learning, the model aggregation phase should contain two rounds, where in the first round the users transmit masked models and, in the second round, according to the identity of surviving users after the first round, these surviving users transmit some further messages to help the server decrypt the sum of users’ trained models. The objective of the considered information theoretic formulation is to characterize the capacity region of the communication rates from the users to the server in the two rounds of the model aggregation phase, assuming that key sharing has already been performed offline in prior. In this context, Zhao and Sun completely characterized the capacity region under the assumption that the keys can be arbitrary random variables. More recently, an additional constraint, known as “uncoded groupwise keys,” has been introduced. This constraint entails the presence of multiple independent keys within the system, with each key being shared by precisely S users, where S is a defined system parameter. The capacity region for the information theoretic secure aggregation problem with uncoded groupwise keys was established in our recent work subject to the condition S > K - U, where K is the number of total users and U is the designed minimum number of surviving users (which is another system parameter). In this paper we fully characterize the capacity region for this problem by matching a new converse bound and an achievable scheme. Experimental results over the Tencent Cloud show the improvement on the model aggregation time compared to the original secure aggregation scheme. 
    more » « less