Title: Data and incentives
“Big data” gives markets access to previously unmeasured characteristics of individual agents. Policymakers must decide whether and how to regulate the use of this data. We study how new data affects incentives for agents to exert effort in settings such as the labor market, where an agent's quality is initially unknown but is forecast from an observable outcome. We show that measurement of a new covariate has a systematic effect on the average effort exerted by agents, with the direction of the effect determined by whether the covariate is informative about long‐run quality versus a shock to short‐run outcomes. For a class of covariates satisfying a statistical property that we call strong homoskedasticity, this effect is uniform across agents. More generally, new measurements can impact agents unequally, and we show that these distributional effects have a first‐order impact on social welfare.
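To see the direction result in the simplest terms, here is a toy Gaussian calculation (illustrative variances, not the paper's model): in a standard career-concerns setup, effort incentives scale with the weight the market's forecast of quality places on observed output, and that weight falls when a newly measured covariate is informative about long-run quality but rises when it is informative about the transient shock.

    # Toy Gaussian illustration (assumed parameters, not the paper's model):
    # output y = theta + effort + eps, the market forecasts theta from y,
    # and effort incentives scale with the forecast's weight on y.
    VAR_THETA = 1.0   # prior variance of long-run quality theta
    VAR_EPS = 1.0     # variance of the transient shock eps
    VAR_NOISE = 0.5   # measurement noise in the new covariate x

    def forecast_weight(var_theta, var_eps):
        """Bayesian weight on output when forecasting theta from y = theta + eps."""
        return var_theta / (var_theta + var_eps)

    # Baseline: no covariate measured.
    w_base = forecast_weight(VAR_THETA, VAR_EPS)

    # Case 1: x is a noisy measure of quality (x = theta + noise).
    # Conditioning on x shrinks residual uncertainty about theta, so the
    # forecast leans less on y and effort incentives fall.
    var_theta_given_x = VAR_THETA * VAR_NOISE / (VAR_THETA + VAR_NOISE)
    w_quality = forecast_weight(var_theta_given_x, VAR_EPS)

    # Case 2: x is a noisy measure of the shock (x = eps + noise).
    # Conditioning on x filters noise out of y, so the forecast leans more
    # on the filtered output and effort incentives rise.
    var_eps_given_x = VAR_EPS * VAR_NOISE / (VAR_EPS + VAR_NOISE)
    w_shock = forecast_weight(VAR_THETA, var_eps_given_x)

    print(f"weight on output: baseline {w_base:.2f}, "
          f"quality covariate {w_quality:.2f}, shock covariate {w_shock:.2f}")
    # Expected ordering: w_quality < w_base < w_shock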
Award ID(s): 1851629
PAR ID: 10561642
Author(s) / Creator(s): ;
Publisher / Repository: Econometric Society
Date Published:
Journal Name: Theoretical Economics
Volume: 19
Issue: 1
ISSN: 1933-6837
Page Range / eLocation ID: 407 to 448
Subject(s) / Keyword(s): Big data, forecasting, effort incentives, career concerns
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    We exhibit a natural environment, social learning among heterogeneous agents, where even slight misperceptions can have a large negative impact on long‐run learning outcomes. We consider a population of agents who obtain information about the state of the world both from initial private signals and by observing a random sample of other agents' actions over time, where agents' actions depend not only on their beliefs about the state but also on their idiosyncratic types (e.g., tastes or risk attitudes). When agents are correct about the type distribution in the population, they learn the true state in the long run. By contrast, we show, first, that even arbitrarily small amounts of misperception about the type distribution can generate extreme breakdowns of information aggregation, where in the long run all agents incorrectly assign probability 1 to some fixed state of the world, regardless of the true underlying state. Second, any misperception of the type distribution leads long‐run beliefs and behavior to vary only coarsely with the state, and we provide systematic predictions for how the nature of misperception shapes these coarse long‐run outcomes. Third, we show that how fragile information aggregation is against misperception depends on the richness of agents' payoff‐relevant uncertainty; a design implication is that information aggregation can be improved by simplifying agents' learning environment. The key feature behind our findings is that agents' belief‐updating becomes “decoupled” from the true state over time. We point to other environments where this feature is present and leads to similar fragility results. 
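A toy simulation sketch of the update mechanics described above (illustrative parameters and a deliberately crude one-signal model of other agents' behavior; this is not the paper's model and does not reproduce its results): actions depend on beliefs plus an idiosyncratic type bias, and observers update from sampled actions using a possibly misperceived type share.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    N, K, ROUNDS = 2000, 5, 10
    TRUE_STATE = 1                           # omega in {0, 1}
    MU = {0: -1.0, 1: 1.0}                   # private signal s ~ N(MU[omega], 1)
    BIAS = {"lenient": 0.8, "tough": -0.8}   # type shifts the action cutoff
    P_TOUGH_TRUE = 0.5                       # true share of "tough" types
    P_TOUGH_PERCEIVED = 0.6                  # misperceived share used in updating

    is_tough = rng.random(N) < P_TOUGH_TRUE
    bias = np.where(is_tough, BIAS["tough"], BIAS["lenient"])
    log_odds = 2.0 * rng.normal(MU[TRUE_STATE], 1.0, N)   # LLR of one signal

    def perceived_p_act(omega, p_tough):
        """Observer's model of P(action = 1 | omega): a mixture over types,
        crudely assuming the observed agent acts on a single private signal."""
        return (p_tough * norm.cdf(MU[omega] + BIAS["tough"] / 2.0)
                + (1 - p_tough) * norm.cdf(MU[omega] + BIAS["lenient"] / 2.0))

    # Per-observed-action log-likelihood-ratio increments under the
    # (mis)perceived type mix.
    llr_if_one = np.log(perceived_p_act(1, P_TOUGH_PERCEIVED)
                        / perceived_p_act(0, P_TOUGH_PERCEIVED))
    llr_if_zero = np.log((1 - perceived_p_act(1, P_TOUGH_PERCEIVED))
                         / (1 - perceived_p_act(0, P_TOUGH_PERCEIVED)))

    for t in range(ROUNDS):
        actions = (log_odds + bias > 0).astype(int)
        # Each agent observes K randomly sampled actions and updates.
        sampled = actions[rng.integers(0, N, size=(N, K))]
        log_odds = log_odds + np.where(sampled == 1, llr_if_one, llr_if_zero).sum(axis=1)
        mean_belief = float(np.mean(1.0 / (1.0 + np.exp(-log_odds))))
        print(f"round {t + 1:2d}: mean belief assigned to the true state {mean_belief:.3f}")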
  2. Fisheries are often characterized by high heterogeneity in the spatial distribution of habitat quality, as well as fishing effort. However, in several fisheries, the objective of achieving a sustainable yield is addressed by limiting Total Allowable Catch (TAC), set as a fraction of the overall population, regardless of the population's spatial distribution and of fishing effort. Here, we use an integral projection model to investigate how stock abundance and catch in the green abalone fishery in Isla Natividad, Mexico, are affected by the interaction of heterogeneity in habitat quality and fishing effort, and whether these interactions change with Allee effects (reproductive failure in a low-density population). We found that high-quality areas are under-exploited when fishing pressure is homogeneous but habitat is heterogeneous. However, this leads to different fishery outcomes depending on the stock's exploitation status, namely: sub-optimal exploitation when the TAC is set to maximum sustainable yield, and stability against collapses when the fishery is overexploited. Concentration of fishing effort in productive areas can compensate for this effect, which, similarly, has opposite consequences in both scenarios: fishery performance increases if the TAC is sustainable but decreases in overexploited fisheries. These results only hold when Allee effects are included.
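A toy two-patch harvest sketch of the ingredients discussed above (logistic growth, a crude Allee cutoff, a TAC set as a fraction of total abundance, and effort that is either uniform or concentrated in the high-quality patch). All parameters are assumptions, and this is a simplified stand-in for, not a reconstruction of, the paper's integral projection model.

    import numpy as np

    R = np.array([0.8, 0.3])      # patch growth rates (high- vs low-quality habitat)
    K_CAP = np.array([1.0, 1.0])  # patch carrying capacities
    ALLEE = 0.05                  # growth fails below this local density
    TAC_FRACTION = 0.2            # catch set as a fraction of total abundance

    def simulate(effort_share, years=50):
        """effort_share: fraction of the TAC taken from the high-quality patch."""
        n = np.array([0.5, 0.5])
        catches = []
        for _ in range(years):
            growth = R * n * (1 - n / K_CAP)
            growth[n < ALLEE] = 0.0            # crude Allee effect
            n = n + growth
            tac = TAC_FRACTION * n.sum()
            catch = np.minimum(n, tac * np.array([effort_share, 1 - effort_share]))
            n = n - catch
            catches.append(catch.sum())
        return n.sum(), float(np.mean(catches))

    for share, label in [(0.5, "uniform effort"), (0.9, "effort concentrated in good habitat")]:
        stock, avg_catch = simulate(share)
        print(f"{label}: final stock {stock:.2f}, mean annual catch {avg_catch:.3f}")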
  3. We consider the crowdsourcing setting where, in response to the assigned tasks, agents strategically decide both how much effort to exert (from a continuum) and whether to manipulate their reports. The goal is to design payment mechanisms that (1) satisfy limited liability (all payments are non-negative), (2) reduce the principal’s cost of budget, (3) incentivize effort and (4) incentivize truthful responses. In our framework, the payment mechanism composes a performance measurement, which noisily evaluates agents’ effort based on their reports, and a payment function, which converts the scores output by the performance measurement to payments. Previous literature suggests applying a peer prediction mechanism combined with a linear payment function. This method can achieve either (1), (3) and (4), or (2), (3) and (4) in the binary effort setting. In this paper, we suggest using a rank-order payment function (tournament). Assuming Gaussian noise, we analytically optimize the rank-order payment function, and identify a sufficient statistic, sensitivity, which serves as a metric for optimizing the performance measurements. This helps us obtain (1), (2) and (3) simultaneously. Additionally, we show that adding noise to agents’ scores can preserve the truthfulness of the performance measurements under the non-linear tournament, which gives us all four objectives. Our real-data estimated agent-based model experiments show that our method can greatly reduce the payment of effort elicitation while preserving the truthfulness of the performance measurement. In addition, we empirically evaluate several commonly used performance measurements in terms of their sensitivities and strategic robustness. 
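A minimal sketch of a rank-order (tournament) payment rule over noisy performance scores, as described above; the prize schedule, noise model, and parameters are illustrative assumptions rather than the paper's optimized mechanism.

    import numpy as np

    rng = np.random.default_rng(1)

    N_AGENTS = 10

    def performance_scores(efforts, sensitivity=1.0, noise_sd=1.0):
        """Noisy performance measurement: score = sensitivity * effort + Gaussian noise."""
        return sensitivity * efforts + rng.normal(0.0, noise_sd, size=len(efforts))

    def rank_order_payments(scores, prizes):
        """Pay agents by rank: the highest score gets prizes[0], and so on.
        All prizes are non-negative, so limited liability holds by construction."""
        payments = np.zeros(len(scores))
        order = np.argsort(-scores)          # best score first
        payments[order] = prizes
        return payments

    # A simple winner-take-most prize schedule summing to a fixed budget.
    prizes = np.array([5.0, 3.0, 2.0] + [0.0] * (N_AGENTS - 3))
    efforts = rng.uniform(0.0, 1.0, N_AGENTS)   # stand-in for agents' chosen effort
    scores = performance_scores(efforts)
    payments = rank_order_payments(scores, prizes)
    print("total paid:", payments.sum(), "(equals the prize budget regardless of scores)")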
  4. Background: Epigenomic profiling assays such as ChIP-seq have been widely used to map the genome-wide enrichment profiles of chromatin-associated proteins and posttranslational histone modifications. Sequencing depth is a key parameter in experimental design and quality control. However, due to variable sequencing depth requirements across experimental conditions, it can be challenging to determine optimal sequencing depth, particularly for projects involving multiple targets or cell types. Results: We developed the peaksat R package to provide target read depth estimates for epigenomic experiments based on the analysis of peak saturation curves. We applied peaksat to establish the distinctive read depth requirements for ChIP-seq studies of histone modifications in different cell lines. Using peaksat, we were able to estimate the target read depth required per library to obtain high-quality peak calls for downstream analysis. In addition, peaksat was applied to other sequence-enrichment methods including CUT&RUN and ATAC-seq. Conclusion: peaksat addresses researchers' need to make informed decisions about whether their sequencing data have been generated to an adequate depth, and thus yield sufficiently many meaningful peaks, and, failing that, how many more reads would be required per library. peaksat is applicable to other sequence-based methods that include calling peaks in their analysis.
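A rough Python sketch of the peak-saturation idea (peaksat itself is an R package; the curve form, example counts, and 90% target below are illustrative assumptions): fit a saturating curve to peak counts from subsampled libraries and read off the depth at which peak discovery approaches its plateau.

    import numpy as np
    from scipy.optimize import curve_fit

    # Example subsampled read depths (millions of reads) and peak counts;
    # the numbers are made up for illustration.
    reads = np.array([5, 10, 20, 30, 40], dtype=float)
    peaks = np.array([8000, 14000, 21000, 24500, 26500], dtype=float)

    def saturation(x, vmax, k):
        """Michaelis-Menten-style saturation curve: peaks found at depth x."""
        return vmax * x / (k + x)

    (vmax, k), _ = curve_fit(saturation, reads, peaks, p0=[30000, 20])

    # Depth needed to recover, say, 90% of the asymptotic peak count:
    target_fraction = 0.9
    target_depth = k * target_fraction / (1 - target_fraction)
    print(f"estimated asymptote: {vmax:.0f} peaks; "
          f"depth for {target_fraction:.0%} of peaks: {target_depth:.1f}M reads")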
  5. Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
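A compact sketch of the two-step idea with simulated data (hypothetical variables, a generic gradient-boosting model, and a simple residual difference-in-means standing in for the paper's flexible design-based adjustment): train a predictor on observational records, generate predictions for RCT participants, and estimate the effect from prediction residuals, which randomization keeps unbiased regardless of model quality.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(2)

    # Step 1: train a predictor of the outcome on observational (non-RCT) records.
    n_obs, n_rct, p = 5000, 400, 5
    beta = np.array([1.0, 0.5, 0.0, -0.5, 0.2])
    X_obs = rng.normal(size=(n_obs, p))
    y_obs = X_obs @ beta + rng.normal(size=n_obs)
    model = GradientBoostingRegressor().fit(X_obs, y_obs)

    # Step 2: a simulated RCT with a true effect of 0.3 and random assignment.
    X_rct = rng.normal(size=(n_rct, p))
    z = rng.integers(0, 2, n_rct)
    y_rct = X_rct @ beta + 0.3 * z + rng.normal(size=n_rct)

    # Covariate adjustment: difference in means of prediction residuals.
    # Because predictions never depend on treatment assignment, randomization
    # alone justifies unbiasedness, whether or not the model is "correct".
    resid = y_rct - model.predict(X_rct)
    unadjusted = y_rct[z == 1].mean() - y_rct[z == 0].mean()
    adjusted = resid[z == 1].mean() - resid[z == 0].mean()
    print(f"unadjusted estimate: {unadjusted:.3f}, residual-adjusted estimate: {adjusted:.3f}")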