Title: Uncertainty over Uncertainty: Investigating the Assumptions, Annotations, and Text Measurements of Economic Policy Uncertainty
Methods and applications are inextricably linked in science, and in particular in the domain of text-as-data. In this paper, we examine one such text-as-data application, an established economic index that measures economic policy uncertainty from keyword occurrences in news. This index, which is shown to correlate with firm investment, employment, and excess market returns, has had substantive impact in both the private sector and academia. Yet, as we revisit and extend the original authors’ annotations and text measurements, we find interesting text-as-data methodological research questions: (1) Are annotator disagreements a reflection of ambiguity in language? (2) Do alternative text measurements correlate with one another and with measures of external predictive validity? We find for this application (1) some annotator disagreements of economic policy uncertainty can be attributed to ambiguity in language, and (2) switching measurements from keyword-matching to supervised machine learning classifiers results in low correlation, a concerning implication for the validity of the index.
Award ID(s):
1845576
NSF-PAR ID:
10281911
Journal Name:
Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science
Page Range / eLocation ID:
116-131
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Discourse parsing has proven to be useful for a number of NLP tasks that require complex reasoning. However, over a decade since the advent of the Penn Discourse Treebank, predicting implicit discourse relations in text remains challenging. There are several possible reasons for this, and we hypothesize that models should be exposed to more context as it plays an important role in accurate human annotation; meanwhile adding uncertainty measures can improve model accuracy and calibration. To thoroughly investigate this phenomenon, we perform a series of experiments to determine 1) the effects of context on human judgments, and 2) the effect of quantifying uncertainty with annotator confidence ratings on model accuracy and calibration (which we measure using the Brier score (Brier et al, 1950)). We find that including annotator accuracy and confidence improves model accuracy, and incorporating confidence in the model’s temperature function can lead to models with significantly better-calibrated confidence measures. We also find some insightful qualitative results regarding human and model behavior on these datasets. 
  2. When groups of people are tasked with making a judgment, the issue of uncertainty often arises. Existing methods to reduce uncertainty typically focus on iteratively improving specificity in the overall task instruction. However, uncertainty can arise from multiple sources, such as ambiguity of the item being judged due to limited context, or disagreements among the participants due to different perspectives and an under-specified task. A one-size-fits-all intervention may be ineffective if it is not targeted to the right source of uncertainty. In this paper we introduce a new workflow, Judgment Sieve, to reduce uncertainty in tasks involving group judgment in a targeted manner. By utilizing measurements that separate different sources of uncertainty during an initial round of judgment elicitation, we can then select a targeted intervention adding context or deliberation to most effectively reduce uncertainty on each item being judged. We test our approach on two tasks: rating word pair similarity and toxicity of online comments, showing that targeted interventions reduced uncertainty for the most uncertain cases. In the top 10% of cases, we saw an ambiguity reduction of 21.4% and 25.7%, and a disagreement reduction of 22.2% and 11.2% for the two tasks respectively. We also found through a simulation that our targeted approach reduced the average uncertainty scores for both sources of uncertainty as opposed to uniform approaches where reductions in average uncertainty from one source came with an increase for the other. 
  3. Introduction: Tuning of lower-limb (LL) robotic prosthesis control is necessary to provide personalised assistance to each human wearer during walking. Prosthesis wearers’ adaptation processes are subjective and the efficiency largely depends on one’s mental processes. Therefore, beyond physical motor performance, prosthesis personalisation should consider the wearer’s preference and cognitive performance during walking. As a first step, it is necessary to examine the current measures of cognitive performance when a wearer walks with an LL prosthesis, identify the gaps and methodological considerations, and explore additional measures in a walking setting. In this protocol, we outline a scoping review that will systematically summarise and evaluate the measures of cognitive performance during walking with and without LL prosthesis. Methods and analysis: The review process will be guided and documented by CADIMA, an open-access online data management portal for evidence synthesis. Keyword searches will be conducted in seven databases (Web of Science, MEDLINE, BIOSIS, SciELO Citation Index, ProQuest, CINAHL and PsycINFO) up to 2020 supplemented with grey literature searches. Retrieved records will be screened by at least two independent reviewers on the title-and-abstract level and then the full-text level. Selected studies will be evaluated for reporting bias. Data on sample characteristics, type of cognitive function, characteristics of cognitive measures, task prioritisation, experimental design and walking setting will be extracted. Ethics and dissemination: This scoping review will evaluate the measures used in previously published studies and thus does not require ethical approval. The results will contribute to the advancement of prosthesis tuning processes by reviewing the application status of cognitive measures during walking with and without prosthesis and laying the foundation for developing needed measures for cognitive assessment during walking. The results will be disseminated through conferences and journals.
  4. Uncertainty quantification (UQ) in metal additive manufacturing (AM) has attracted tremendous interest in order to dramatically improve product reliability. Model-based UQ, which relies on the validity of a computational model, has been widely explored as a potential substitute for the time-consuming and expensive UQ solely based on experiments. However, its adoption in the practical AM process requires overcoming two main challenges: (1) the inaccurate knowledge of uncertainty sources and (2) the intrinsic uncertainty associated with the computational model. Here, we propose a data-driven framework to tackle these two challenges by combining high throughput physical/surrogate model simulations and the AM-Bench experimental data from the National Institute of Standards and Technology (NIST). We first construct a surrogate model, based on high throughput physical simulations, for predicting the three-dimensional (3D) melt pool geometry and its uncertainty with respect to AM parameters and uncertainty sources. We then employ a sequential Bayesian calibration method to perform experimental parameter calibration and model correction to significantly improve the validity of the 3D melt pool surrogate model. The application of the calibrated melt pool model to UQ of the porosity level, an important quality factor, of AM parts, demonstrates its potential use in AM quality control. The proposed UQ framework can be generally applicable to different AM processes, representing a significant advance toward physics-based quality control of AM products.
  5. Body condition is a crucial and indicative measure of an animal’s fitness, reflecting overall foraging success, habitat quality, and balance between energy intake and energetic investment toward growth, maintenance, and reproduction. Recently, drone-based photogrammetry has provided new opportunities to obtain body condition estimates of baleen whales in one, two or three dimensions (1D, 2D, and 3D, respectively) – a single width, a projected dorsal surface area, or a body volume measure, respectively. However, no study to date has compared variation among these methods and described how measurement uncertainty scales across these dimensions. This associated uncertainty may affect inference derived from these measurements, which can lead to misinterpretation of data, and lack of comparison across body condition measurements restricts comparison of results between studies. Here we develop a Bayesian statistical model using known-sized calibration objects to predict the length and width measurements of unknown-sized objects (e.g., a whale). We use the fitted model to predict and compare uncertainty associated with 1D, 2D, and 3D photogrammetry-based body condition measurements of blue, humpback, and Antarctic minke whales – three species of baleen whales with a range of body sizes. The model outputs a posterior predictive distribution of body condition measurements and allows for the construction of highest posterior density intervals to define measurement uncertainty. We find that uncertainty does not scale linearly across multi-dimensional measurements, with 2D and 3D uncertainty increasing by a factor of 1.45 and 1.76 compared to 1D, respectively. The standardized body condition measurements are highly correlated with one another, yet 2D body area index (BAI) accounts for potential variation along the body for each species and was the most precise body condition metric. We hope this study will serve as a guide to help researchers select the most appropriate body condition measurement for their purposes and allow them to incorporate photogrammetric uncertainty associated with these measurements which, in turn, will facilitate comparison of results across studies.