Title: Combining forecasts from advisors: The impact of advice independence and verbal versus numeric format.
Past research on advice-taking has suggested that people are often insensitive to the level of advice independence when combining forecasts from advisors. However, this has primarily been tested for cases in which people receive numeric forecasts. Recent work by Mislavsky and Gaertig (2022) shows that people sometimes employ different strategies when combining verbal versus numeric forecasts about the likelihood of future events. Specifically, likelihood judgments based on two verbal forecasts (e.g., "rather likely") are more often extreme (relative to the forecasts) than are likelihood judgments based on two numeric forecasts (e.g., "70% probability"). The goal of the present research was to investigate whether advice-takers' use of combination strategies can be sensitive to advice independence when differences in independence are highly salient and whether sensitivity to advice independence depends on the format in which advice is given. In two studies, we found that advice-takers became more extreme with their own likelihood estimate when combining forecasts from advisors who use separate evidence, as opposed to the same evidence. We also found that two verbal forecasts generally resulted in more extreme combined likelihood estimates than two numeric forecasts. However, the results did not suggest that sensitivity to advice independence depends on the format of advice.
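To make the averaging-versus-extremizing contrast concrete, here is a minimal, hypothetical Python sketch (not the paper's materials or model): averaging two probability forecasts can never produce an estimate more extreme than the forecasts themselves, whereas combining in log-odds space by summing does, and summing is what a Bayesian with a uniform prior would do if the two advisors' evidence were conditionally independent.

```python
# Illustrative sketch (not the authors' model): two ways to combine
# two probability forecasts p1 and p2 about the same event.
import math

def average(p1: float, p2: float) -> float:
    """Simple average: always lies between p1 and p2."""
    return (p1 + p2) / 2

def extremize(p1: float, p2: float) -> float:
    """Sum the forecasts in log-odds space, then map back to a
    probability. Summing (rather than averaging) log-odds pushes the
    combined estimate past both inputs, which is normatively defensible
    when the advisors drew on independent evidence."""
    logit = lambda p: math.log(p / (1 - p))
    combined_log_odds = logit(p1) + logit(p2)
    return 1 / (1 + math.exp(-combined_log_odds))

p1, p2 = 0.70, 0.70
print(average(p1, p2))    # 0.7  -- no more extreme than either forecast
print(extremize(p1, p2))  # ~0.84 -- more extreme than both forecasts
```

Under this reading, sensitivity to advice independence matters because extremizing is only warranted when the advisors' evidence is (approximately) independent; with shared evidence, the second forecast adds no new information and averaging is the safer strategy.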
Award ID(s):
1851738
PAR ID:
10535270
Author(s) / Creator(s):
Publisher / Repository:
APA
Date Published:
Journal Name:
Journal of Experimental Psychology: General
Volume:
153
Issue:
8
ISSN:
0096-3445
Page Range / eLocation ID:
2088 to 2099
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Computational models of verbal analogy and relational similarity judgments can employ different types of vector representations of word meanings (embeddings) generated by machine-learning algorithms. An important question is whether human-like relational processing depends on explicit representations of relations (i.e., representations separable from those of the concepts being related), or whether implicit relation representations suffice. Earlier machine-learning models produced static embeddings for individual words, identical across all contexts. However, more recent Large Language Models (LLMs), which use transformer architectures applied to much larger training corpora, are able to produce contextualized embeddings that have the potential to capture implicit knowledge of semantic relations. Here we compare multiple models based on different types of embeddings to human data concerning judgments of relational similarity and solutions of verbal analogy problems. For two datasets, a model that learns explicit representations of relations, Bayesian Analogy with Relational Transformations (BART), captured human performance more successfully than either a model using static embeddings (Word2vec) or models using contextualized embeddings created by LLMs (BERT, RoBERTa, and GPT-2). These findings support the proposal that human thinking depends on representations that separate relations from the concepts they relate. 
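As background on what "implicit relation representations" means operationally, here is a toy sketch with made-up vectors (not the paper's BART model or evaluation code, and the numbers are invented for illustration): with static embeddings, a relation between two words is often represented only implicitly as the difference of their vectors, and relational similarity as the cosine between those offsets.

```python
# Toy sketch of implicit relation representation with static embeddings:
# the relation in a word pair is just the offset (difference) vector,
# and relational similarity is the cosine between two such offsets.
# These vectors are fabricated; a real test would use pretrained
# Word2vec or LLM embeddings.
import numpy as np

emb = {
    "king":  np.array([0.8, 0.1, 0.6]),
    "man":   np.array([0.7, 0.0, 0.1]),
    "queen": np.array([0.3, 0.9, 0.6]),
    "woman": np.array([0.2, 0.8, 0.1]),
}

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def relational_similarity(pair_a, pair_b) -> float:
    """Cosine between the offset vectors of two word pairs."""
    offset_a = emb[pair_a[0]] - emb[pair_a[1]]
    offset_b = emb[pair_b[0]] - emb[pair_b[1]]
    return cosine(offset_a, offset_b)

# High offset similarity suggests the pairs share a relation
# (i.e., the analogy king:man :: queen:woman holds).
print(relational_similarity(("king", "man"), ("queen", "woman")))  # ~1.0
```

By contrast, an explicit-relation model such as BART learns a separate representation of the relation itself from examples of word pairs, rather than relying on whatever structure happens to fall out of the embedding offsets.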
  2. Worthy, Darrell A. (Ed.)
    When making decisions involving risk, people may learn about the risk from descriptions or from experience. The description-experience gap refers to the difference in decision patterns driven by this discrepancy in learning format. Across two experiments, we investigated whether learning from description versus experience differentially affects the direction and the magnitude of a context effect in risky decision making. In Studies 1 and 2, a computerized game called the Decisions about Risk Task (DART) was used to measure people’s risk-taking tendencies toward hazard stimuli that exploded probabilistically. The rate at which a context hazard caused harm was manipulated, while the rate at which a focal hazard caused harm was held constant. The format by which this information was learned was also manipulated; it was learned primarily by experience or by description. The results revealed that participants’ behavior toward the focal hazard varied depending on what they had learned about the context hazard. Specifically, there were contrast effects in which participants were more likely to choose a risky behavior toward the focal hazard when the harm rate posed by the context hazard was high rather than low. Critically, these contrast effects were of similar strength irrespective of whether the risk information was learned from experience or description. Participants’ verbal assessments of risk likelihood also showed contrast effects, irrespective of learning format. Although risk information about a context hazard in DART does nothing to affect the objective expected value of risky versus safe behaviors toward focal hazards, it did affect participants’ perceptions and behaviors—regardless of whether the information was learned from description or experience. Our findings suggest that context has a broad-based role in how people assess and make decisions about hazards.
  3. Many people report experiencing their thoughts in the form of natural language, i.e., they experience ‘inner speech’. At present, there exist few ways of quantifying this tendency, making it difficult to investigate whether the propensity to verbalize one’s thoughts predicts objective cognitive function or whether it is merely epiphenomenal. We present a new instrument—the Internal Representation Questionnaire (IRQ)—for quantifying the subjective format of internal thoughts. The primary goal of the IRQ is to assess whether people vary in their stated use of visual and verbal strategies in their internal representations. Exploratory analyses revealed four factors: a propensity to form visual images, a propensity to form verbal images, a general mental manipulation factor, and an orthographic imagery factor. Here, we describe the properties of the IRQ and report an initial test of its predictive validity by relating it to a speeded picture/word verification task involving pictorial, written, and auditory verbal cues.
  4. When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments, along with concerns and issues that arise. We devise a human–machine collaboration spectrum that allows us to categorize different relevance judgment strategies based on how much humans rely on machines. For the extreme point of ‘fully automated judgments’, we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing opposing perspectives for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers.
  5. Rand, David (Ed.)
    Innumeracy (lack of math skills) among nonscientists often leads climate scientists and others to avoid communicating numbers, due to concerns that the public will not understand them and may disengage. However, people often report preferring to receive numbers, and providing them can improve decisions. Here, we demonstrated that the presence vs. absence of at least one Arabic integer in climate-related social-media posts increased sharing by up to 31.7% but, counter to hypothesis, decreased liking of messages by 5.2% in two preregistered observational studies (climate scientists on Twitter, N > 8 million Tweets; climate subreddit, N > 17,000 posts and comments). We speculated that the decreased liking was due not to reduced engagement but to more negative feelings toward climate-related content described with numeric precision. A preregistered within-participant experiment (N = 212) then varied whether climate consequences were described using Arabic integers (e.g., “90%”) or another format (e.g., verbal terms, “almost all”). The presence of Arabic integers about consequences led to more sharing, wanting to find out more, and greater trust in and perceived expertise of the messenger; perceived trust and expertise appeared to mediate the effects on sharing and wanting to find out more. Arabic integers about consequences again led to more negative feelings about the Tweets, as if numbers clarified the dismaying magnitude of climate threats. Our results indicate that harnessing the power of numbers could increase public trust and concern regarding this defining issue of our time. Communicators, however, should also consider counteracting the associated negative feelings—which could halt action—by providing feasible solutions that increase people’s self-efficacy.