Artificial Intelligence (AI) is a transformative force in communication and messaging strategy, with potential to disrupt traditional approaches. Large language models (LLMs), a form of AI, are capable of generating high-quality, humanlike text. We investigate the persuasive quality of AI-generated messages to understand how AI could impact public health messaging. Specifically, through a series of studies designed to characterize and evaluate generative AI in developing public health messages, we analyze COVID-19 pro-vaccination messages generated by GPT-3, a state-of-the-art instantiation of a large language model. Study 1 is a systematic evaluation of GPT-3's ability to generate pro-vaccination messages. Study 2 then compared people's perceptions of curated GPT-3-generated messages with human-authored messages released by the CDC (Centers for Disease Control and Prevention), finding that GPT-3 messages were perceived as more effective, as stronger arguments, and as evoking more positive attitudes than CDC messages. Finally, Study 3 assessed the role of source labels on perceived quality, finding that while participants preferred AI-generated messages, they expressed dispreference for messages that were labeled as AI-generated. The results suggest that, with human supervision, AI can be used to create effective public health messages, but that individuals prefer their public health messages to come from human institutions rather than AI sources. We propose best practices for assessing generative outputs of large language models in future social science research and ways health professionals can use AI systems to augment public health messaging.
The debate over understanding in AI’s large language models
We survey a current, heated debate in the artificial intelligence (AI) research community on whether large pretrained language models can be said to understand language—and the physical and social situations language encodes—in any humanlike sense. We describe arguments that have been made for and against such understanding and key questions for the broader sciences of intelligence that have arisen in light of these arguments. We contend that an extended science of intelligence can be developed that will provide insight into distinct modes of understanding, their strengths and limitations, and the challenge of integrating diverse forms of cognition.
- Award ID(s): 2020103
- PAR ID: 10486508
- Publisher / Repository: National Academy of Sciences
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 120
- Issue: 13
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Using poetic metaphors in the Serbian language, we identified systematic variations in the impact of fluid and crystalized intelligence on comprehension of metaphors that varied in rated aptness and familiarity. Overall, comprehension scores were higher for metaphors that were high rather than low in aptness, and high rather than low in familiarity. A measure of crystalized intelligence was a robust predictor of comprehension across the full range of metaphors, but especially for those that were either relatively unfamiliar or more apt. In contrast, individual differences associated with fluid intelligence were clearly found only for metaphors that were low in aptness. Superior verbal knowledge appears to be particularly important when trying to find meaning in novel metaphorical expressions, and also when exploring the rich interpretive potential of apt metaphors. The broad role of crystalized intelligence in metaphor comprehension is consistent with the view that metaphors are largely understood using semantic integration processes continuous with those that operate in understanding literal language.
-
Understanding and communicating uncertainty is a key skill needed in the practice of science. However, there has been little research on the instruction of uncertainty in undergraduate science education. Our team designed a module within an online geoscience field course which focused on explicit instruction around uncertainty and provided students with an uncertainty rating scale to record and communicate their uncertainty with a common language. Students then explored a complex, real-world geological problem about which expert scientists had previously made competing claims through geologic maps. Provided with data, expert uncertainty ratings, and the previous claims, students made new geologic maps of their own and presented arguments about their claims in written form. We analyzed these reports along with assessments of uncertainty. Most students explicitly requested geologists' uncertainty judgments in a post-course assessment when asked why scientists might differ in their conclusions and/or utilized the rating scale unprompted in their written arguments. Through the examination of both pre- and post-course assessments of uncertainty and students' course-based assessments, we argue that explicit instruction around uncertainty can be introduced during undergraduate coursework and could facilitate geoscience novices developing into practicing geoscientists.
-
Second language learners studying languages with a diverse set of prepositions often find preposition usage difficult to master, which can manifest in second language writing as preposition errors that appear to result from transfer from a native language, or interlingual errors. We envision a digital writing assistant for language learners and teachers that can provide targeted feedback on these errors. To address these errors, we turn to the task of preposition error detection, which remains an open problem despite the many methods that have been proposed. In this paper, we explore various classifiers, with and without neural network-based features, and finetuned BERT models for detecting preposition errors between verbs and their noun arguments.
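The abstract above describes classifiers that detect preposition errors between verbs and their noun arguments. As a rough illustration of the simplest version of that idea (not the authors' actual method, and far weaker than the BERT models they explore), the sketch below flags a verb–preposition pairing as a likely error when it is never attested in a small corpus of correct triples. The corpus and all names here are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of correct (verb, preposition, noun) triples.
CORPUS = [
    ("depend", "on", "context"),
    ("depend", "on", "weather"),
    ("listen", "to", "music"),
    ("listen", "to", "teacher"),
    ("arrive", "at", "station"),
    ("arrive", "in", "city"),
]

def build_model(corpus):
    """Count which prepositions are attested after each verb."""
    counts = defaultdict(Counter)
    for verb, prep, _noun in corpus:
        counts[verb][prep] += 1
    return counts

def flag_error(model, verb, prep):
    """Flag a preposition as a likely error if it is never attested
    with the verb; abstain (return False) for unseen verbs."""
    seen = model.get(verb)
    if seen is None:
        return False
    return prep not in seen

model = build_model(CORPUS)
print(flag_error(model, "depend", "of"))  # True: "depend of" looks like an interlingual error
print(flag_error(model, "depend", "on"))  # False: attested usage
```

A real detector would replace the attestation lookup with a learned classifier over contextual features, which is where the neural and BERT-based variants mentioned in the abstract come in.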
-
Concurrent objects form the foundation of many applications that exploit multicore architectures, and their importance has led to informal correctness arguments, as well as formal proof systems. Correctness arguments (as found in the distributed computing literature) give intuitive descriptions of a few canonical executions or scenarios, often each with only a few threads, yet it remains unknown whether these intuitive arguments have a formal grounding and extend to arbitrary interleavings over unboundedly many threads. We present a novel proof technique for concurrent objects, based around identifying a small set of scenarios (representative, canonical interleavings), formalized as the commutativity quotient of a concurrent object. We next give an expression language for defining abstractions of the quotient in the form of regular or context-free languages that enable simple proofs of linearizability. These quotient expressions organize unbounded interleavings into a form more amenable to reasoning and make explicit the relationship between implementation-level contention/interference and ADT-level transitions. We evaluate our work on numerous non-trivial concurrent objects from the literature (including the Michael-Scott queue, Elimination stack, SLS reservation queue, RDCSS and Herlihy-Wing queue). We show that quotients capture the diverse features/complexities of these algorithms, can be used even when linearization points are not straightforward, correspond to original authors' correctness arguments, and provide some new scenario-based arguments. Finally, we show that discovery of some objects' quotients reduces to two-thread reasoning and give an implementation that can derive candidate quotient expressions from source code.