- Human reasoning can often be understood as an interplay between two systems: the intuitive and associative (“System 1”) and the deliberative and logical (“System 2”). Neural sequence models—which have been increasingly successful at performing complex, structured tasks—exhibit the advantages and failure modes of System 1: they are fast and learn patterns from data, but are often inconsistent and incoherent. In this work, we seek a lightweight, training-free means of improving existing System 1-like sequence models by adding System 2-inspired logical reasoning. We explore several variations on this theme in which candidate generations from a neural sequence model are examined for logical consistency by a symbolic reasoning module, which can either accept or reject the generations. Our approach uses neural inference to mediate between the neural System 1 and the logical System 2. Results in robust story generation and grounded instruction-following show that this approach can increase the coherence and accuracy of neurally-based generations.
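The generate-and-check loop described in this abstract can be sketched as rejection sampling: a stand-in for the neural System 1 proposes candidate continuations, and a symbolic System 2 checker accepts only logically consistent ones. All names below (`propose`, `consistent`, the toy fact vocabulary) are illustrative stand-ins, not the paper's actual components.

```python
import random

# Toy "System 1": proposes the next fact in a story, sometimes incoherently.
def propose(state, rng):
    return rng.choice(state["possible_next"])

# Toy "System 2": a symbolic consistency check over accumulated facts.
def consistent(facts, new_fact):
    # Reject a fact that contradicts an earlier one.
    negation = {"door_open": "door_closed", "door_closed": "door_open"}
    return negation.get(new_fact) not in facts

def generate(state, n_facts, rng, max_tries=100):
    facts = []
    for _ in range(n_facts):
        for _ in range(max_tries):
            cand = propose(state, rng)
            if consistent(facts, cand):   # System 2 gates System 1's output
                facts.append(cand)
                break
    return facts

rng = random.Random(0)
state = {"possible_next": ["door_open", "door_closed", "lights_on"]}
story = generate(state, 5, rng)
```

The neural proposer stays untouched (training-free); coherence comes entirely from the accept/reject filter.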
- Before formal education begins, children typically acquire a vocabulary of thousands of words. This learning process requires the use of many different information sources in their social environment, including their current state of knowledge and the context in which they hear words used. How is this information integrated? We specify a developmental model according to which children consider information sources in an age-specific way and integrate them via Bayesian inference. This model accurately predicted 2–5-year-old children’s word learning across a range of experimental conditions in which they had to integrate three information sources. Model comparison suggests that the central locus of development is an increased sensitivity to individual information sources, rather than changes in integration ability. This work presents a developmental theory of information integration during language learning and illustrates how formal models can be used to make a quantitative test of the predictive and explanatory power of competing theories.
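One common way to formalize "increased sensitivity to individual information sources" is to exponentiate each cue's likelihood by an age-dependent sensitivity weight before Bayesian combination. The sketch below uses that parameterization with made-up numbers; the cue names, likelihoods, and sensitivity values are illustrative, not the paper's fitted parameters.

```python
import math

# Two candidate referents for a novel word; three information sources each
# assign a likelihood to each referent. The betas model how strongly a
# child of a given age weights each source (illustrative values only).
def integrate(likelihoods, betas):
    # posterior ∝ prod_i P_i(referent) ** beta_i, normalized over referents
    unnorm = [
        math.prod(p ** b for p, b in zip(ref_likes, betas))
        for ref_likes in likelihoods
    ]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# rows: referents; columns: prior knowledge, speaker gaze, linguistic context
likelihoods = [[0.7, 0.6, 0.8],   # referent A
               [0.3, 0.4, 0.2]]   # referent B
young = integrate(likelihoods, betas=[0.3, 0.3, 0.3])  # weak sensitivity
older = integrate(likelihoods, betas=[1.0, 1.0, 1.0])  # strong sensitivity
```

With this parameterization the integration machinery is identical at every age; only the per-source sensitivities change, matching the model-comparison result in the abstract.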
- The Abstraction and Reasoning Corpus (ARC) is a set of tasks that tests an agent’s ability to flexibly solve novel problems. While most ARC tasks are easy for humans, they are challenging for state-of-the-art AI. How do we build intelligent systems that can generalize to novel situations and understand human instructions in domains such as ARC? We posit that the answer may be found by studying how humans communicate to each other in solving these tasks. We present LARC, the Language-annotated ARC: a collection of natural language descriptions by a group of human participants, unfamiliar both with ARC and with each other, who instruct each other on how to solve ARC tasks. LARC contains successful instructions for 88% of the ARC tasks. We analyze the collected instructions as ‘natural programs’, finding that most natural program concepts have analogies in typical computer programs. However, unlike how one precisely programs a computer, we find that humans both anticipate and exploit ambiguities to communicate effectively. We demonstrate that a state-of-the-art program synthesis technique, which leverages the additional language annotations, outperforms its language-free counterpart.
- Recent advances in computational cognitive science (i.e., simulation-based probabilistic programs) have paved the way for significant progress in formal, implementable models of pragmatics. Rather than describing a pragmatic reasoning process in prose, these models formalize and implement one, deriving both qualitative and quantitative predictions of human behavior—predictions that consistently prove correct, demonstrating the viability and value of the framework. The current paper provides a practical introduction to and critical assessment of the Bayesian Rational Speech Act modeling framework, unpacking theoretical foundations, exploring technological innovations, and drawing connections to issues beyond current applications.
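The Rational Speech Act framework this abstract introduces has a compact standard core: a literal listener conditions on truth, a speaker soft-maximizes informativity, and a pragmatic listener inverts the speaker. Here is a minimal sketch on the classic scalar-implicature example ("some" vs. "all"); it illustrates the general recursion, not code from the paper.

```python
# Minimal RSA recursion on the "some"/"all" scalar-implicature example.
ALPHA = 1.0
worlds = ["some_not_all", "all"]
utterances = ["some", "all"]
# Literal semantics: "some" is true in both worlds, "all" only in "all".
meaning = {("some", "some_not_all"): 1, ("some", "all"): 1,
           ("all", "some_not_all"): 0, ("all", "all"): 1}

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def literal_listener(u):              # L0(w | u) ∝ [[u]](w) * P(w)
    return normalize({w: meaning[(u, w)] for w in worlds})

def speaker(w):                       # S1(u | w) ∝ L0(w | u) ** alpha
    return normalize({u: literal_listener(u)[w] ** ALPHA
                      for u in utterances if meaning[(u, w)]})

def pragmatic_listener(u):            # L1(w | u) ∝ S1(u | w) * P(w)
    return normalize({w: speaker(w).get(u, 0) for w in worlds})

l1 = pragmatic_listener("some")
```

Hearing "some", the pragmatic listener infers "some but not all" with probability 0.75: the speaker would have said "all" if it were true, so "some" is strengthened beyond its literal meaning.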
- Human communication involves far more than words; speakers’ utterances are often accompanied by various kinds of emotional expressions. How do listeners represent and integrate these distinct sources of information to make communicative inferences? We first show that people, as listeners, integrate both verbal and emotional information when inferring true states of the world and others’ communicative goals, and then present computational models that formalize these inferences by considering different ways in which these signals might be generated. Results suggest that while listeners understand that utterances and emotional expressions are generated by a balance of speakers’ informational and social goals, they additionally consider the possibility that emotional expressions are noncommunicative signals that directly reflect the speaker’s internal states. These results are consistent with the predictions of a probabilistic model that integrates goal inferences with linguistic and emotional signals, moving us towards a more complete formal theory of human communicative reasoning.
- Language is a remarkably efficient tool for transmitting information. Yet human speakers make statements that are inefficient, imprecise, or even contrary to their own beliefs, all in the service of being polite. What rational machinery underlies polite language use? Here, we show that polite speech emerges from the competition of three communicative goals: to convey information, to be kind, and to present oneself in a good light. We formalize this goal tradeoff using a probabilistic model of utterance production, which predicts human utterance choices in socially sensitive situations with high quantitative accuracy, and we show that our full model is superior to its variants with subsets of the three goals. This utility-theoretic approach to speech acts takes a step toward explaining the richness and subtlety of social language use.
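The three-goal competition can be sketched as a weighted speaker utility with softmax choice: each candidate utterance gets an informational, a social (kindness), and a self-presentational utility, and the weights set the tradeoff. The utterances, utility values, and weights below are illustrative, not the paper's fitted model.

```python
import math

# Speaker scores each utterance by a weighted sum of three goal utilities,
# then chooses softmax-optimally (illustrative numbers throughout).
def speaker_choice(utilities, weights, temperature=1.0):
    scores = [sum(w * u for w, u in zip(weights, us)) for us in utilities]
    exps = [math.exp(s / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Candidate utterances about a mediocre performance:
# (informational, kind, self-presentational) utilities
utterances = ["It was terrible", "It wasn't amazing", "It was amazing"]
utilities = [(1.0, 0.0, 0.4),   # blunt truth
             (0.8, 0.6, 0.8),   # polite hedge
             (0.0, 1.0, 0.2)]   # white lie

honest = speaker_choice(utilities, weights=(1.0, 0.0, 0.0))
polite = speaker_choice(utilities, weights=(0.4, 0.4, 0.2))
```

A purely informational speaker favors the blunt truth, while a speaker balancing all three goals favors the hedge, which is the qualitative signature of polite indirectness described above.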
- Understanding a gradable adjective (e.g., big) requires making reference to a comparison class, a set of objects or entities against which the referent is implicitly compared (e.g., big for a Great Dane), but how do listeners decide upon a comparison class? Simple models of semantic composition stipulate that the adjective combines with a noun, which necessarily becomes the comparison class (e.g., “That Great Dane is big” means big for a Great Dane). We investigate an alternative hypothesis built on the idea that the utility of a noun in an adjectival utterance can be either for reference (getting the listener to attend to the right object) or predication (describing a property of the referent). Therefore, we hypothesize that when the presence of a noun N can be explained away by its utility in reference (e.g., being in the subject position: “That N is big”), it is less likely to set the comparison class. Across three pre-registered experiments, we find evidence that listeners use the noun as a cue to infer comparison classes consistent with a trade-off between reference and predication. This work highlights the complexity of the relation between the form of an utterance and its meaning.
- Generic language (e.g., “Birds fly”) conveys generalizations about categories and is essential for learning beyond our direct experience. The meaning of generic language is notoriously hard to specify, however (e.g., penguins don’t fly). Tessler and Goodman (2019) proposed a model for generics that is mathematically equivalent to Bayesian belief-updating based on a single pedagogical example, suggesting a deep connection between learning from experience and learning from language. Relatedly, Csibra and Shamsudheen (2015) argue that generics are inherently pedagogical, understood by infants as referring to a member of a kind. In two experiments with adults, we quantify the exchange-rate between generics and observations by relating their belief-updating capacity, varying both the number of observations and whether they are presented pedagogically or incidentally. We find generics convey stronger generalizations than single pedagogical observations (Expt. 1), even when the property is explicitly demarcated (Expt. 2). We suggest revisions to the vague quantifier model of generics that would allow it to accommodate this intriguing exchange-rate.
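The "exchange rate" idea can be illustrated with conjugate Beta-Bernoulli belief updating: n positive examples of a property shift a Beta(a, b) belief about its prevalence to Beta(a + n, b), so a generic that moves beliefs more than one observation behaves like several observations. The prior and the equivalent count k below are illustrative, not the paper's model or fitted values.

```python
# Beta-Bernoulli sketch of the generic-vs-observation exchange rate.
def posterior_mean(a, b, n_positive):
    # Mean of Beta(a + n_positive, b): expected prevalence after n examples.
    return (a + n_positive) / (a + n_positive + b)

prior_a, prior_b = 1.0, 1.0            # uniform prior on prevalence
one_example = posterior_mean(prior_a, prior_b, 1)
# Suppose a generic acts like k pedagogical examples, with k > 1 per the
# finding that generics outstrip a single pedagogical observation:
k = 3
generic_like = posterior_mean(prior_a, prior_b, k)
```

Under this sketch, quantifying the exchange rate amounts to finding the k whose posterior matches the belief shift produced by hearing the generic.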
- 
            One hallmark of human reasoning is that we can bring to bear a diverse web of common-sense knowledge in any situation. The vastness of our knowledge poses a challenge for the practical implementation of reasoning systems as well as for our cognitive theories – how do people represent their common-sense knowledge? On the one hand, our best models of sophisticated reasoning are top-down, making use primarily of symbolically-encoded knowledge. On the other, much of our understanding of the statistical properties of our environment may arise in a bottom-up fashion, for example through asso- ciationist learning mechanisms. Indeed, recent advances in AI have enabled the development of billion-parameter language models that can scour for patterns in gigabytes of text from the web, picking up a surprising amount of common-sense knowledge along the way—but they fail to learn the structure of coherent reasoning. We propose combining these approaches, by embedding language-model-backed primitives into a state- of-the-art probabilistic programming language (PPL). On two open-ended reasoning tasks, we show that our PPL models with neural knowledge components characterize the distribution of human responses more accurately than the neural language models alone, raising interesting questions about how people might use language as an interface to common-sense knowledge, and suggesting that building probabilistic models with neural language-model components may be a promising approach for more human-like AI.more » « less
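The shape of a language-model-backed primitive inside a probabilistic program can be sketched with likelihood weighting: samples from the program are reweighted by a plausibility score that would, in the real system, come from a language model. Here `lm_score` is a hard-coded stand-in for an actual LM, and the whole setup is illustrative rather than the paper's PPL.

```python
import random

# Stand-in for an LM's judged plausibility of a common-sense statement.
def lm_score(proposition):
    table = {"ice is cold": 0.99, "ice is hot": 0.01}
    return table[proposition]

def model(rng):
    # A latent common-sense fact with an LM-backed prior: sample uniformly,
    # then weight the sample by the LM's plausibility score.
    fact = "ice is cold" if rng.random() < 0.5 else "ice is hot"
    return fact, lm_score(fact)

def posterior(n_samples, rng):
    # Plain likelihood weighting over the LM-scored samples.
    totals = {}
    for _ in range(n_samples):
        fact, w = model(rng)
        totals[fact] = totals.get(fact, 0.0) + w
    z = sum(totals.values())
    return {f: w / z for f, w in totals.items()}

post = posterior(2000, random.Random(0))
```

The symbolic program supplies the reasoning structure (what is sampled, how facts combine), while the neural score supplies the common-sense content, mirroring the division of labor proposed in the abstract.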