-
Abstract This paper reexamines a recent claim that Large Language Models lag behind humans in language comprehension on what were described as minimally complex statements. We argue that human performance was overestimated and LM performance underestimated. Moreover, both people and lower-performing LMs are disproportionately challenged by queries involving potentially appropriate inferences, suggesting shared pragmatic sensitivity rather than model-specific deficits. Analysis of the more sensitive log probabilities of Llama-2-70B demonstrates ceiling-level accuracy and pragmatic sensitivity. A separate group of LM grammaticality judgments previously characterized as incorrect are shown to correlate with human judgments, while certain reasoning models approximate idealized judgments when prompted to respond as an expert generative syntactician. Overall, the findings suggest that apparent deficits in LM performance may reflect task design, evaluation choices, and assumptions about human performance, rather than deficiencies in current models.
-
Abstract What have language models (LMs) learned about grammar? This question remains hotly debated, with major ramifications for linguistic theory. However, since probability and grammaticality are distinct notions in linguistics, it is not obvious what string probabilities can reveal about an LM’s underlying grammatical knowledge. We present a theoretical analysis of the relationship between grammar, meaning, and string probability, based on simple assumptions about the generative process of corpus data. Our framework makes three predictions, which we validate empirically using 280K sentence pairs in English and Chinese: (1) correlation between the probability of strings within minimal pairs, i.e., string pairs with minimal semantic differences; (2) correlation between models’ and humans’ deltas within minimal pairs; and (3) poor separation in probability space between unpaired grammatical and ungrammatical strings. Our analyses give theoretical grounding for using probability to learn about LMs’ structural knowledge, and suggest directions for future work in LM grammatical evaluation.
-
Abstract Language models can produce fluent, grammatical text. Nonetheless, some maintain that language models don’t really learn language and that, even if they did, this would not be informative for the study of human learning and processing. On the other side, there have been claims that the success of LMs obviates the need for studying linguistic theory and structure. We argue that both extremes are wrong. LMs can contribute to fundamental questions about linguistic structure, language processing, and learning. They force us to rethink arguments and ways of thinking that have been foundational in linguistics. While they do not replace linguistic structure and theory, they serve as model systems and working proofs of concept for gradient, usage-based approaches to language. We offer an optimistic take on the relationship between language models and linguistics.