
Title: One model for the learning of language
A major goal of linguistics and cognitive science is to understand what class of learning systems can acquire natural language. Until recently, the computational requirements of language have been used to argue that learning is impossible without a highly constrained hypothesis space. Here, we describe a learning system that is maximally unconstrained, operating over the space of all computations, and is able to acquire many of the key structures present in natural language from positive evidence alone. We demonstrate this by providing the same learning model with data from 74 distinct formal languages which have been argued to capture key features of language, have been studied in experimental work, or come from an interesting complexity class. The model is able to successfully induce the latent system generating the observed strings from small amounts of evidence in almost all cases, including for regular (e.g., a^n, (ab)^n, and {a,b}^+), context-free (e.g., a^n b^n, a^n b^(n+m), and x x^R), and context-sensitive (e.g., a^n b^n c^n, a^n b^m c^n d^m, and xx) languages, as well as for many languages studied in learning experiments. These results show that relatively small amounts of positive evidence can support learning of rich classes of generative computations over structures. The model provides an idealized learning setup upon which additional cognitive constraints and biases can be formalized.
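The formal languages named in the abstract are straightforward to state as string predicates. The sketch below is illustrative only; it is not the paper's learning model, which induces such generators from positive string evidence rather than being given them. The treatment of a^n as requiring n >= 1 is an assumption.

```python
# Membership tests for a few of the formal languages named in the abstract.
# Illustration only: the paper's model induces such generators from data.

def is_a_n(s):               # regular: a^n (assuming n >= 1)
    return len(s) >= 1 and set(s) <= {"a"}

def is_ab_n(s):              # regular: (ab)^n
    return len(s) % 2 == 0 and s == "ab" * (len(s) // 2)

def is_a_n_b_n(s):           # context-free: a^n b^n
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def is_x_xR(s):              # context-free: even-length palindromes x x^R
    return len(s) % 2 == 0 and s == s[::-1]

def is_a_n_b_n_c_n(s):       # context-sensitive: a^n b^n c^n
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

def is_xx(s):                # context-sensitive: copy language xx
    h = len(s) // 2
    return len(s) % 2 == 0 and s[:h] == s[h:]

if __name__ == "__main__":
    assert is_a_n_b_n("aabb") and not is_a_n_b_n("aab")
    assert is_a_n_b_n_c_n("aabbcc") and not is_a_n_b_n_c_n("aabbc")
    assert is_xx("abab") and not is_xx("abba")
    assert is_x_xR("abba") and not is_x_xR("abab")
```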
Award ID(s):
1901262
PAR ID:
10324879
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
119
Issue:
5
ISSN:
0027-8424
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Logical properties such as negation, implication, and symmetry, despite the fact that they are foundational and threaded through the vocabulary and syntax of known natural languages, pose a special problem for language learning. Their meanings are much harder to identify and isolate in the child’s everyday interaction with referents in the world than concrete things (like spoons and horses) and happenings and acts (like running and jumping) that are much more easily identified, and thus more easily linked to their linguistic labels (spoon, horse, run, jump). Here we concentrate attention on the category of symmetry [a relation R is symmetrical if and only if (iff) for all x, y: if R(x, y), then R(y, x)], expressed in English by such terms as similar, marry, cousin, and near. After a brief introduction to how symmetry is expressed in English and other well-studied languages, we discuss the appearance and maturation of this category in Nicaraguan Sign Language (NSL). NSL is an emerging language used as the primary, daily means of communication among a population of deaf individuals who could not acquire the surrounding spoken language because they could not hear it, and who were not exposed to a preexisting sign language because there was none available in their community. Remarkably, these individuals treat symmetry, in both semantic and syntactic regards, much as do learners exposed to a previously established language. These findings point to deep human biases in the structures underpinning and constituting human language.
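The bracketed definition of symmetry can be checked directly on any finite relation. The small sketch below makes the definition concrete; the example pairs are invented.

```python
# Symmetry as defined in the abstract:
# R is symmetric iff for all x, y: R(x, y) implies R(y, x).

def is_symmetric(relation):
    """relation: a finite set of (x, y) pairs."""
    return all((y, x) in relation for (x, y) in relation)

# A 'cousin'-like relation is symmetric.
assert is_symmetric({("ana", "ben"), ("ben", "ana")})
# A 'parent'-like relation is not.
assert not is_symmetric({("ana", "ben")})
```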
  2. We test the hypothesis that children acquire knowledge of the successor function — a foundational principle stating that every natural number n has a successor n + 1 — by learning the productive linguistic rules that govern verbal counting. Previous studies report that speakers of languages with less complex count list morphology have greater counting and mathematical knowledge at earlier ages in comparison to speakers of more complex languages (e.g., Miller & Stigler, 1987). Here, we tested whether differences in count list transparency affected children’s acquisition of the successor function in three languages with relatively transparent count lists (Cantonese, Slovenian, and English) and two languages with relatively opaque count lists (Hindi and Gujarati). We measured 3.5- to 6.5-year-old children’s mastery of their count list’s recursive structure with two tasks assessing productive counting, which we then related to a measure of successor function knowledge. While the more opaque languages were associated with lower counting proficiency and successor function task performance in comparison to the more transparent languages, a unique within-language analytic approach revealed a robust relationship between measures of productive counting and successor knowledge in almost every language. We conclude that learning productive rules of counting is a critical step in acquiring knowledge of the recursive successor function across languages, and that the timeline for this learning varies as a function of count list transparency.
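The idea of a productive, recursive count list can be illustrated with a toy generator. The base words and the single combination rule below are simplified inventions for illustration, not a description of any of the languages studied.

```python
# Toy illustration of a fully transparent, productive count list: base words
# plus one decade + unit rule suffice to name the successor n + 1 of any n in
# range, which is the kind of generalization the abstract links to
# successor-function knowledge. Word forms are invented for illustration.

UNITS = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
DECADES = ["", "ten", "twenty", "thirty", "forty",
           "fifty", "sixty", "seventy", "eighty", "ninety"]

def number_word(n):
    """Name 0-99 with a single productive decade + unit rule."""
    if n < 10:
        return UNITS[n]
    tens, units = divmod(n, 10)
    return DECADES[tens] if units == 0 else f"{DECADES[tens]} {UNITS[units]}"

def successor_word(n):
    """The successor of any n in range is nameable by the same rule."""
    return number_word(n + 1)

assert successor_word(39) == "forty"
assert successor_word(57) == "fifty eight"
```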
  3. Adaptability is a distinguishing feature of the human species: We thrive as hunter-gatherers, farmers, and urbanites. What properties of our brains make us highly adaptable? Here we review neuroscience studies of sensory loss, language acquisition, and cultural skills (reading, mathematics, programming). The evidence supports a flexible specialization account. On the one hand, adaptation is enabled by evolutionarily prepared flexible learning systems, both domain-specific social learning systems (e.g., language) and domain-general systems (frontoparietal reasoning). On the other hand, the functional flexibility of our neural wetware enables us to acquire cognitive capacities not selected for by evolution. Heightened plasticity during a protracted period of development enhances cognitive flexibility. Early in life, local cortical circuits are capable of acquiring a wide range of cognitive capacities. Exuberant cross-network connectivity makes it possible to combine old neural parts in new ways, enabling cognitive flexibility such as language acquisition across modalities (spoken, signed, braille) and cultural skills (math, programming). Together, these features of the human brain make it uniquely adaptable. 
  4. Environmental conservation organizations routinely monitor news content on conservation in protected areas to maintain situational awareness of developments that can have an environmental impact. Existing automated media monitoring systems require large amounts of data labeled by domain experts, which is only feasible at scale for high-resource languages like English. However, such tools are most needed in the global south where the news of interest is mainly in local low-resource languages, and far fewer experts are available to annotate datasets on a sustainable basis. In this paper, we propose NewsSerow, a method to automatically recognize environmental conservation content in low-resource languages. NewsSerow is a pipeline of summarization, in-context few-shot classification, and self-reflection using large language models (LLMs). Using at most 10 demonstration example news articles in Nepali, NewsSerow significantly outperforms other few-shot methods and can achieve comparable performance with models fully fine-tuned using thousands of examples. With NewsSerow, Organization X has been able to deploy the media monitoring tool in Nepal, significantly reducing their operational burden, and ensuring that AI tools for conservation actually reach the communities that need them the most. NewsSerow has also been deployed for countries with other languages like Colombia. 
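The abstract describes NewsSerow only at the pipeline level: summarization, in-context few-shot classification, and self-reflection with LLMs. The schematic sketch below shows one way such a pipeline could be wired together; the prompts and the call_llm helper are placeholders, not the actual NewsSerow implementation.

```python
# Schematic summarize -> few-shot classify -> self-reflect pipeline of the
# kind the abstract describes. `call_llm` is a placeholder for whatever LLM
# client is used; prompts and demonstrations are illustrative only.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def classify_article(article: str, demos: list[tuple[str, str]]) -> bool:
    # 1. Summarize the (possibly long, low-resource-language) article.
    summary = call_llm(f"Summarize this news article in two sentences:\n{article}")

    # 2. In-context few-shot classification using labeled demonstrations.
    demo_text = "\n".join(f"Article: {a}\nRelevant to conservation: {label}"
                          for a, label in demos)
    answer = call_llm(
        f"{demo_text}\nArticle: {summary}\nRelevant to conservation (yes/no):")

    # 3. Self-reflection: ask the model to double-check a positive answer.
    if answer.strip().lower().startswith("yes"):
        check = call_llm(
            "You said this article is about environmental conservation:\n"
            f"{summary}\nAre you sure? Answer yes or no.")
        return check.strip().lower().startswith("yes")
    return False
```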
  5. Ettinger, Allyson; Hunter, Tim; Prickett, Brandon (Ed.)
    We present the first application of modern neural networks to the well-studied task of learning word stress systems. We tested our adaptation of a sequence-to-sequence network on the Tesar and Smolensky test set of 124 “languages”, showing that it acquires generalizable representations of stress patterns in a very high proportion of runs. We also show that it learns restricted lexically conditioned patterns, known as stress windows. The ability of this model to acquire lexical idiosyncrasies, which are very common in natural language systems, sets it apart from past, non-neural models tested on the Tesar and Smolensky data set.
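As a rough illustration of the kind of neural sequence model described above (the paper's exact sequence-to-sequence architecture may differ), the PyTorch sketch below tags each syllable in a string of syllable weights with a stress value. The symbol inventory, layer sizes, and toy training example are assumptions for illustration.

```python
# Minimal sketch of a neural model for word-stress learning: syllable weights
# in (L = light, H = heavy), per-syllable stress labels out (0/1). This is a
# simplified bidirectional-LSTM tagger standing in for the paper's
# sequence-to-sequence network; inventory, sizes, and data are assumptions.
import torch
import torch.nn as nn

SYLLS = {"L": 0, "H": 1}          # syllable weight inventory (assumed)
N_STRESS = 2                      # unstressed vs. stressed

class StressTagger(nn.Module):
    def __init__(self, emb=16, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(len(SYLLS), emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, N_STRESS)

    def forward(self, x):          # x: (batch, seq_len) weight ids
        h, _ = self.rnn(self.emb(x))
        return self.out(h)         # (batch, seq_len, N_STRESS) logits

# One training step on a toy "language" with fixed initial stress: LLL -> 100.
model = StressTagger()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.tensor([[SYLLS["L"], SYLLS["L"], SYLLS["L"]]])
y = torch.tensor([[1, 0, 0]])
loss = nn.functional.cross_entropy(model(x).view(-1, N_STRESS), y.view(-1))
loss.backward()
opt.step()
```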