Title: Philippine Psycholinguistics
Over the last decade, there has been a slow but steady accumulation of psycholinguistic research focusing on typologically diverse languages. In this review, we provide an overview of the psycholinguistic research on Philippine languages at the sentence level. We first discuss the grammatical features of these languages that figure prominently in existing research. We identify four linguistic domains that have received attention from language researchers and summarize the empirical terrain. We advance two claims that emerge across these different domains: (a) The agent-first pressure plays a central role in many of the findings, and (b) the generalization that the patient argument is the syntactically privileged argument cannot be reduced to frequency, but instead is an emergent phenomenon caused by the alignment of competing pressures toward an optimal candidate. We connect these language-specific claims to language-general theories of sentence processing.
Award ID(s):
2204112
PAR ID:
10556628
Author(s) / Creator(s):
Publisher / Repository:
Annual Review
Date Published:
Journal Name:
Annual Review of Linguistics
Volume:
10
Issue:
1
ISSN:
2333-9683
Page Range / eLocation ID:
145 to 167
Subject(s) / Keyword(s):
Tagalog, Philippine languages, field psycholinguistics, agent-first, patient primacy
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Research in the social sciences and psychology has shown that the persuasiveness of an argument depends not only on the language employed, but also on attributes of the source/communicator, the audience, and the appropriateness and strength of the argument’s claims given the pragmatic and discourse context of the argument. Among these characteristics of persuasive arguments, prior work in NLP does not explicitly investigate the effect of the pragmatic and discourse context when determining argument quality. This paper presents a new dataset to initiate the study of this aspect of argumentation: it consists of a diverse collection of arguments covering 741 controversial topics and comprising over 47,000 claims. We further propose predictive models that incorporate the pragmatic and discourse context of argumentative claims and show that they outperform models that rely only on claim-specific linguistic features for predicting the perceived impact of individual claims within a particular line of argument.
  2. Event concepts of common verbs (e.g. eat, sleep) can be broadly shared across languages, but a given language’s rules for subcategorization are largely arbitrary and vary substantially across languages. When subcategorization information does not match between first language (L1) and second language (L2), how does this mismatch impact L2 speakers in real time? We hypothesized that subcategorization knowledge in L1 is particularly difficult for L2 speakers to override online. Event-related potential (ERP) responses were recorded while participants read English sentences containing verbs that are ambitransitive in Mandarin but intransitive in English (*My sister listened the music). While L1 English speakers showed a prominent P600 effect to subcategorization violations, L2 English speakers whose L1 was Mandarin showed some sensitivity in offline responses but not in ERPs. This suggests that computing verb–argument relations, although seemingly one of the basic components of sentence comprehension, in fact requires accessing lexical syntax, which may be vulnerable to L1 interference in L2. However, our exploratory analysis showed that more native-like behavioral accuracy was associated with a more native-like P600 effect, suggesting that, with enough experience, L2 speakers can ultimately overcome this interference.
  3. We investigate the problem of sentence-level supporting argument detection from relevant documents for user-specified claims. A dataset containing claims and associated citation articles is collected from the online debate website idebate.org. We then manually label sentence-level supporting arguments from the documents along with their types as study, factual, opinion, or reasoning. We further characterize arguments of different types, and explore whether leveraging type information can facilitate the supporting-argument detection task. Experimental results show that a LambdaMART (Burges, 2010) ranker that uses features informed by argument types yields better performance than the same ranker trained without type information.
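The abstract above describes enriching a ranker's feature set with the annotated argument type. A minimal sketch of that idea, in which claim-specific features are concatenated with a one-hot encoding of the argument type before ranking; the feature values and function names are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch: augmenting sentence-level features with
# argument-type information, as the abstract describes. The base
# features and their values are invented for demonstration.

ARG_TYPES = ["study", "factual", "opinion", "reasoning"]

def featurize(sentence_feats, arg_type):
    """Concatenate claim-specific features with a one-hot argument type."""
    one_hot = [1.0 if t == arg_type else 0.0 for t in ARG_TYPES]
    return sentence_feats + one_hot

# Toy example: two base features (e.g. a similarity score, a length score)
vec = featurize([0.42, 0.17], "factual")
print(vec)  # [0.42, 0.17, 0.0, 1.0, 0.0, 0.0]
```

The augmented vectors would then be passed to a learning-to-rank model such as LambdaMART; the comparison in the abstract is exactly this model with versus without the one-hot type block.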
  4. Linguistic analysis is a core task in the process of documenting, analyzing, and describing endangered and less-studied languages. In addition to providing insight into the properties of the language being studied, having tools to automatically label words in a language for grammatical category and morphological features can support a range of applications useful for language pedagogy and revitalization. At the same time, most modern NLP methods for these tasks require both large amounts of data in the language and compute costs well beyond the capacity of most research groups and language communities. In this paper, we present a gloss-to-gloss (g2g) model for linguistic analysis (specifically, morphological analysis and part-of-speech tagging) that is lightweight in terms of both data requirements and computational expense. The model is designed for the interlinear glossed text (IGT) format, in which we expect the source text of a sentence in a low-resource language, a translation of that sentence into a language of wider communication, and a detailed glossing of the morphological properties of each word in the sentence. We first produce silver standard parallel glossed data by automatically labeling the high-resource translation. The model then learns to transform source language morphological labels into output labels for the target language, mediated by a structured linguistic representation layer. We test the model on both low-resource and high-resource languages, and find that our simple CNN-based model achieves comparable performance to a state-of-the-art transformer-based model, at a fraction of the computational cost. 
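To make the input format concrete, here is a minimal sketch of an interlinear glossed text (IGT) record of the kind the abstract assumes: source tokens in a low-resource language, a translation into a language of wider communication, and per-token morpheme glosses. The example sentence, glosses, and helper function are invented for illustration and are not from the paper's data:

```python
# Illustrative IGT record (invented example, Swahili-like):
# source tokens, a free translation, and per-token morpheme glosses.
igt = {
    "source": ["ni-ta-soma", "kitabu"],
    "translation": ["I", "will", "read", "a", "book"],
    "gloss": ["1SG-FUT-read", "book"],
}

def gloss_labels(entry):
    """Split each token's gloss string into its morpheme-level labels."""
    return [g.split("-") for g in entry["gloss"]]

print(gloss_labels(igt))  # [['1SG', 'FUT', 'read'], ['book']]
```

Morpheme-level labels of this kind are what a gloss-to-gloss model would learn to map between the high-resource translation side and the target language side.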
  5. A major goal of psycholinguistic theory is to account for the cognitive constraints limiting the speed and ease of language comprehension and production. Wide-ranging evidence demonstrates a key role for linguistic expectations: A word’s predictability, as measured by the information-theoretic quantity of surprisal, is a major determinant of processing difficulty. But surprisal, under standard theories, fails to predict the difficulty profile of an important class of linguistic patterns: the nested hierarchical structures made possible by recursion in human language. These nested structures are better accounted for by psycholinguistic theories of constrained working memory capacity. However, progress on theory unifying expectation-based and memory-based accounts has been limited. Here we present a unified theory of a rational trade-off between precision of memory representations and ease of prediction, a scaled-up computational implementation using contemporary machine learning methods, and experimental evidence in support of the theory’s distinctive predictions. We show that the theory makes nuanced and distinctive predictions for difficulty patterns in nested recursive structures predicted by neither expectation-based nor memory-based theories alone. These predictions are confirmed 1) in two language comprehension experiments in English, and 2) in sentence completions in English, Spanish, and German. More generally, our framework offers computationally explicit theory and methods for understanding how memory constraints and prediction interact in human language comprehension and production.
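Surprisal, the predictability measure the abstract above builds on, is simply the negative log probability of a word given its context. A minimal illustration, with invented probabilities used only for demonstration:

```python
import math

def surprisal(prob):
    """Surprisal in bits: -log2 P(word | context)."""
    return -math.log2(prob)

# A highly predictable word carries little information...
print(round(surprisal(0.5), 2))   # 1.0
# ...while an unexpected word is costly: surprisal grows as
# probability shrinks, which is the link to processing difficulty.
print(round(surprisal(0.01), 2))  # 6.64
```

Under expectation-based theories, per-word processing difficulty is taken to scale with this quantity; the abstract's point is that surprisal alone underpredicts the difficulty of deeply nested structures, which motivates combining it with memory constraints.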