Title: Noisy-Channel Processing in Standard Arabic Relative Clauses
This study investigates sentence processing in Standard Arabic (SA) by examining subject- and object-extracted relative clauses (SRCs and ORCs) through eye tracking. We test memory- and expectation-based theories of processing difficulty, and whether good-enough or noisy-channel processing leads to misinterpretations in ORCs. Our results show increased processing difficulty in ORCs, supporting expectation-based theories; however, this difficulty is not localized to the disambiguating region (the relative clause verb) as predicted, but rather to the integration of the second noun phrase (the relative clause NP). The findings support good-enough/noisy-channel processing theories, suggesting that readers may accept a noisy SRC interpretation of an ORC and thus bypass integration costs at the RC NP.
Award ID(s):
2235106
PAR ID:
10609576
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Proceedings of the Annual Meeting of the Cognitive Science Society
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This study investigates sentence processing in Standard Arabic (SA) by examining subject- and object-extracted relative clauses (SRCs and ORCs) through eye tracking. We test memory- and expectation-based theories of processing difficulty, and whether good-enough or noisy-channel processing leads to misinterpretations in ORCs. Our results show increased processing difficulty in ORCs, supporting expectation-based theories; however, this difficulty is not localized to the disambiguating region (the relative clause verb) as predicted, but rather to the integration of the second noun phrase (the relative clause NP). The findings support good-enough/noisy-channel processing theories, suggesting that readers may accept a noisy SRC interpretation of an ORC and thus bypass integration costs at the RC NP.
  2. A major goal of psycholinguistic theory is to account for the cognitive constraints limiting the speed and ease of language comprehension and production. Wide-ranging evidence demonstrates a key role for linguistic expectations: A word’s predictability, as measured by the information-theoretic quantity of surprisal, is a major determinant of processing difficulty. But surprisal, under standard theories, fails to predict the difficulty profile of an important class of linguistic patterns: the nested hierarchical structures made possible by recursion in human language. These nested structures are better accounted for by psycholinguistic theories of constrained working memory capacity. However, progress toward a theory unifying expectation-based and memory-based accounts has been limited. Here we present a unified theory of a rational trade-off between the precision of memory representations and the ease of prediction, a scaled-up computational implementation using contemporary machine learning methods, and experimental evidence in support of the theory’s distinctive predictions. We show that the theory makes nuanced and distinctive predictions for difficulty patterns in nested recursive structures predicted by neither expectation-based nor memory-based theories alone. These predictions are confirmed 1) in two language comprehension experiments in English, and 2) in sentence completions in English, Spanish, and German. More generally, our framework offers computationally explicit theory and methods for understanding how memory constraints and prediction interact in human language comprehension and production.
  3. Expectation-based theories of sentence processing posit that processing difficulty is determined by predictability in context. While predictability quantified via surprisal has gained empirical support, this representation-agnostic measure leaves open the question of how to best approximate the human comprehender's latent probability model. This article first describes an incremental left-corner parser that incorporates information about common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules as a computational-level model of sentence processing. The article then evaluates a variety of structural parsers and deep neural language models as cognitive models of sentence processing by comparing the predictive power of their surprisal estimates on self-paced reading, eye-tracking, and fMRI data collected during real-time language processing. The results show that surprisal estimates from the proposed left-corner processing model deliver comparable and often superior fits to self-paced reading and eye-tracking data when compared to those from neural language models trained on much more data. This suggests that the strong linguistic generalizations made by the proposed processing model help predict humanlike processing costs that manifest in latency-based measures, even when the amount of training data is limited. Additionally, experiments using Transformer-based language models sharing the same primary architecture and training data show a surprising negative correlation between parameter count and fit to self-paced reading and eye-tracking data. These findings suggest that large-scale neural language models are making weaker generalizations based on patterns of lexical items rather than stronger, more humanlike generalizations based on linguistic structure.
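The surprisal measure discussed in the two abstracts above can be illustrated with a minimal sketch: surprisal is the negative log probability of a word given its context, S(w) = -log2 P(w | context). The bigram counts below are invented purely for illustration and are not from either study.

```python
import math

# Hypothetical bigram counts (invented for illustration only).
bigram_counts = {
    ("the", "reporter"): 6,
    ("the", "senator"): 3,
    ("the", "admired"): 1,
}

def surprisal(context, word):
    """Surprisal in bits: -log2 P(word | context) under the toy count model."""
    total = sum(c for (ctx, _), c in bigram_counts.items() if ctx == context)
    count = bigram_counts.get((context, word), 0)
    if count == 0 or total == 0:
        return float("inf")  # unseen continuation: infinite surprisal
    return -math.log2(count / total)
```

Under expectation-based theories, a more predictable continuation ("reporter" here) carries lower surprisal and is therefore predicted to cause less processing difficulty than a less predictable one ("admired").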
  4. Bouyer, Patricia; Srinivasan, Srikanth (Ed.)
    In recent years the framework of learning from label proportions (LLP) has been gaining importance in machine learning. In this setting, the training examples are aggregated into subsets or bags and only the average label per bag is available for learning an example-level predictor. This generalizes traditional PAC learning which is the special case of unit-sized bags. The computational learning aspects of LLP were studied in recent works [R. Saket, 2021; R. Saket, 2022] which showed algorithms and hardness for learning halfspaces in the LLP setting. In this work we focus on the intractability of LLP learning Boolean functions. Our first result shows that given a collection of bags of size at most 2 which are consistent with an OR function, it is NP-hard to find a CNF of constantly many clauses which satisfies any constant-fraction of the bags. This is in contrast with the work of [R. Saket, 2021] which gave a (2/5)-approximation for learning ORs using a halfspace. Thus, our result provides a separation between constant clause CNFs and halfspaces as hypotheses for LLP learning ORs. Next, we prove the hardness of satisfying more than 1/2 + o(1) fraction of such bags using a t-DNF (i.e. DNF where each term has ≤ t literals) for any constant t. In usual PAC learning such a hardness was known [S. Khot and R. Saket, 2008] only for learning noisy ORs. We also study the learnability of parities and show that it is NP-hard to satisfy more than (q/2^{q-1} + o(1))-fraction of q-sized bags which are consistent with a parity using a parity, while a random parity based algorithm achieves a (1/2^{q-2})-approximation. 
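The learning-from-label-proportions setting described above can be made concrete with a small sketch: each example carries a hidden Boolean label, examples are grouped into bags, and the learner observes only each bag's average label. The bags and the OR target below are invented for illustration, not taken from the paper.

```python
def or_label(x):
    """Hidden target: OR over two Boolean features."""
    return int(x[0] or x[1])

# Two invented bags of size 2; the learner never sees per-example labels.
bags = [
    [(0, 0), (1, 0)],  # hidden labels 0, 1
    [(1, 1), (0, 1)],  # hidden labels 1, 1
]

# Only these bag-level averages are available for training.
bag_averages = [sum(or_label(x) for x in bag) / len(bag) for bag in bags]
```

With unit-sized bags the average is the label itself, which recovers ordinary PAC learning as the special case noted in the abstract.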
  5. Odd radio circles (ORCs) are recently discovered faint diffuse circles of radio emission, of unknown cause, surrounding galaxies at moderate redshift (z ∼ 0.2 – 0.6). Here, we present detailed new MeerKAT radio images at 1284 MHz of the first ORC, originally discovered with the Australian Square Kilometre Array Pathfinder, with higher resolution (6 arcsec) and sensitivity (∼ 2.4 μJy/beam). In addition to the new images, which reveal a complex internal structure consisting of multiple arcs, we also present polarization and spectral index maps. Based on these new data, we consider potential mechanisms that may generate the ORCs.