Title: Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities
Award ID(s):
2020945
PAR ID:
10443627
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)
Page Range / eLocation ID:
301-313
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modern deep neural networks achieve impressive performance in engineering applications that require extensive linguistic skills, such as machine translation. This success has sparked interest in probing whether these models are inducing human-like grammatical knowledge from the raw data they are exposed to and, consequently, whether they can shed new light on long-standing debates concerning the innate structure necessary for language acquisition. In this article, we survey representative studies of the syntactic abilities of deep networks and discuss the broader implications that this work has for theoretical linguistics.
  2. Accurate prosody prediction from text leads to more natural-sounding TTS. In this work, we employ a new set of features to predict ToBI pitch accent and phrase boundaries from text. We investigate a wide variety of text-based features, including many new syntactic features, several types of word embeddings, co-reference features, LIWC features, and specificity information. We focus our work on the Boston Radio News Corpus, a ToBI-labeled corpus of relatively clean news broadcasts, but also test our classifiers on Audix, a smaller corpus of read news, and on the Columbia Games Corpus, a corpus of conversational speech, in order to test the applicability of our model in cross-corpus settings. Our results show strong performance on both tasks, as well as some promising results for cross-corpus applications of our models. (A toy sketch of a feature-based classifier of this general kind appears after this list.)
  3. Many long-distance linguistic dependencies across domains can be modeled as tier-based strictly local (TSL) patterns (Graf 2022a). Such patterns are in principle efficiently learnable, but known algorithms require unrealistic conditions. Heuser et al. (2024) present an algorithm for learning syntactic islands which involves memorizing local bigrams along attested paths; no tiers are involved. I propose a method for inferring tier membership which augments the latter algorithm to produce a TSL grammar, and show that this model also derives a version of the Height-Locality Connection (Keine 2019). (A toy TSL well-formedness check appears after this list.)
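The second related abstract above describes predicting ToBI pitch accents and phrase boundaries from text-based features. As a rough illustration only, the sketch below shows what a feature-based accent classifier of that general kind can look like. The feature names, toy data, and logistic-regression choice are assumptions made here for illustration; they are not the features, corpora, or models used in that work.

```python
# Minimal sketch of a feature-based pitch-accent classifier, in the spirit of
# item 2 above. The feature names, toy data, and logistic-regression choice
# are illustrative assumptions, not the pipeline used in that paper.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical per-word feature dicts: POS tag, syntactic depth, word length,
# and a stand-in "specificity" score. Labels: 1 = pitch-accented, 0 = not.
train_feats = [
    {"pos": "NN", "syn_depth": 3, "word_len": 7, "specificity": 0.8},
    {"pos": "DT", "syn_depth": 4, "word_len": 3, "specificity": 0.1},
    {"pos": "VB", "syn_depth": 2, "word_len": 5, "specificity": 0.5},
    {"pos": "IN", "syn_depth": 5, "word_len": 2, "specificity": 0.1},
]
train_labels = [1, 0, 1, 0]

# DictVectorizer one-hot encodes categorical features (e.g. POS) and passes
# numeric features through; the classifier then predicts accent vs. no accent.
model = make_pipeline(DictVectorizer(sparse=False), LogisticRegression())
model.fit(train_feats, train_labels)

test_feats = [{"pos": "NN", "syn_depth": 2, "word_len": 9, "specificity": 0.9}]
print(model.predict(test_feats))  # e.g. [1] -> predicted pitch accent
```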
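The third related abstract above is framed in terms of tier-based strictly local (TSL) grammars. As a rough illustration, the sketch below implements the basic well-formedness check such a grammar defines: project a string onto its tier and verify that every adjacent pair of tier symbols is a licensed bigram. The tier alphabet, licensed bigrams, and toy strings are invented for illustration; this is not the learning algorithm from that abstract or the works it cites, only the check a learned TSL grammar would apply.

```python
# Minimal sketch of a tier-based strictly local (TSL-2) well-formedness check,
# as presupposed by item 3 above. The tier alphabet, licensed bigrams, and toy
# strings are invented for illustration.

# Word-edge marker, so bigrams can also constrain the first/last tier symbols.
EDGE = "#"

def project_tier(string, tier):
    """Keep only the symbols that belong to the tier, preserving order."""
    return [s for s in string if s in tier]

def is_tsl_licit(string, tier, licensed_bigrams):
    """A string is well formed iff every adjacent pair on its tier projection
    (with word edges added) is a licensed bigram."""
    proj = [EDGE] + project_tier(string, tier) + [EDGE]
    return all((a, b) in licensed_bigrams for a, b in zip(proj, proj[1:]))

# Toy long-distance pattern: at most one "wh" element per string.
tier = {"wh"}
licensed = {("#", "#"), ("#", "wh"), ("wh", "#")}  # no ("wh", "wh") bigram

print(is_tsl_licit(["wh", "x", "y", "x"], tier, licensed))   # True
print(is_tsl_licit(["wh", "x", "wh", "y"], tier, licensed))  # False: wh-wh on the tier
```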