We provide a formal framework for analyzing syntactic island effects from a subregular perspective. Key aspects of the syntactic representation are encoded as strings where precedence represents containment. Island effects are then expressed as constraints on the shape of these strings. The constraints fit in the class IBSP (Interval-Based Strictly Piecewise), which has been previously explored in subregular phonology. Consequently, the characterization of islands in terms of IBSP string constraints not only provides a computational upper bound on the inventory of feasible island effects, but also establishes a surprising link between syntax and phonology.
This content will become publicly available on January 1, 2026
Automata for subregular syntax: Syntax with strings attached
Building on recent work in subregular syntax, we argue that syntactic constraints are best understood as operating not over trees, but rather strings that track structural relations such as dominance and c-command. Even constraints that seem intrinsically tied to trees (e.g. constraints on tree tiers) can be reduced to such strings. We define serial constraints as an abstraction that decomposes string constraints into a context function (which associates nodes with strings) and a requirement function (which enforces constraints on these strings). We provide a general procedure for implementing serial constraints as deterministic tree automata. The construction reveals that the many types of constraints found in subregular syntax are variants of the same computational template. Our findings open up a string-based perspective on syntactic constraints and provide a new, very general approach to the automata-theoretic study of subregular complexity.
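The decomposition of a serial constraint into a context function and a requirement function can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: trees are encoded as `(label, children)` pairs, the hypothetical context function `ancestor_contexts` maps each node to the string of labels on the path from the root to that node, and the hypothetical requirement `no_forbidden_bigram` bans certain adjacent label pairs in that string.

```python
from typing import Callable, Iterator, Set, Tuple

# Assumed encoding: a tree is (label, list of subtrees).
Tree = Tuple[str, list]
Context = Tuple[str, ...]


def ancestor_contexts(tree: Tree, prefix: Context = ()) -> Iterator[Tuple[str, Context]]:
    """Context function (illustrative): pair every node with the string
    of labels on the path from the root down to that node."""
    label, children = tree
    path = prefix + (label,)
    yield label, path
    for child in children:
        yield from ancestor_contexts(child, path)


def no_forbidden_bigram(forbidden: Set[Tuple[str, str]]) -> Callable[[Context], bool]:
    """Requirement function (illustrative): a string passes if none of
    its adjacent symbol pairs is in the forbidden set."""
    def requirement(string: Context) -> bool:
        return all(pair not in forbidden for pair in zip(string, string[1:]))
    return requirement


def satisfies(tree: Tree, context, requirement) -> bool:
    """A tree satisfies the constraint if the string assigned to every
    node passes the requirement."""
    return all(requirement(string) for _, string in context(tree))


# Toy constraint: a D node may not be immediately dominated by T.
requirement = no_forbidden_bigram({("T", "D")})
tree_ok = ("C", [("T", [("V", [("D", [])])])])
tree_bad = ("C", [("T", [("D", [])])])
print(satisfies(tree_ok, ancestor_contexts, requirement))
print(satisfies(tree_bad, ancestor_contexts, requirement))
```

Swapping in a different context function (for example, one based on c-command rather than dominance) or a different requirement changes the constraint while `satisfies` stays the same, which is the sense in which the two components are kept separate in this sketch.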
- Award ID(s):
- 1845344
- PAR ID:
- 10608500
- Publisher / Repository:
- University of Massachusetts Amherst Libraries
- Date Published:
- Journal Name:
- Proceedings of the Society for Computation in Linguistics
- Volume:
- 8
- Issue:
- 1
- ISSN:
- 2834-1007
- Subject(s) / Keyword(s):
- syntax, subregular complexity, string representations, tree automata, Minimalist grammars
- Format(s):
- Medium: X Other: application/pdf
- Right(s):
- Creative Commons Attribution 4.0
- Sponsoring Org:
- National Science Foundation
More Like this
-
Various aspects of syntax have recently been characterized in subregular terms. However, these characterizations operate over very different representations, including string encodings of c-command relations as well as tiers projected from derivation trees. We present a way to unify these approaches via sensing tree automata over Minimalist grammar dependency trees. Sensing tree automata are deterministic top-down tree automata that may inspect the labels of all daughter nodes before assigning them specific states. It is already known that these automata cannot correctly enforce all movement dependencies in Minimalist grammars, but we show that this result no longer holds if one takes into account several well-established empirical restrictions on movement. Sensing tree automata thus furnish a strong yet uniform upper bound on the complexity of syntactic dependencies.
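The idea of inspecting all daughter labels before assigning states can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's formal definition: the tree encoding, the transition table `delta` keyed on (mother state, mother label, daughter labels), the leaf acceptance set `leaf_ok`, and all state and label names are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed encoding: a tree is (label, tuple of daughter trees).
Tree = Tuple[str, tuple]
# Key: (state of mother, label of mother, labels of all daughters).
# Value: one state per daughter, assigned after seeing all daughter labels.
Delta = Dict[Tuple[str, str, Tuple[str, ...]], Tuple[str, ...]]


def accepts(tree: Tree, state: str, delta: Delta,
            leaf_ok: Set[Tuple[str, str]]) -> bool:
    """Run a deterministic top-down automaton that may look at the labels
    of all daughters before choosing their states."""
    label, daughters = tree
    if not daughters:
        # Assumed acceptance condition: (state, label) must be licensed.
        return (state, label) in leaf_ok
    key = (state, label, tuple(d[0] for d in daughters))
    if key not in delta:
        return False  # no transition defined: reject
    return all(accepts(d, q, delta, leaf_ok)
               for d, q in zip(daughters, delta[key]))


# Toy automaton: under f, the daughters must be labeled a and b, in that order.
delta = {("q0", "f", ("a", "b")): ("qa", "qb")}
leaf_ok = {("qa", "a"), ("qb", "b")}
print(accepts(("f", (("a", ()), ("b", ()))), "q0", delta, leaf_ok))
print(accepts(("f", (("b", ()), ("a", ()))), "q0", delta, leaf_ok))
```

Because the transition key contains the full tuple of daughter labels, the state given to one daughter can depend on the labels of its sisters, while the run still proceeds deterministically from the root downward.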
-
Lexical selection, functional hierarchies, and adjunct ordering are arguably distinct parts of syntax, yet are surprisingly similar in their computational properties. All three fall within the formal class strictly local (SL) and thus are maximally simple. Many phonological patterns are also SL, motivating a more detailed comparison. Towards this end, I develop a model based on command strings (Graf & Shafiei 2019) which allows syntactic and phonological grammars to be visualized using finite-state automata. Using this model, I show that the same basic patterns allowed within SL occur in both domains.
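A strictly local grammar can be sketched as a finite set of forbidden k-grams checked against a string padded with edge markers. The Python below is a minimal illustration; the function names, the edge markers `<` and `>`, and the toy functional hierarchy C > T > V are assumptions made for the example and are not drawn from the cited work.

```python
from typing import List, Sequence, Set, Tuple


def kgrams(symbols: Sequence[str], k: int) -> List[Tuple[str, ...]]:
    """Return all length-k windows of the string, padded with k-1 left
    edge markers '<' and k-1 right edge markers '>'."""
    padded = ["<"] * (k - 1) + list(symbols) + [">"] * (k - 1)
    return [tuple(padded[i:i + k]) for i in range(len(padded) - k + 1)]


def sl_accepts(symbols: Sequence[str],
               forbidden: Set[Tuple[str, ...]], k: int = 2) -> bool:
    """A string is well-formed iff none of its k-grams is forbidden."""
    return not any(gram in forbidden for gram in kgrams(symbols, k))


# Toy functional hierarchy C > T > V as an SL-2 grammar: the string must
# start with C, and C may not be followed directly by V.
forbidden = {("<", "T"), ("<", "V"), ("C", "V")}
print(sl_accepts(["C", "T", "V"], forbidden))
print(sl_accepts(["C", "V"], forbidden))
```

Nothing in the check refers to what the symbols stand for, so the same function applies whether the string is a sequence of syntactic heads or a sequence of phonological segments.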
-
Extending prior work in Graf (2018, 2020, 2022c), I show that movement is tier-based strictly local (TSL) even if one analyzes it as a transformation, i.e. a tree transduction from derivation trees to output trees. I define input strictly local (ISL) tree-to-tree transductions with (lexical) TSL tests as a tier-based extension of ISL tree-to-tree transductions. TSL tests allow us to attach each mover to all its landing sites. In general, this class of transductions fails to attach each mover to its final landing site to the exclusion of all its intermediate landing sites, which is crucial for producing output trees with the correct string yield. The problem is avoided, though, if syntax enforces a variant of the Ban on Improper Movement. Subregular complexity thus provides a novel motivation for core restrictions on movement while also shedding new light on the choice between copies and traces in syntax.
-
This paper studies the task of Relation Extraction (RE), which aims to identify the semantic relations between two entity mentions in text. In deep learning models for RE, it has been beneficial to incorporate the syntactic structures from the dependency trees of the input sentences. In such models, the dependency trees are often used to directly structure the network architectures or to obtain the dependency relations between the word pairs to inject the syntactic information into the models via multi-task learning. The major problem with these approaches is the lack of generalization beyond the syntactic structures in the training data or the failure to capture the syntactic importance of the words for RE. In order to overcome these issues, we propose a novel deep learning model for RE that uses the dependency trees to extract the syntax-based importance scores for the words, serving as a tree representation to introduce syntactic information into the models with greater generalization. In particular, we leverage Ordered-Neuron Long-Short Term Memory Networks (ON-LSTM) to infer the model-based importance scores for RE for every word in the sentences, which are then regulated to be consistent with the syntax-based scores to enable syntactic information injection. We perform extensive experiments to demonstrate the effectiveness of the proposed method, achieving state-of-the-art performance on three RE benchmark datasets.
