NSF PAR Search | NSF Public Access Repository


Modeling morphosyntactic agreement as neural search: a case study of Hindi-Urdu

https://doi.org/10.7275/scil.2148

Zhou, Alan; Wilson, Colin (June 2024, Proceedings of the Society for Computation in Linguistics)

Agreement is central to the morphosyntax of many natural languages. Within contemporary linguistic theory, agreement relations have often been analyzed as the result of a structure-sensitive search operation. Neural language models, which lack an explicit bias for this type of operation, have shown mixed success at capturing morphosyntactic agreement phenomena. This paper develops an alternative neural model that formalizes the search operation in a fully differentiable way using gradient neural attention, and evaluates the model's ability to learn the complex agreement system of Hindi-Urdu from a large-scale dependency treebank and smaller synthetic datasets. We find that this model outperforms standard architectures at generalizing agreement patterns to held-out examples and structures.
Full Text Available
Acoustic correlates of the Javanese heavy vs. light distinction: A large-scale corpus study

Xu, S.C. Angela; Wilson, Colin (August 2023, Proceedings of the 20th International Congress of Phonetic Sciences)
Skarnitzl, Radek; Volín, Jan (Ed.)
According to the influential continuum model of phonation, only voiced segments can be specified as creaky or breathy. The present study investigated many possible phonetic correlates of the laryngeal contrast in Javanese word-initial prevocalic stop consonants, drawing upon a spoken corpus of more than 180,000 utterances. The results indicate that the laryngeal contrast is cued by voice onset time (VOT) and several acoustic-phonetic properties of the following vowel, including the first formant (F1) in addition to voice source measurements such as H1*-H2* and cepstral peak prominence (CPP). Taken together, these findings indicate that Javanese stops can be both voiceless and breathy, supporting a revision of the continuum model in which voicing and other aspects of phonation are decoupled.
Full Text Available
Learning morphology with inductive bias: Evidence from infixation

Wilson, C. (January 2022, Proceedings of the 46th annual Boston University Conference on Language Development)
Gong, Y.; Kpogo, F. (Ed.)
In acquiring morphology, the language learner faces the challenge of identifying both the form of morphemes and their location within words. For example, individuals acquiring Chamorro (Austronesian) must learn an agreement morpheme with the form -um- that is infixed before the first vowel of the stem (1a). This challenge is more difficult when a morpheme has multiple forms and/or locations: in some varieties of Chamorro, the same agreement morpheme appears as mu- prefixed on verbs beginning with a nasal/liquid consonant (1b). The learner could potentially overcome the acquisition challenge by employing strong inductive biases. This hypothesis is consistent with the typological finding that, across languages, morphemes occupy a restricted set of prosodically-defined locations (Yu, 2007) and there are strong correlations between morpheme form and position (Anderson, 1972). We conducted a series of artificial morphology experiments, modeled after the Chamorro pattern, that provide converging evidence for such inductive biases (Pierrehumbert & Nair, 1995; Staroverov & Finley, 2021).
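The two placements described in this abstract can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' experimental materials or model: the function name `affix`, the vowel and nasal/liquid sets, the fallback for vowelless stems, and the example stems are all assumptions made for the sketch.

```python
# Toy sketch of the affix placement described in the abstract.
# Assumptions (not from the source): the segment inventories below,
# the function name, and the fallback for stems with no vowel.
VOWELS = set("aeiou")
NASALS_LIQUIDS = set("mnlr")


def affix(stem: str, prefix_before_nasal_liquid: bool = False) -> str:
    """Attach the agreement morpheme to a stem.

    By default, infix -um- before the first vowel of the stem.
    If prefix_before_nasal_liquid is True, stems beginning with a
    nasal or liquid take the prefix mu- instead.
    """
    if prefix_before_nasal_liquid and stem[:1] in NASALS_LIQUIDS:
        return "mu" + stem
    for i, segment in enumerate(stem):
        if segment in VOWELS:
            return stem[:i] + "um" + stem[i:]
    # Assumed fallback: a stem with no vowel just gets um- at the left edge.
    return "um" + stem


if __name__ == "__main__":
    # Hypothetical stems, invented for illustration only.
    for stem in ["paka", "nisu", "aka"]:
        print(stem, affix(stem), affix(stem, prefix_before_nasal_liquid=True))
```

The flag separates the single-pattern variety from the variety with two forms and two locations, which is the harder learning problem the abstract refers to.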
Full Text Available
Deep neural networks easily learn unnatural infixation and reduplication patterns

https://doi.org/10.7275/kg38-sc40

Haley, Coleman; Wilson, Colin (February 2021, Proceedings of the Society for Computation in Linguistics)
Ettinger, Allyson; Pavlick, Ellie; Prickett, Brandon (Ed.)
Morphological patterns can involve simple concatenation of fixed strings (e.g., unkind, kindness) or ‘nonconcatenative’ processes such as infixation (e.g., Chamorro l-um-iʔeʔ ‘saw (actor-focus)’, Topping, 1973) and reduplication (e.g., Amele ba-bagawen ‘as he came out’, Roberts, 1987), among many others (e.g., Anderson, 1992; Inkelas, 2014). Recent work has established that deep neural networks are capable of inducing both concatenative and nonconcatenative patterns (e.g., Kann and Schütze, 2017; Nelson et al., 2020). In this paper, we verify that encoder-decoder networks can learn and generalize attested types of infixation and reduplication from modest training sets. We show further that the same networks readily learn many infixation and reduplication patterns that are unattested in natural languages, raising questions about their relationship to linguistic theory and viability as models of human learning.
Full Text Available
Static Harmonic Grammar: Constraint Conflict without Candidate Comparison

Wilson, Colin (January 2021, Proceedings of the 38th West Coast Conference on Formal Linguistics)
Soo, Rachel; Chow, Una Y.; Nederveen, Sander (Ed.)
In Harmonic Grammar and Optimality Theory, well-formed representations are those that optimally satisfy a set of violable constraints, as determined by candidate comparison under a given weighting or ranking. This paper develops variants of HG/OT in which conflict among violable constraints plays out locally, within each part of a representation, rather than through optimization over alternatives. Static HG/OT has a simple formal definition that has important precedents in classic and modern neural networks and that restricts the logical expressivity of constraint-interaction grammars. The static approach to constraint conflict is illustrated for local segmental phonology (the distribution of vowel height in Cochabamba Quechua) and unbounded feature spreading (nasal harmony as in Johore Malay). A Python implementation of the theory and several demonstrations are available at https://github.com/colincwilson/statgram.
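For orientation, the standard candidate comparison that this abstract contrasts with the static approach can be sketched as follows. This is a minimal illustration of conventional weighted-constraint Harmonic Grammar, not the static variant developed in the paper and not code from the statgram repository; the constraint names, weights, candidates, and function names are hypothetical.

```python
# Sketch of conventional Harmonic Grammar candidate comparison.
# Not the paper's static HG and not the statgram API; all names,
# weights, and candidates here are hypothetical illustrations.
from typing import Dict

Violations = Dict[str, int]


def harmony(violations: Violations, weights: Dict[str, float]) -> float:
    """Harmony is the negated weighted sum of constraint violations."""
    return -sum(weights[c] * n for c, n in violations.items())


def optimal(candidates: Dict[str, Violations], weights: Dict[str, float]) -> str:
    """Return the candidate with the highest harmony."""
    return max(candidates, key=lambda cand: harmony(candidates[cand], weights))


# Hypothetical toy grammar: a markedness constraint against codas and
# a faithfulness constraint against deletion.
WEIGHTS = {"NoCoda": 1.0, "Max": 2.0}
CANDIDATES = {
    "pat": {"NoCoda": 1, "Max": 0},  # keeps the coda consonant
    "pa": {"NoCoda": 0, "Max": 1},   # deletes the coda consonant
}

if __name__ == "__main__":
    for cand, viols in CANDIDATES.items():
        print(cand, harmony(viols, WEIGHTS))
    print("optimal:", optimal(CANDIDATES, WEIGHTS))
```

With these assumed weights the faithful candidate wins because deletion is penalized more heavily than the coda; the abstract's proposal replaces this global comparison over alternatives with conflict resolved locally within each part of a representation.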
Full Text Available
