skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Universal Features in Phonological Neighbor Networks
Human speech perception involves transforming a countinuous acoustic signal into discrete linguistically meaningful units (phonemes) while simultaneously causing a listener to activate words that are similar to the spoken utterance and to each other. The Neighborhood Activation Model posits that phonological neighbors (two forms [words] that differ by one phoneme) compete significantly for recognition as a spoken word is heard. This definition of phonological similarity can be extended to an entire corpus of forms to produce a phonological neighbor network (PNN). We study PNNs for five languages: English, Spanish, French, Dutch, and German. Consistent with previous work, we find that the PNNs share a consistent set of topological features. Using an approach that generates random lexicons with increasing levels of phonological realism, we show that even random forms with minimal relationship to any real language, combined with only the empirical distribution of language-specific phonological form lengths, are sufficient to produce the topological properties observed in the real language PNNs. The resulting pseudo-PNNs are insensitive to the level of lingustic realism in the random lexicons but quite sensitive to the shape of the form length distribution. We therefore conclude that “universal” features seen across multiple languages are really string universals, not language universals, and arise primarily due to limitations in the kinds of networks generated by the one-step neighbor definition. Taken together, our results indicate that caution is warranted when linking the dynamics of human spoken word recognition to the topological properties of PNNs, and that the investigation of alternative similarity metrics for phonological forms should be a priority.  more » « less
Award ID(s):
1735225
PAR ID:
10097489
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Entropy
Volume:
20
Issue:
7
ISSN:
1099-4300
Page Range / eLocation ID:
526
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The view that words are arbitrary is a foundational assumption about language, used to set human languages apart from nonhuman communication. We present here a study of the alignment between the semantic and phonological structure (systematicity) of American Sign Language (ASL), and for comparison, two spoken languages—English and Spanish. Across all three languages, words that are semantically related are more likely to be phonologically related, highlighting systematic alignment between word form and word meaning. Critically, there is a significant effect of iconicity (a perceived physical resemblance between word form and word meaning) on this alignment: words are most likely to be phonologically related when they are semantically related and iconic. This phenomenon is particularly widespread in ASL: half of the signs in the ASL lexicon areiconicallyrelated to other signs, i.e., there is a nonarbitrary relationship between form and meaning that is shared across signs. Taken together, the results reveal that iconicity can act as a driving force behind the alignment between the semantic and phonological structure of spoken and signed languages, but languages may differ in the extent that iconicity structures the lexicon. Theories of language must account for iconicity as a possible organizing principle of the lexicon. 
    more » « less
  2. null (Ed.)
    Indonesian language is heavily riddled with colloquialism whether in written or spoken forms. In this paper, we identify a class of Indonesian colloquial words that have undergone morphological transformations from their standard forms, categorize their word formations, and propose a benchmark dataset of Indonesian Colloquial Lexicons (IndoCollex) consisting of informal words on Twitter expertly annotated with their standard forms and their word formation types/tags. We evalu- ate several models for character-level transduction to perform morphological word normalization on this testbed to understand their failure cases and provide baselines for future work. As IndoCollex catalogues word formation phenomena that are also present in the non-standard text of other languages, it can also provide an attractive testbed for methods tailored for cross-lingual word normalization and non-standard word formation. 
    more » « less
  3. The research community generally accepts that signed and spoken languages contain both iconicity and arbitrariness. Iconicity's impact on statistical distributions of motivated forms throughout signed language lexicons is clear (e.g. Occhino, 2016). However, there has been little work to determine whether the iconic links between form and meaning are relevant only to a sign's initial formation, or if these links are stored as part of lexical representations. In the present study, 40 Deaf signers of American Sign Language were asked to rate two-handed signs in their citation form and in one-handed (reduced) forms. Twelve signs were highly iconic. For each of these highly iconic sign, a less iconic but phonologically similar sign of the same grammatical category was also chosen. Signs were presented in carrier sentences and in isolation. Participants preferred one-handed forms of the highly iconic signs over one-handed forms of their phonolgogically similar but less iconic counterparts. Thus, iconicity impacted the application of a synchronic phonological process. This finding suggests that lexical representations retain iconic form-meaning links and that these links are accessible to the phonological grammar. 
    more » « less
  4. null (Ed.)
    Non-native speakers show difficulties with spoken word processing. Many studies attribute these difficulties to imprecise phonological encoding of words in the lexical memory. We test an alternative hypothesis: that some of these difficulties can arise from the non-native speakers' phonetic perception. We train a computational model of phonetic learning, which has no access to phonology, on either one or two languages. We first show that the model exhibits predictable behaviors on phone-level and word-level discrimination tasks. We then test the model on a spoken word processing task, showing that phonology may not be necessary to explain some of the word processing effects observed in non-native speakers. We run an additional analysis of the model's lexical representation space, showing that the two training languages are not fully separated in that space, similarly to the languages of a bilingual human speaker. 
    more » « less
  5. abstract A growing body of research shows that both signed and spoken languages display regular patterns of iconicity in their vocabularies. We compared iconicity in the lexicons of American Sign Language (ASL) and English by combining previously collected ratings of ASL signs (Caselli, Sevcikova Sehyr, Cohen-Goldberg, & Emmorey, 2017) and English words (Winter, Perlman, Perry, & Lupyan, 2017) with the use of data-driven semantic vectors derived from English. Our analyses show that models of spoken language lexical semantics drawn from large text corpora can be useful for predicting the iconicity of signs as well as words. Compared to English, ASL has a greater number of regions of semantic space with concentrations of highly iconic vocabulary. There was an overall negative relationship between semantic density and the iconicity of both English words and ASL signs. This negative relationship disappeared for highly iconic signs, suggesting that iconic forms may be more easily discriminable in ASL than in English. Our findings contribute to an increasingly detailed picture of how iconicity is distributed across different languages. 
    more » « less