Title: Comparing Single-Word Insertions and Multi-Word Alternations in Bilingual Speech: Insights from Pupillometry
Prominent sociolinguistic theories of language mixing have posited that single-word insertions of one language into the other result from a process distinct from the one behind multi-word alternations between two languages, given that the former overwhelmingly surface morphosyntactically integrated into the surrounding language. To date, this distinction has not been tested in comprehension. The present study uses pupillometry to examine the online processing of single-word insertions and multi-word alternations by highly proficient Spanish-English bilinguals in Puerto Rico. Participants heard sentences containing target noun/adjective pairs (1) in unilingual Spanish, (2) where the Spanish noun was replaced with its English translation equivalent, followed by a Spanish post-nominal adjective, and (3) where both the noun and adjective appeared in English with the adjective occurring in the English pre-nominal position. Both types of language mixing elicited larger pupillary responses than unilingual Spanish speech, though the magnitude of this difference depended on the grammatical gender of the target noun. Importantly, single-word insertions and multi-word alternations did not differ from one another. Taken together, these findings suggest that morphosyntactic integration is not the defining feature of single-word insertions, at least in comprehension, and that the comprehension system is tuned to the distributional properties of bilingual speech.
Award ID(s):
1823634
NSF-PAR ID:
10412039
Journal Name:
Languages
Volume:
7
Issue:
4
ISSN:
2226-471X
Page Range / eLocation ID:
267
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate how to use pretrained static word embeddings to deliver improved estimates of bilexical co-occurrence probabilities: conditional probabilities of one word given a single other word in a specific relationship. Such probabilities play important roles in psycholinguistics, corpus linguistics, and usage-based cognitive modeling of language more generally. We propose a log-bilinear model taking pretrained vector representations of the two words as input, enabling generalization based on the distributional information contained in both vectors. We show that this model outperforms baselines in estimating probabilities of adjectives given nouns that they attributively modify, and probabilities of nominal direct objects given their head verbs, given limited training data in Arabic, English, Korean, and Spanish. 
  2. Abstract

    Previous work has shown that English native speakers interpret sentences as predicted by a noisy‐channel model: They integrate both the real‐world plausibility of the meaning (the prior) and the likelihood that the intended sentence may be corrupted into the perceived sentence. In this study, we test the noisy‐channel model in Mandarin Chinese, a language taxonomically different from English. We present native Mandarin speakers with sentences in a written modality (Experiment 1) and an auditory modality (Experiment 2) in three pairs of syntactic alternations. The critical materials are literally implausible but require differing numbers and types of edits in order to form more plausible sentences. Each sentence is followed by a comprehension question that allows us to infer whether the speakers interpreted the item literally or made an inference toward a more likely meaning. Similar to previous research on related English constructions, Mandarin participants made the most inferences for implausible materials that could be inferred as plausible by deleting a single morpheme or inserting a single morpheme. Participants were less likely to infer a plausible meaning for materials that could be inferred as plausible by making an exchange across a preposition. And participants were least likely to infer a plausible meaning for materials that could be inferred as plausible by making an exchange across a main verb. Moreover, we found more inferences in written materials than spoken materials, possibly a result of a lack of word boundaries in written Chinese. Overall, the fact that the results were so similar to those found in related constructions in English suggests that the noisy‐channel proposal is robust.

     
  3. Abstract

    Children show a remarkable degree of consistency in learning some words earlier than others. What patterns of word usage predict variations among words in age of acquisition? We use distributional analysis of a naturalistic corpus of child‐directed speech to create quantitative features representing natural variability in word contexts. We evaluate two sets of features: One set is generated from the distribution of words into frames defined by the two adjacent words. These features primarily encode syntactic aspects of word usage. The other set is generated from non‐adjacent co‐occurrences between words. These features encode complementary thematic aspects of word usage. Regression models using these distributional features to predict age of acquisition of 656 early‐acquired English words indicate that both types of features improve predictions over simpler models based on frequency and appearance in salient or simple utterance contexts. Syntactic features were stronger predictors of children's production than comprehension, whereas thematic features were stronger predictors of comprehension. Overall, earlier acquisition was predicted by features representing frames that select for nouns and verbs, and by thematic content related to food and face‐to‐face play topics; later acquisition was predicted by features representing frames that select for pronouns and question words, and by content related to narratives and object play.

     
  4. The present study examined the role of script in bilingual speech planning by comparing the performance of same-script and different-script bilinguals. Spanish-English bilinguals (Experiment 1) and Japanese-English bilinguals (Experiment 2) performed a picture-word interference task in which they were asked to name a picture of an object in English, their second language, while ignoring a visual distractor word in Spanish or Japanese, their first language. Results replicated the general pattern seen in previous bilingual picture-word interference studies for the same-script, Spanish-English bilinguals but not for the different-script, Japanese-English bilinguals. Both groups showed translation facilitation, whereas only Spanish-English bilinguals demonstrated semantic interference, phonological facilitation, and phono-translation facilitation. These results suggest that when the script of the language not in use is present in the task, bilinguals appear to exploit the perceptual difference as a language cue to direct lexical access to the intended language earlier in the process of speech planning.
  5. Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. In settings where unlabelled speech is the only available resource, such embeddings can be used in "zero-resource" speech search, indexing and discovery systems. Here we propose to train a single supervised embedding model on labelled data from multiple well-resourced languages and then apply it to unseen zero-resource languages. For this transfer learning approach, we consider two multilingual recurrent neural network models: a discriminative classifier trained on the joint vocabularies of all training languages, and a correspondence autoencoder trained to reconstruct word pairs. We test these using a word discrimination task on six target zero-resource languages. When trained on seven well-resourced languages, both models perform similarly and outperform unsupervised models trained on the zero-resource languages. With just a single training language, the second model works better, but performance depends more on the particular training-testing language pair.