Traditionally, many text-mining tasks treat individual word-tokens as the finest meaningful semantic granularity. However, in many languages and specialized corpora, words are composed by concatenating semantically meaningful subword structures. Word-level analysis cannot leverage the semantic information present in such subword structures. With regard to word embedding techniques, this leads not only to poor embeddings for infrequent words in long-tailed text corpora but also to weak capabilities for handling out-of-vocabulary words. In this paper, we propose MorphMine for unsupervised morpheme segmentation. MorphMine applies a parsimony criterion to hierarchically segment words into the fewest morphemes at each level of the hierarchy. This leads to longer shared morphemes at each level of segmentation. Experiments show that MorphMine segments words in a variety of languages into human-verified morphemes. Additionally, we experimentally demonstrate that utilizing MorphMine morphemes to enrich word embeddings consistently improves embedding quality on a variety of embedding evaluations and a downstream language modeling task.
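The parsimony criterion mentioned above can be illustrated with a small sketch. This is not MorphMine's actual algorithm: the abstract does not say how candidate morphemes are mined or how the hierarchy is built, so the code below simply assumes a fixed, hypothetical morpheme vocabulary (`VOCAB`) and uses dynamic programming to find a split with the fewest pieces at a single level. The function name `fewest_morphemes` is likewise an illustrative assumption.

```python
from typing import List, Optional, Set


def fewest_morphemes(word: str, vocab: Set[str]) -> Optional[List[str]]:
    """Split `word` into the fewest pieces drawn from `vocab`.

    Returns None when no segmentation over `vocab` exists.
    """
    n = len(word)
    # best[i] holds a minimal segmentation of word[:i], or None if none exists.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prev = best[start]
            piece = word[start:end]
            if prev is None or piece not in vocab:
                continue
            current = best[end]
            # Keep the candidate only if it uses strictly fewer pieces.
            if current is None or len(prev) + 1 < len(current):
                best[end] = prev + [piece]
    return best[n]


# Hypothetical vocabulary, chosen only for illustration.
VOCAB = {"un", "break", "able", "a", "b", "le"}

if __name__ == "__main__":
    print(fewest_morphemes("unbreakable", VOCAB))
```

With this toy vocabulary, the split into "un", "break", "able" is preferred over the longer split that ends in "a", "b", "le", which reflects the tendency toward fewer, longer pieces that the abstract describes.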
A meta-analytic review of morphological priming in Semitic languages
Abstract: Two types of discontinuous morphemes are thought to be the basic building blocks of words in Semitic languages: roots and templates. However, the role of these morphemes in lexical access and representation is debated. Priming experiments, where reaction times to target words are predicted to be faster when preceded by morphologically-related primes compared to unrelated control primes, provide conflicting evidence bearing on this debate. We used meta-analysis to synthesise the findings from 229 priming experiments on 4710 unique Semitic speakers. With Bayesian modelling of the aggregate effect sizes, we found credible root and template priming in both nouns and verbs in Arabic and Hebrew. Our results show that root priming effects can be distinguished from the effects of overlap in form and meaning. However, more experiments are needed to determine if template priming effects can be distinguished from overlap in form and morphosyntactic function.
- Award ID(s): 2214017
- PAR ID: 10531846
- Publisher / Repository: John Benjamins
- Date Published:
- Journal Name: The Mental Lexicon
- Volume: 18
- Issue: 2
- ISSN: 1871-1340
- Page Range / eLocation ID: 300 to 337
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We propose that emotional priming may be an effective approach to scaffold the creation of rich stories. There are relatively few emotion-based approaches that support users in creating, rather than consuming, rich stories. Emotional priming is the technique of using emotion-related stimuli to affect a person's executive control and affective processing. It has been researched mostly in terms of human behavior and decision making. We conducted a within-subjects study with 12 participants to investigate the effects of emotional priming induced through an interactive application on storytelling quality. Two conditions of priming were compared to a baseline condition of no priming. In the first condition, the application primes participants by asking them to perceive and recognize varying emotional stimuli (perception-based priming). In the second condition, the application primes participants by having them produce varying emotional facial expressions (production-based priming). Analyses show that emotional priming resulted in richer storytelling than no emotional priming, and that the production-based emotional priming condition resulted in statistically richer stories being told by participants. We discuss the possibility of integrating interactive emotional priming into storytelling applications.
-
Corina, David P. (Ed.) Letter recognition plays an important role in reading and follows different phases of processing, from early visual feature detection to the access of abstract letter representations. Deaf ASL–English bilinguals experience orthography in two forms: English letters and fingerspelling. However, the neurobiological nature of fingerspelling representations, and the relationship between the two orthographies, remains unexplored. We examined the temporal dynamics of single English letter and ASL fingerspelling font processing in an unmasked priming paradigm with centrally presented targets for 200 ms preceded by 100 ms primes. Event-related brain potentials were recorded while participants performed a probe detection task. Experiment 1 examined English letter-to-letter priming in deaf signers and hearing non-signers. We found that English letter recognition is similar for deaf and hearing readers, extending previous findings with hearing readers to unmasked presentations. Experiment 2 examined priming effects between English letters and ASL fingerspelling fonts in deaf signers only. We found that fingerspelling fonts primed both fingerspelling fonts and English letters, but English letters did not prime fingerspelling fonts, indicating a priming asymmetry between letters and fingerspelling fonts. We also found an N400-like priming effect when the primes were fingerspelling fonts which might reflect strategic access to the lexical names of letters. The studies suggest that deaf ASL–English bilinguals process English letters and ASL fingerspelling differently and that the two systems may have distinct neural representations. However, the fact that fingerspelling fonts can prime English letters suggests that the two orthographies may share abstract representations to some extent.
-
Gong, Y.; Kpogo, F. (Ed.) In acquiring morphology, the language learner faces the challenge of identifying both the form of morphemes and their location within words. For example, individuals acquiring Chamorro (Austronesian) must learn an agreement morpheme with the form -um- that is infixed before the first vowel of the stem (1a). This challenge is more difficult when a morpheme has multiple forms and/or locations: in some varieties of Chamorro, the same agreement morpheme appears as mu- prefixed on verbs beginning with a nasal/liquid consonant (1b). The learner could potentially overcome the acquisition challenge by employing strong inductive biases. This hypothesis is consistent with the typological finding that, across languages, morphemes occupy a restricted set of prosodically-defined locations (Yu, 2007) and there are strong correlations between morpheme form and position (Anderson, 1972). We conducted a series of artificial morphology experiments, modeled after the Chamorro pattern, that provide converging evidence for such inductive biases (Pierrehumbert & Nair, 1995; Staroverov & Finley, 2021).
-
We examined how phonological competition effects in spoken word recognition change with word length. Cohort effects (competition between words that overlap at onset) are strong and easily replicated. Rhyme effects (competition between words that mismatch at onset) are weaker, emerge later in the time course of spoken word recognition, and are more difficult to replicate. We conducted a simple experiment to examine cohort and rhyme competition using monosyllabic vs. bisyllabic words. Degree of competition was predicted by proportion of phonological overlap. Longer rhymes, with greater overlap in both number and proportion of shared phonemes, compete more strongly (e.g., kettle-medal [0.8 overlap] vs. cat-mat [0.67 overlap]). In contrast, long and short cohort pairs constrained to have constant (2-phoneme) overlap vary in proportion of overlap. Longer cohort pairs (e.g., camera-candle) have lower proportion of overlap (in this example, 0.33) than shorter cohorts (e.g., cat-can, with 0.67 overlap) and compete more weakly. This finding has methodological implications (rhyme effects are less likely to be observed with shorter words, while cohort effects are diminished for longer words), but also theoretical implications: degree of competition is not a simple function of overlapping phonemes; degree of competition is conditioned on proportion of overlap. Simulations with TRACE help explicate how this result might emerge.
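The overlap proportions quoted above can be reproduced with a short calculation. The sketch below rests on two assumptions that the abstract does not state: the phoneme transcriptions are illustrative guesses, and the proportion is taken as the number of shared phonemes divided by the length of the longer word. The function names are hypothetical.

```python
from typing import Sequence


def shared_onset(a: Sequence[str], b: Sequence[str]) -> int:
    """Count phonemes that match from the start of both words (cohort overlap)."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def shared_offset(a: Sequence[str], b: Sequence[str]) -> int:
    """Count phonemes that match from the end of both words (rhyme overlap)."""
    return shared_onset(list(reversed(a)), list(reversed(b)))


def proportion(shared: int, a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed definition: shared phonemes over the longer word's length."""
    return round(shared / max(len(a), len(b)), 2)


# Assumed transcriptions, for illustration only.
KETTLE = ["k", "ɛ", "ɾ", "ə", "l"]
MEDAL = ["m", "ɛ", "ɾ", "ə", "l"]
CAT = ["k", "æ", "t"]
MAT = ["m", "æ", "t"]
CAN = ["k", "æ", "n"]
CAMERA = ["k", "æ", "m", "ə", "r", "ə"]
CANDLE = ["k", "æ", "n", "d", "ə", "l"]

if __name__ == "__main__":
    print(proportion(shared_offset(KETTLE, MEDAL), KETTLE, MEDAL))   # rhyme, long
    print(proportion(shared_offset(CAT, MAT), CAT, MAT))             # rhyme, short
    print(proportion(shared_onset(CAMERA, CANDLE), CAMERA, CANDLE))  # cohort, long
    print(proportion(shared_onset(CAT, CAN), CAT, CAN))              # cohort, short
```

Under these assumptions, the four pairs give 0.8, 0.67, 0.33, and 0.67, matching the values in the abstract: the rhyme proportion rises with word length, while the cohort proportion falls when the shared onset is held at two phonemes.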

