  1. We investigated how gender is represented in children’s books using a novel 200,000-word corpus comprising 247 popular, contemporary books for young children. Using human judgments and word co-occurrence data, we quantified gender biases of words in individual books and in the whole corpus. We find that children’s books contain many words that adults judge as gendered. Semantic analyses based on co-occurrence data yielded word clusters related to gender stereotypes (e.g., feminine: emotions; masculine: tools). Co-occurrence data also indicate that many books instantiate gender stereotypes identified in other research (e.g., girls are better at reading and boys at math). Finally, we used large-scale data to estimate the gender distribution of the audience for individual books, and find that children are more often exposed to gender stereotypes for their own gender. Together the data suggest that children’s books may be an early source of gender associations and stereotypes.
    Free, publicly-accessible full text available January 1, 2023
  2. We asked whether categories expressed through lists of salient exemplars (e.g., car, truck, boat, etc.) convey the same meaning as categories expressed through conventional superordinate nouns (e.g., vehicles). We asked English speakers to list category members, with one group given superordinate labels like vehicles and the other group given only a list of salient exemplars. We found that the responses of the group given labels were more related, more typical, and less diverse than the responses of the group given exemplars. This result suggests that when people do not see a superordinate label, the categories that they infer are less well aligned across participants. In addition, categories inferred based on exemplars may be broader in general than categories given by superordinate labels.
    Free, publicly-accessible full text available July 1, 2022
  3. How related is skin to a quilt or door to worry? Here, we show that linguistic experience strongly informs people’s judgments of such word pairs. We asked Chinese speakers, English speakers, and Chinese-English bilinguals to rate semantic and visual similarity between pairs of Chinese words and of their English translation equivalents. Some pairs were unrelated, others were also unrelated but shared a radical (e.g., “expert” and “dolphin” share the radical meaning “pig”), others also shared a radical that invokes a metaphorical relationship. For example, a quilt covers the body like skin; understand, with a sun radical, invokes understanding as illumination. Importantly, the shared radicals are not part of the pronounced word form. Chinese speakers rated word pairs with metaphorical connections as more similar than other pairs. English speakers did not, even though they were sensitive to shared radicals. Chinese-English bilinguals showed sensitivity to the metaphorical connections even when tested with English words.
    Free, publicly-accessible full text available July 1, 2022
  4. Certain colors are strongly associated with certain adjectives (e.g., red is hot, blue is cold). Some of these associations are grounded in visual experiences like seeing hot embers glow red. Surprisingly, many congenitally blind people show similar color associations, despite lacking all visual experience of color. Presumably, they learn these associations via language. Can we detect these associations in the statistics of language? And if so, what form do they take? We apply a projection method to word embeddings trained on corpora of spoken and written text to identify color-adjective associations as they are represented in language. We show that these projections are predictive of color-adjective ratings collected from blind and sighted people, and that the effect size depends on the training corpus. Finally, we examine how color-adjective associations might be represented in language by training word embeddings on corpora from which various sources of color-semantic information are removed.
    Free, publicly-accessible full text available July 1, 2022
  5. The study of mental representations of concepts has historically focused on the representations of the “average” person. Here, we shift away from this aggregate view and examine the principles of variability across people in conceptual representations. Using a database of millions of sketches by people worldwide, we ask what predicts whether people converge or diverge in their representations of a specific concept, and which kinds of concepts tend to be more or less variable. We find that larger and more dense populations tend to have less variable representations, and concepts high in valence and arousal tend to be less variable across people. Further, two countries tend to have people with more similar conceptual representations when they are linguistically, geographically, and culturally similar. Our work provides the first characterization of the principles of variability in shared meaning across a large, diverse sample of participants.
    Free, publicly-accessible full text available July 1, 2022
  6. Denison, S. ; Mack, M. ; Xu, Y. ; Armstrong, B.C. (Ed.)
    Do people perceive shapes to be similar based purely on their physical features? Or is visual similarity influenced by top-down knowledge? In the present studies, we demonstrate that top-down information – in the form of verbal labels that people associate with visual stimuli – predicts visual similarity as measured using subjective (Experiment 1) and objective (Experiment 2) tasks. In Experiment 1, shapes that were previously calibrated to be (putatively) perceptually equidistant were more likely to be grouped together if they shared a name. In Experiment 2, more nameable shapes were easier for participants to discriminate from other images, again controlling for their perceptual distance. We discuss what these results mean for constructing visual stimuli spaces that are perceptually uniform and discuss theoretical implications of the fact that perceptual similarity is sensitive to top-down information such as the ease with which an object can be named.
  7. We estimate lexical Concreteness for millions of words across 77 languages. Using a simple regression framework, we combine vector-based models of lexical semantics with experimental norms of Concreteness in English and Dutch. By applying techniques to align vector-based semantics across distinct languages, we compute and release Concreteness estimates at scale in numerous languages for which experimental norms are not currently available. This paper lays out the technique and its efficacy. Although this is a difficult dataset to evaluate immediately, Concreteness estimates computed from English correlate with Dutch experimental norms at ρ = .75 in the vocabulary at large, increasing to ρ = .8 among Nouns. Our predictions also recapitulate attested relationships with word frequency. The approach we describe can be readily applied to numerous lexical measures beyond Concreteness.
  8. Does the lexicon of a language have consequences for cognition? Here, we provide evidence that the ease with which category features can be named can influence category learning. Across two experiments, participants learned to distinguish images composed of colors (Experiment 1) and shapes (Experiment 2) that were either easy or more difficult to name in English. Holding the category structure constant, when the underlying features of the category were easy to name, participants were faster and more accurate in learning the novel category. We argue that these findings suggest that labels allow learners to form more compact hypotheses, which in turn can be confirmed or disconfirmed in the course of learning. These results have consequences for considering how cross-linguistic differences in lexical inventory affect how readily novel categories are learned.
  9. A foundational assumption of linguistic communication is that conversants have similar underlying concepts (Brennan & Clark, 1996; Wierzbicka, 2012). On this view, the ability of one person to understand another when she says “the tree” depends on the word activating the same concept in both people. One approach to verifying this assumption is to rely on definitions, but this reasoning is circular: how can we be sure the words in our definitions are the same? Here, we investigate the assumption of shared linguistic concepts by studying concepts represented in the visual modality (drawings) and examining predictors of their variability. Specifically, we ask whether people who are geographically closer and inhabit a similar linguistic environment produce more similar drawings.
  10. Many people report experiencing their thoughts in the form of natural language, i.e., they experience ‘inner speech’. At present, there exist few ways of quantifying this tendency, making it difficult to investigate whether the propensity to verbalize predicts objective cognitive function or whether it is merely epiphenomenal. We present a new instrument, the Internal Representation Questionnaire (IRQ), for quantifying the subjective format of internal thoughts. The primary goal of the IRQ is to assess whether people vary in their stated use of visual and verbal strategies in their internal representations. Exploratory analyses revealed four factors: propensity to form visual images, verbal images, a general mental manipulation factor, and an orthographic imagery factor. Here, we describe the properties of the IRQ and report an initial test of its predictive validity by relating it to a speeded picture/word verification task involving pictorial, written, and auditory verbal cues.
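
Item 4 above mentions applying a projection method to word embeddings to recover color-adjective associations. The abstract does not spell out the procedure, so the following is only a minimal sketch of one common form of such a projection, not the authors' implementation: an adjective axis is taken as the difference between two pole words (here "hot" minus "cold"), and each color word is scored by its scalar projection onto that axis. The vectors, the helper names (`TOY_VECTORS`, `project_onto_axis`), and the choice of poles are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional "embeddings", invented purely for illustration.
# Real embeddings would be trained on a corpus and have many more dimensions.
TOY_VECTORS = {
    "hot":  [0.9, 0.1, 0.2],
    "cold": [0.1, 0.9, 0.2],
    "red":  [0.8, 0.2, 0.5],
    "blue": [0.2, 0.7, 0.5],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_axis(word, pos_pole, neg_pole, vectors=TOY_VECTORS):
    """Scalar projection of `word` onto the axis from neg_pole to pos_pole.

    Positive scores lean toward pos_pole, negative scores toward neg_pole.
    (Hypothetical helper; the scoring rule is an assumption, not the paper's.)
    """
    axis = [p - n for p, n in zip(vectors[pos_pole], vectors[neg_pole])]
    axis_norm = math.sqrt(dot(axis, axis))
    return dot(vectors[word], axis) / axis_norm


red_score = project_onto_axis("red", "hot", "cold")
blue_score = project_onto_axis("blue", "hot", "cold")
print(f"red: {red_score:+.3f}  blue: {blue_score:+.3f}")
```

With these toy vectors, "red" receives a positive score and "blue" a negative one, which is the pattern the abstract describes for hot and cold. With real embeddings, scores of this kind could then be compared against human color-adjective ratings.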