
Search for: All records

Award ID contains: 2020969


  1. We investigated how gender is represented in children’s books using a novel 200,000 word corpus comprising 247 popular, contemporary books for young children. Using human judgments and word co-occurrence data, we quantified gender biases of words in individual books and in the whole corpus. We find that children’s books contain many words that adults judge as gendered. Semantic analyses based on co-occurrence data yielded word clusters related to gender stereotypes (e.g., feminine: emotions; masculine: tools). Co-occurrence data also indicate that many books instantiate gender stereotypes identified in other research (e.g., girls are better at reading and boys at math). Finally, we used large-scale data to estimate the gender distribution of the audience for individual books, and find that children are more often exposed to gender stereotypes for their own gender. Together the data suggest that children’s books may be an early source of gender associations and stereotypes.
    Free, publicly-accessible full text available January 1, 2023
  2. Free, publicly-accessible full text available November 11, 2022
  3. We propose a structured extension to bidirectional-context conditional language generation, or “infilling,” inspired by Frame Semantic theory (Fillmore, 1976). Guidance is provided through two approaches: (1) model fine-tuning, conditioning directly on observed symbolic frames, and (2) a novel extension to disjunctive lexically constrained decoding that leverages frame semantic lexical units. Automatic and human evaluations confirm that frame-guided generation allows for explicit manipulation of intended infill semantics, with minimal loss in distinguishability from human-generated text. Our methods flexibly apply to a variety of use scenarios, and we provide an interactive web demo
    Free, publicly-accessible full text available August 1, 2022
  4. How related is skin to a quilt or door to worry? Here, we show that linguistic experience strongly informs people’s judgments of such word pairs. We asked Chinese speakers, English speakers, and Chinese-English bilinguals to rate semantic and visual similarity between pairs of Chinese words and of their English translation equivalents. Some pairs were unrelated, others were also unrelated but shared a radical (e.g., “expert” and “dolphin” share the radical meaning “pig”), others also shared a radical which invokes a metaphorical relationship. For example, a quilt covers the body like skin; understand, with a sun radical, invokes understanding as illumination. Importantly, the shared radicals are not part of the pronounced word form. Chinese speakers rated word pairs with metaphorical connections as more similar than other pairs. English speakers did not, even though they were sensitive to shared radicals. Chinese-English bilinguals showed sensitivity to the metaphorical connections even when tested with English words.
    Free, publicly-accessible full text available July 1, 2022
  5. We asked whether categories expressed through lists of salient exemplars (e.g., car, truck, boat, etc.) convey the same meaning as categories expressed through conventional superordinate nouns (e.g., vehicles). We asked English speakers to list category members, with one group given superordinate labels like vehicles and the other group given only a list of salient exemplars. We found that the responses of the group given labels were more related, more typical, and less diverse than the responses of the group given exemplars. This result suggests that when people do not see a superordinate label, the categories that they infer are less well aligned across participants. In addition, categories inferred based on exemplars may be broader in general than categories given by superordinate labels.
    Free, publicly-accessible full text available July 1, 2022
  6. The study of mental representations of concepts has historically focused on the representations of the “average” person. Here, we shift away from this aggregate view and examine the principles of variability across people in conceptual representations. Using a database of millions of sketches by people worldwide, we ask what predicts whether people converge or diverge in their representations of a specific concept, and which kinds of concepts tend to be more or less variable. We find that larger and more dense populations tend to have less variable representations, and concepts high in valence and arousal tend to be less variable across people. Further, two countries tend to have people with more similar conceptual representations when they are linguistically, geographically, and culturally similar. Our work provides the first characterization of the principles of variability in shared meaning across a large, diverse sample of participants.
    Free, publicly-accessible full text available July 1, 2022
  7. Certain colors are strongly associated with certain adjectives (e.g., red is hot, blue is cold). Some of these associations are grounded in visual experiences like seeing hot embers glow red. Surprisingly, many congenitally blind people show similar color associations, despite lacking all visual experience of color. Presumably, they learn these associations via language. Can we detect these associations in the statistics of language? And if so, what form do they take? We apply a projection method to word embeddings trained on corpora of spoken and written text to identify color-adjective associations as they are represented in language. We show that these projections are predictive of color-adjective ratings collected from blind and sighted people, and that the effect size depends on the training corpus. Finally, we examine how color-adjective associations might be represented in language by training word embeddings on corpora from which various sources of color-semantic information are removed.
    Free, publicly-accessible full text available July 1, 2022
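
The abstract in item 7 mentions a projection method applied to word embeddings but does not spell out the procedure. As a rough illustration only, one common form of such a projection scores a word by its scalar projection onto the axis running between two adjective poles (here "cold" to "hot"). The three-dimensional vectors below are invented toy values, not embeddings from the paper, and the function name `project_onto_axis` is hypothetical.

```python
import numpy as np


def project_onto_axis(word_vec, pole_low, pole_high):
    """Scalar projection of word_vec onto the unit axis from pole_low to pole_high.

    Positive scores lean toward pole_high, negative scores toward pole_low.
    This is an assumed, generic formulation, not necessarily the paper's.
    """
    axis = pole_high - pole_low
    unit_axis = axis / np.linalg.norm(axis)
    return float(np.dot(word_vec, unit_axis))


# Toy embeddings (made up for illustration; real embeddings have hundreds of dimensions).
embeddings = {
    "cold": np.array([1.0, 0.0, 0.0]),
    "hot": np.array([0.0, 1.0, 0.0]),
    "red": np.array([0.2, 0.9, 0.1]),
    "blue": np.array([0.8, 0.1, 0.3]),
}

# Score each color word on the cold-to-hot axis.
scores = {
    color: project_onto_axis(embeddings[color], embeddings["cold"], embeddings["hot"])
    for color in ("red", "blue")
}

for color, score in scores.items():
    print(f"{color}: {score:+.3f}")
```

With these toy values, "red" scores on the "hot" side of the axis and "blue" on the "cold" side. In a study like the one described, such scores would then be compared against human color-adjective ratings.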