Title: Conceptual categories scaffold verbal semantic structure: A cross-cultural study of child homesign
In language evolution, the formation of conceptual categories preceded the formation of linguistic semantic categories (Hurford, 2007). The mapping from concepts to semantics is non-isomorphic, however, as particular languages categorize conceptual space in divergent ways (e.g., English put in is acceptable for both tight-fit and loose-fit relations, while Korean kkita encodes tight-fit relations only; Choi & Bowerman, 1991). Despite this variation, are there crosslinguistic patterns in how words lexicalize conceptual space? We address this question by analyzing how child homesigners from four different cultures describe instrumental events (e.g., cutting bread with a knife). Homesigners are congenitally deaf individuals who have not been taught a signed language. Despite growing up without structured linguistic input, these individuals use a gestural system ("homesign") to communicate (Goldin-Meadow, 2003). We find that homesign descriptions of instrumental events reflect categories present in adult English, Spanish, and Mandarin, suggesting crosslinguistic biases in how verbs encode the conceptual space of events, biases which may have been present over the course of language evolution.
Award ID(s): 1654154
NSF-PAR ID: 10089630
Journal Name: Proceedings of the 12th International Conference on the Evolution of Language
Page Range / eLocation ID: 405
Sponsoring Org: National Science Foundation
More Like this
  1. Perlman, Marcus (Ed.)
    Longstanding cross-linguistic work on event representations in spoken languages has argued for a robust mapping between an event's underlying representation and its syntactic encoding, such that, for example, the agent of an event is most frequently mapped to subject position. In the same vein, sign languages have long been claimed to construct signs that visually represent their meaning, i.e., signs that are iconic. Experimental research on linguistic parameters such as plurality and aspect has recently shown some of them to be visually universal in sign, i.e., recognized by non-signers as well as signers, and has identified specific visual cues that achieve this mapping. However, little is known about what makes action representations in sign language iconic, or whether and how the mapping of underlying event representations to syntactic encoding is visually apparent in the form of a verb sign. To this end, we asked what visual cues non-signers may use in evaluating transitivity (i.e., the number of entities involved in an action). We correlated non-signer judgments about the transitivity of verb signs from American Sign Language (ASL) with phonological characteristics of those signs. We found that non-signers did not accurately guess the transitivity of the signs, but that non-signer transitivity judgments can nevertheless be predicted from the signs' visual characteristics. Further, non-signers attend to just those features that code event representations across sign languages, despite interpreting them differently. This suggests the existence of visual biases that underlie the detection of linguistic categories such as transitivity, biases which may uncouple from underlying conceptual representations over time in mature sign languages due to lexicalization processes.
  2. Logical properties such as negation, implication, and symmetry, despite the fact that they are foundational and threaded through the vocabulary and syntax of known natural languages, pose a special problem for language learning. Their meanings are much harder to identify and isolate in the child's everyday interaction with the world than those of concrete things (like spoons and horses) and of happenings and acts (like running and jumping), which are much more easily identified and thus more easily linked to their linguistic labels (spoon, horse, run, jump). Here we concentrate attention on the category of symmetry [a relation R is symmetrical if and only if (iff) for all x, y: if R(x, y), then R(y, x)], expressed in English by such terms as similar, marry, cousin, and near. After a brief introduction to how symmetry is expressed in English and other well-studied languages, we discuss the appearance and maturation of this category in Nicaraguan Sign Language (NSL). NSL is an emerging language used as the primary, daily means of communication among a population of deaf individuals who could not acquire the surrounding spoken language because they could not hear it, and who were not exposed to a preexisting sign language because there was none available in their community. Remarkably, these individuals treat symmetry, in both semantic and syntactic regards, much as do learners exposed to a previously established language. These findings point to deep human biases in the structures underpinning and constituting human language.
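The formal definition above [R is symmetrical iff for all x, y: if R(x, y), then R(y, x)] can be checked mechanically when a relation is given as a finite set of pairs. The following minimal Python sketch is an illustration only (the function name and example relations are hypothetical, not from the study):

```python
def is_symmetric(relation):
    """A relation R, given as a set of (x, y) pairs, is symmetric
    iff for every pair (x, y) in R, the pair (y, x) is also in R."""
    return all((y, x) in relation for (x, y) in relation)

# "near" behaves symmetrically: if a is near b, then b is near a.
near = {("a", "b"), ("b", "a"), ("c", "c")}

# "parent_of" does not: alice is a parent of bob, but not vice versa.
parent_of = {("alice", "bob")}

print(is_symmetric(near))       # True
print(is_symmetric(parent_of))  # False
```

English terms like similar, marry, cousin, and near denote relations of the first kind, which is what makes their meanings hard to isolate from individual referents.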
  3. The PoseASL dataset consists of color and depth videos collected from ASL signers at the Linguistic and Assistive Technologies Laboratory under the direction of Matt Huenerfauth, as part of a collaborative research project with researchers at the Rochester Institute of Technology, Boston University, and the University of Pennsylvania. Access: After becoming an authorized user of Databrary, please contact Matt Huenerfauth if you have difficulty accessing this volume. We have collected a new dataset consisting of color and depth videos of fluent American Sign Language signers performing sequences of ASL signs and sentences. Given interest among sign-recognition and other computer-vision researchers in red-green-blue-depth (RGBD) video, we release this dataset for use by the research community. In addition to the video files, we share depth data files from a Kinect v2 sensor, as well as additional motion-tracking files produced through post-processing of these data. Organization of the Dataset: The dataset is organized into sub-folders with codenames such as "P01" or "P16". These codenames refer to the specific human signers recorded in this dataset. Please note that there is no participant P11 or P14; those numbers were accidentally skipped while scheduling appointments to collect video stimuli. Task: During the recording session, the participant was met by a member of our research team who was a native ASL signer. No other individuals were present during the data-collection session. After signing the informed consent and video release document, participants responded to a demographic questionnaire. The data-collection session then consisted of English word stimuli and cartoon videos. The recording session began with slides displaying English words and photos of items, and participants were asked to produce the sign for each (PDF included in the materials subfolder).
Next, participants viewed three short animated cartoons, which they were asked to recount in ASL: - Canary Row, Warner Brothers Merrie Melodies 1950 (the 7-minute video divided into seven parts) - Mr. Koumal Flies Like a Bird, Studio Animovaneho Filmu 1969 - Mr. Koumal Battles his Conscience, Studio Animovaneho Filmu 1971. The word list and cartoons were selected because they are identical to the stimuli used in the collection of the Nicaraguan Sign Language video corpora; see: Senghas, A. (1995). Children's Contribution to the Birth of Nicaraguan Sign Language. Doctoral dissertation, Department of Brain and Cognitive Sciences, MIT. Demographics: All 14 of our participants were fluent ASL signers. As screening, we asked: Did you use ASL at home growing up, or did you attend a school as a very young child where you used ASL? All participants responded affirmatively. A total of 14 DHH participants were recruited on the Rochester Institute of Technology campus: 7 men and 7 women, aged 21 to 35 (median = 23.5). All participants reported that they began using ASL at age 5 or younger, with 8 reporting ASL use since birth and 3 others reporting ASL use since age 18 months. Filetypes: *.avi, *_dep.bin: The PoseASL dataset was captured using a Kinect 2.0 RGBD camera. The output of this camera system includes multiple channels: RGB, depth, skeleton joints (25 joints for every video frame), and HD face (1,347 points). The video resolution is 1920 x 1080 pixels for the RGB channel and 512 x 424 pixels for the depth channel. Due to limitations on the acceptable filetypes for sharing on Databrary, we were not permitted to share the binary *_dep.bin files produced by the Kinect v2 camera system directly on the Databrary platform. If your research requires the original binary *_dep.bin files, please contact Matt Huenerfauth.
*_face.txt, *_HDface.txt, *_skl.txt: To make it easier for future researchers to use this dataset, we have also performed some post-processing of the Kinect data. To extract skeleton coordinates from the RGB videos, we used the Openpose system, which can detect body, hand, facial, and foot keypoints of multiple people in single images in real time. The output of Openpose includes an estimate of 70 keypoints for the face, including eyes, eyebrows, nose, mouth, and face contour. The software also estimates 21 keypoints for each hand (Simon et al., 2017), including 3 keypoints for each finger, as shown in Figure 2. Additionally, 25 keypoints are estimated for the body pose and feet (Cao et al., 2017; Wei et al., 2016). Reporting Bugs or Errors: Please contact Matt Huenerfauth to report any bugs or errors that you identify in the corpus. We appreciate your help in improving the quality of the corpus over time. Acknowledgement: This material is based upon work supported by the National Science Foundation under award 1749376: "Collaborative Research: Multimethod Investigation of Articulatory and Perceptual Constraints on Natural Language Evolution."
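For readers working with the post-processed files, the keypoint counts stated above (70 face, 21 per hand, 25 body) can serve as a sanity check on parsed frames. The sketch below is a hypothetical illustration: the frame layout (a dict mapping part names to lists of (x, y, confidence) triples) is assumed for the example and is not the actual on-disk format of the *_skl.txt files.

```python
# Openpose keypoint counts as described in the dataset documentation.
# The part names used as dict keys here are hypothetical.
OPENPOSE_COUNTS = {"face": 70, "hand_left": 21, "hand_right": 21, "body": 25}

def validate_frame(frame):
    """Check that a parsed pose frame matches the expected keypoint counts.

    `frame` is assumed to be a dict mapping a body-part name to a list of
    (x, y, confidence) triples. Raises ValueError on a count mismatch.
    """
    for part, expected in OPENPOSE_COUNTS.items():
        got = len(frame.get(part, []))
        if got != expected:
            raise ValueError(f"{part}: expected {expected} keypoints, got {got}")
    return True

# Example: a dummy frame with the correct counts passes validation.
dummy = {part: [(0.0, 0.0, 1.0)] * n for part, n in OPENPOSE_COUNTS.items()}
validate_frame(dummy)
```

A check like this can catch truncated or misparsed frames early, before they propagate into downstream sign-recognition pipelines.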
  4.
    Interest in physical therapy and individual exercises such as yoga/dance has increased alongside the well-being trend, and people globally enjoy such exercises at home/office via video streaming platforms. However, such exercises are hard to follow without expert guidance. Even if experts can help, it is almost impossible to give personalized feedback to every trainee remotely. Thus, automated pose correction systems are required more than ever, and we introduce a new captioning dataset named FixMyPose to address this need. We collect natural language descriptions of correcting a “current” pose to look like a “target” pose. To support a multilingual setup, we collect descriptions in both English and Hindi. The collected descriptions have interesting linguistic properties such as egocentric relations to the environment objects, analogous references, etc., requiring an understanding of spatial relations and commonsense knowledge about postures. Further, to avoid ML biases, we maintain a balance across characters with diverse demographics, who perform a variety of movements in several interior environments (e.g., homes, offices). From our FixMyPose dataset, we introduce two tasks: the pose-correctional-captioning task and its reverse, the target-pose-retrieval task. During the correctional-captioning task, models must generate the descriptions of how to move from the current to the target pose image, whereas in the retrieval task, models should select the correct target pose given the initial pose and the correctional description. We present strong cross-attention baseline models (uni/multimodal, RL, multilingual) and also show that our baselines are competitive with other models when evaluated on other image-difference datasets. 
We also propose new task-specific metrics (object-match, body-part-match, direction-match) and conduct human evaluation for more reliable assessment; we demonstrate a large human-model performance gap, suggesting room for promising future work. Finally, to verify the sim-to-real transfer of our FixMyPose dataset, we collect a set of real images and show promising performance on them. Data and code are available: https://fixmypose-unc.github.io.
  5. Sandler, W. ; Aronoff, M. ; Padden, C. (Ed.)
    Variation in the linguistic use of handshapes exists across sign languages, but it is unclear how these iconic handshape preferences arise and become conventionalized. In order to understand the factors that shape such handshape preferences in the earliest stages of language emergence, we examined communication within family homesign systems. Homesigners are deaf individuals who have not acquired a signed or spoken language and who innovate unique gesture systems to communicate with hearing friends and family ("communication partners"). We analyzed how characteristics of participants and stimulus items influence handshape preferences and conventionalization. Participants included 11 deaf homesigners, 24 hearing communication partners (CPs), and 8 hearing non-signing adults from Nicaragua. Participants were asked to label items using gestures or signs, and the handshape type (Handling, Object, or combined Handling+Object) was then coded. Participants and groups varied in their iconic handshape preferences. Adult homesigners' families demonstrated more conventionalization than did child homesigners' families, and adult homesigners used a combined Handling+Object form more than other participants. Younger CPs and those with fewer years of experience using a homesign system showed greater conventionalization. Items that elicited a reliable handshape preference were more likely to elicit Handling rather than Object handshapes. These findings suggest that similarity in handshape type varies even within families, including hearing gesturers in the same culture. Although adult homesigners' families were more conventionalized than child homesigners' families, full conventionalization of these handshape preferences does not seem to appear reliably within two to three decades of use in a family when only one deaf homesigner uses the system as a primary means of communication.