Abstract Sign languages are human communication systems that are equivalent to spoken language in their capacity for information transfer, but which use a dynamic visual signal for communication. Thus, linguistic metrics of complexity, which are typically developed for linear, symbolic linguistic representation (such as written forms of spoken languages) do not translate easily into sign language analysis. A comparison of physical signal metrics, on the other hand, is complicated by the higher dimensionality (spatial and temporal) of the sign language signal as compared to a speech signal (solely temporal). Here, we review a variety of approaches to operationalizing sign language complexity based on linguistic and physical data, and identify the approaches that allow for high fidelity modeling of the data in the visual domain, while capturing linguistically-relevant features of the sign language signal.
more »
« less
Overlapping semantic representations of sign and speech in novice sign language learners.
- Award ID(s):
- 1822819
- PAR ID:
- 10380567
- Date Published:
- Journal Name:
- Proceedings of the Annual Meeting of the Cognitive Science Society.
- Volume:
- 44
- Page Range / eLocation ID:
- 3346-3353
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Deaf signers who wish to communicate in their native language frequently share videos on the Web. However, videos cannot preserve privacy—as is often desirable for discussion of sensitive topics—since both hands and face convey critical linguistic information and therefore cannot be obscured without degrading communication. Deaf signers have expressed interest in video anonymization that would preserve linguistic content. However, attempts to develop such technology have thus far shown limited success. We are developing a new method for such anonymization, with input from ASL signers. We modify a motion-based image animation model to generate high-resolution videos with the signer identity changed, but with preservation of linguistically significant motions and facial expressions. An asymmetric encoder-decoder structured image generator is used to generate the high-resolution target frame from the low-resolution source frame based on the optical flow and confidence map. We explicitly guide the model to attain clear generation of hands and face by using bounding boxes to improve the loss computation. FID and KID scores are used for evaluation of the realism of the generated frames. This technology shows great potential for practical applications to benefit deaf signers.more » « less
An official website of the United States government

