Drawing language maps is not normally considered an important part of linguists’ work. Nonetheless, language maps influence their users’ perceptions and understandings of the characteristics of the languages that they represent. Given their communicative power, wide accessibility, and generalized use for educational purposes, attention must therefore be paid to the messages that language maps convey about the languages they visualize, since different cartographic styles can be suited to representing some language ecologies better than others. However, decisions at this level are not normally made explicit by cartographers, and the ways in which certain ideologies surface in language maps can escape the attention of linguists and cartographers alike. This article clarifies why these issues are especially relevant in a domain such as the study of Bantoid languages and proposes some novel cartographic models that have been used for representing the languages of Lower Fungom in western Cameroon. These include cartographic strategies for representing the language ideologies of speaker communities and individual multilingualism. The latter is a key yet under-researched feature in Bantoid sociolinguistics, and the article suggests how scholars who are not sociolinguists may nevertheless contribute to its exploration.
Documentary linguists and risk communication: views from the virALLanguages project experience
Linguists are seldom, if ever, engaged in work aimed at communicating risk to the general public. The COVID-19 global pandemic and its associated infodemic may change this state of affairs, at least for documentary linguists. Documenting languages may bring researchers into direct contact with communities speaking minority or marginalized languages and allow them to gain key insights into those communities’ communicative ecologies. By being both immersed in local networks and more or less knowledgeable about the community’s communicative habits, documentary linguists appear to be placed in a unique position to contribute to communicating risk in ways that are better tailored to the community and, therefore, potentially quite effective locally. Furthermore, adding work in risk communication to their agenda may also stimulate documentary linguists to find new models for “giving back” to the communities they work with. To provide a concrete example of how all this may play out, we illustrate the virALLanguages project.
- Award ID(s): 1761639
- PAR ID: 10323578
- Date Published:
- Journal Name: Linguistics Vanguard
- Volume: 0
- Issue: 0
- ISSN: 2199-174X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Serikov, O.; Voloshina, E.; Postnikova, A.; Klyachko, E.; Neminova, E.; Vylomova, E.; Shavrina, T.; Le Ferrand, E.; Tyers, F. (Ed.) In recent times, there has been a growing number of research studies focused on addressing the challenges posed by low-resource languages and the transcription bottleneck phenomenon. This phenomenon has driven the development of speech recognition methods to transcribe regional and Indigenous languages automatically. Although there is much talk about bridging the gap between speech technologies and field linguistics, there is a lack of documented efficient communication between NLP experts and documentary linguists. The models created for low-resource languages often remain within the confines of computer science departments, while documentary linguistics remains attached to traditional transcription workflows. This paper presents the early stage of a collaboration between NLP experts and field linguists, resulting in the successful transcription of Kréyòl Gwadloupéyen using speech recognition technology.
- Commonly recommended methods for documenting endangered languages are built around the assumption that a given documentary project will focus on a single language rather than a multilingual ecology. This hinders the potential usability of documentary materials for the study of language contact. Research in domains such as ethnography and sociolinguistics has developed conceptual and analytical tools for understanding patterns of multilingual usage, but the insights of such work have yet to be translated into concrete recommendations for enhancements to documentary practice. This paper considers how standard documentary approaches can be adapted to multilingual contexts with respect to activities such as the collection of metadata, the use of ethnographic methods, and the recording and annotation of naturalistic multilingual discourse. A particular focus of the discussion is on ways in which documentary projects can create better records of multilingual practices even if these are not the focus of the work.
- Names for colors vary widely across languages, but color categories are remarkably consistent. Shared mechanisms of color perception help explain consistent partitions of visible light into discrete color vocabularies. But the mappings from colors to words are not identical across languages, which may reflect communicative needs—how often speakers must refer to objects of different color. Here we quantify the communicative needs of colors in 130 different languages by developing an inference algorithm for this problem. We find that communicative needs are not uniform: Some regions of color space exhibit 30-fold greater demand for communication than other regions. The regions of greatest demand correlate with the colors of salient objects, including ripe fruits in primate diets. Our analysis also reveals a hidden diversity in the communicative needs of colors across different languages, which is partly explained by differences in geographic location and the local biogeography of linguistic communities. Accounting for language-specific, nonuniform communicative needs improves predictions for how a language maps colors to words, and how these mappings vary across languages. Our account closes an important gap in the compression theory of color naming, while opening directions to study cross-cultural variation in the need to communicate different colors and its impact on the cultural evolution of color categories.
- The data and compute requirements of current language modeling technology pose challenges for the processing and analysis of low-resource languages. Declarative linguistic knowledge has the potential to partially bridge this data scarcity gap by providing models with useful inductive bias in the form of language-specific rules. In this paper, we propose a retrieval augmented generation (RAG) framework backed by a large language model (LLM) to correct the output of a smaller model for the linguistic task of morphological glossing. We leverage linguistic information to make up for the lack of data and trainable parameters, while allowing for inputs from written descriptive grammars interpreted and distilled through an LLM. The results demonstrate that significant leaps in performance and efficiency are possible with the right combination of: a) linguistic inputs in the form of grammars, b) the interpretive power of LLMs, and c) the trainability of smaller token classification networks. We show that a compact, RAG-supported model is highly effective in data-scarce settings, achieving a new state-of-the-art for this task and our target languages. Our work also offers documentary linguists a more reliable and more usable tool for morphological glossing by providing well-reasoned explanations and confidence scores for each output.

