Little is known about how transportation engineering practitioners engage with various Contextual Representations (CRs) to solve traffic engineering design problems. CRs such as equations, graphs, and tables could be perceived differently, even if they represent the same concept. The present study recognized left-turn treatment at signalized intersections as a prominent concept in traffic engineering practice and identified three associated CRs (a textbook equation, a graphical representation, and a stepwise flowchart) for designing a phasing plan. Two data collection mechanisms were employed concurrently: 1) eye-tracking to analyze visual attention and document problem-solving approaches and 2) reflective clinical interviews to analyze ways of thinking and document problem-solving rationales. Twenty-four transportation engineering practitioners completed the problem-solving experiment. The practitioners not only demonstrated preferences for different CRs but also gave different reasons for selecting the same CR. A Multivariate Analysis of Variance showed a statistically significant difference in visual attention based on CR. Additionally, in-vivo coding of participants’ interviews identified seven distinct rationales for CR selection. Findings from this study could be employed to modify transportation engineering curricula with optimized visual CRs.
Ecological diversity methods improve quantitative examination of student language in short constructed responses in STEM
We applied established ecology methods in a novel way to quantify and compare language diversity within a corpus of short written student texts. Constructed responses (CRs) are a common form of assessment but are difficult to evaluate using traditional methods of lexical diversity due to text length restrictions. Herein, we examined the utility of ecological diversity measures and ordination techniques for quantifying differences in short texts by applying these methods, in parallel with traditional text analysis methods, to a corpus of previously studied college student CRs. The CRs were collected at two time points (Timing), from three types of higher-ed institutions (Type), and across three levels of student understanding (Thinking). Based on previous work, we predicted that the largest differences would be associated with Thinking, followed by Timing, and we expected no differences based on Type, which allowed us to test the utility of these methods for categorical examination of the corpus. We found that the ecological diversity metrics that compare CRs to each other (Whittaker’s beta, species turnover, and Bray–Curtis Dissimilarity) were informative and correlated well with our predicted differences among categories and with other text analysis methods. Other ecological measures, including Shannon’s and Simpson’s diversity, measure the diversity of language within a single CR. Additionally, ordination provided meaningful visual representations of the corpus by reducing complex word frequency matrices to two-dimensional graphs. Using the ordination graphs, we were able to observe patterns in the CR corpus that further supported our predictions for the data set. This work establishes novel approaches to measuring language diversity within short texts that can be used to examine differences in student language and possible associations with categorical data.
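As a rough illustration of the metrics named in the abstract, the sketch below treats each distinct word in a response as a "species" and its count as an abundance. It is not the authors' pipeline: the whitespace tokenizer, the function names, the choice of the Gini-Simpson form (1 minus the sum of squared proportions), the gamma over mean alpha form of Whittaker's beta, and the example responses are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_counts(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return Counter(text.lower().split())


def shannon(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over word proportions in one text."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def simpson(counts):
    """Simpson diversity in the Gini-Simpson form, 1 - sum(p_i ** 2)."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two texts: 1 - 2 * shared / (total_a + total_b)."""
    shared = sum(min(a[w], b[w]) for w in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))


def whittaker_beta(count_list):
    """Whittaker's beta as gamma / mean alpha (distinct words overall / mean per text)."""
    gamma = len(set().union(*count_list))
    mean_alpha = sum(len(c) for c in count_list) / len(count_list)
    return gamma / mean_alpha


if __name__ == "__main__":
    # Hypothetical short responses, invented for this example only.
    responses = [
        "energy flows from the sun to the plant",
        "the plant stores energy as sugar",
        "cells use sugar for energy",
    ]
    counts = [word_counts(r) for r in responses]
    for r, c in zip(responses, counts):
        print(f"{r!r}: H={shannon(c):.3f}, Simpson={simpson(c):.3f}")
    print("Bray-Curtis (first vs second):", round(bray_curtis(counts[0], counts[1]), 3))
    print("Whittaker's beta:", round(whittaker_beta(counts), 3))
```

Shannon and Simpson are computed within a single response, while Bray-Curtis and Whittaker's beta compare responses to each other, mirroring the distinction drawn in the abstract. The functions assume non-empty texts; an empty response would cause a division by zero.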
- Award ID(s): 1660643
- PAR ID: 10409094
- Date Published:
- Journal Name: Frontiers in Education
- Volume: 8
- ISSN: 2504-284X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Euphemisms have not received much attention in natural language processing, despite being an important element of polite and figurative language. Euphemisms prove to be a difficult topic, not only because they are subject to language change, but also because humans may not agree on what is a euphemism and what is not. Nevertheless, the first step to tackling the issue is to collect and analyze examples of euphemisms. We present a corpus of potentially euphemistic terms (PETs) along with example texts from the GloWbE corpus. Additionally, we present a subcorpus of texts where these PETs are not being used euphemistically, which may be useful for future applications. We also discuss the results of multiple analyses run on the corpus. Firstly, we find that sentiment analysis on the euphemistic texts supports the claim that PETs generally decrease negative and offensive sentiment. Secondly, we observe cases of disagreement in an annotation task, where humans are asked to label PETs as euphemistic or not in a subset of our corpus text examples. We attribute the disagreement to a variety of potential reasons, including whether the PET was a commonly accepted term (CAT).
While data collection early in the Americanist tradition included texts as part of the Boasian triad, later developments in the generative tradition moved away from narratives. With a resurgence of attention to texts in both linguistic theory and language documentation, the literature on methodologies is growing (e.g., Chelliah 2001, Chafe 1980, Burton & Matthewson 2015). We outline our approach to collecting Chickasaw texts in what we call a ‘narrative bootcamp.’ Chickasaw is a severely threatened language and no longer in common daily use. Facilitating narrative collection with elder fluent speakers is an important goal, as is the cultivation of second language speakers and the training of linguists and tribal language professionals. Our bootcamps meet these goals. Moreover, we show many positive outcomes of this approach, including a positive sense of language use and ‘fun’ voiced by the elders, the corpus expansion that occurs by collecting and processing narratives onsite in the workshop, and field methods training for novices. Importantly, we find that the sparking of personal recollections facilitates the collection of heretofore unrecorded narrative genres in Chickasaw. This approach offers an especially fruitful way to build and expand a text corpus for small communities of highly endangered languages.
It is now a common practice to compare models of human language processing by comparing how well they predict behavioral and neural measures of processing difficulty, such as reading times, on corpora of rich naturalistic linguistic materials. However, many of these corpora, which are based on naturally-occurring text, do not contain many of the low-frequency syntactic constructions that are often required to distinguish between processing theories. Here we describe a new corpus consisting of English texts edited to contain many low-frequency syntactic constructions while still sounding fluent to native speakers. The corpus is annotated with hand-corrected Penn Treebank-style parse trees and includes self-paced reading time data and aligned audio recordings. We give an overview of the content of the corpus, review recent work using the corpus, and release the data.
St. Lawrence Island Yupik (ISO 639-3: ess) is an endangered polysynthetic language in the Inuit-Yupik language family indigenous to Alaska and Chukotka. This work presents a step-by-step pipeline for the digitization of written texts, and the first publicly available digital corpus for St. Lawrence Island Yupik, created using that pipeline. This corpus has great potential for future linguistic inquiry and research in NLP. It was also developed for use in Yupik language education and revitalization, with a primary goal of enabling easy access to Yupik texts by educators and by members of the Yupik community. A secondary goal is to support development of language technology such as spell-checkers, text-completion systems, interactive e-books, and language learning apps for use by the Yupik community.