Title: A tale of two Tagalogs
A well-received generalization in Tagalog is that only the argument that is cross-referenced by voice is eligible for A-bar extraction. However, recent work has shown that agents that are not cross-referenced by voice are also eligible. We provide naturally occurring data, along with experimental evidence, consistent with this more permissive picture. Further, we present computational evidence that participants were treating agent-extractions not cross-referenced by voice categorically, that is, they were either accepting or rejecting them in any given trial. Thus, we identify a piece of grammatical knowledge (i.e., extraction) that is systematic within an individual speaker but varies unpredictably across a population of Tagalog speakers. In other words, our data reveal two separable types of Tagalog speakers vis-à-vis extraction. We propose that this is a form of grammar competition that arises because the agent-first bias affects how child learners parse input strings under noisy conditions during acquisition.
Award ID(s):
2204112
PAR ID:
10556629
Author(s) / Creator(s):
;
Publisher / Repository:
Glossa
Date Published:
Journal Name:
Glossa: a journal of general linguistics
Volume:
9
Issue:
1
ISSN:
2397-1835
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Systemic inequity in biometrics systems based on racial and gender disparities has received a lot of attention recently. These disparities have been explored in existing biometrics systems such as facial biometrics (identifying individuals based on facial attributes). However, such ethical issues remain largely unexplored in voice biometric systems, which are very popular and extensively used globally. Using a corpus of non-speech voice records featuring a diverse group of 300 speakers by race (75 each from White, Black, Asian, and Latinx subgroups) and gender (150 each from female and male subgroups), we find that racial subgroups have similar voice characteristics, whereas gender subgroups have significantly different voice characteristics. Moreover, analyzing the performance of one commercial product and five research products reveals non-negligible racial and gender disparities in speaker identification accuracy. The average accuracy for Latinxs can be 12% lower than for Whites (p < 0.05, 95% CI 1.58%, 14.15%) and can be significantly higher for female speakers than for male speakers (3.67% higher, p < 0.05, 95% CI 1.23%, 11.57%). We further discover that racial disparities result primarily from the neural network-based feature extraction within the voice biometric product, whereas gender disparities result primarily from both inherent differences in voice characteristics and neural network-based feature extraction. Finally, we point out strategies (e.g., feature extraction optimization) to incorporate fairness and inclusiveness considerations into biometrics technology.
  2. This paper presents and analyzes antipassive constructions in the Mayan language Kaqchikel. Through various syntactic tests, we show that antipassive constructions differ from both active transitive and Agent Focus structures in that they do not syntactically project a DP-sized object. Thus, we should think of antipassives as a type of unergative. When an object seems to disappear or become less important in an antipassive, this is not a special feature of antipassives; it is simply what happens in any intransitive structure. In other words, the ‘suppression’ or ‘demotion’ of the thematic object is not an inherent characteristic of the construction but rather a byproduct of its intransitive nature. To better understand how transitive and intransitive constructions function cross-linguistically, we propose a novel framework for categorizing the functional heads v and Voice. We show that the external argument behaves differently in transitive versus intransitive clauses, appearing in different structural positions, a claim supported by evidence from causatives in Kaqchikel and scope patterns in other languages. While transitive and passive structures include a Voice projection, Agent Focus and antipassive structures do not. We compare our analysis to previous work on antipassives and explore what our findings might mean for understanding antipassives in other languages.
  3. Language in autism is heterogeneous, with a significant proportion of individuals having structural language difficulties and inclusion of language impairment as a specifier under Diagnostic and Statistical Manual of Mental Disorders (5th ed.) criteria for autism. This systematic review asked: What are the reporting patterns of variables pertaining to structural language in autism prior to and after publication of the Diagnostic and Statistical Manual of Mental Disorders (5th ed.)? What norm-referenced assessments does research use to characterize the language abilities of autistic individuals with respect to language impairment? This preregistered review (PROSPERO: CRD42021260394) followed Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Searches took place in September 2022 and included Linguistics and Language Behavior Abstracts, PsycINFO, PubMed, and the Directory of Open Access Journals. Search terms included three essential concepts: autism, language, and age. Two coders independently screened and evaluated articles. Searches yielded 57 qualifying studies, with mostly consistent reporting practices prior to and after the Diagnostic and Statistical Manual of Mental Disorders (5th ed.). Studies varied in how they defined language groups and in what norm-referenced measures they used. Interpreting research on structural language in autism requires attention to diagnostic and grouping criteria. Although inconsistency in reporting in original studies limited this review, better understanding the available information on structural language in autistic individuals aged 3–21 years may support identification of language needs.
     Lay abstract: Under the Diagnostic and Statistical Manual of Mental Disorders (5th ed.), language impairment can co-occur with autism. It is not yet clear how research defines, reports, and characterizes structural language abilities of autistic individuals eligible for school-based special education services (aged 3–21 years) in the United States. In the United States, students typically must be formally diagnosed to be eligible for services and supports. However, the quality of diagnosis is only as good as the research evidence on which diagnosis depends. To evaluate evidence quality, we examined how studies of school-aged autistic individuals report assessments of language ability. This systematic review included 57 studies using English language age-referenced assessments used to measure structural language. Findings showed many differences across studies in how language abilities were measured and reported. Also, none of the studies fully reported the variables relevant to characterizing language impairment. Outcomes were similar across versions of the Diagnostic and Statistical Manual of Mental Disorders. Findings indicate that researchers and clinicians should pay attention to reporting diagnostic and grouping criteria. Carefully interpreting research evidence is critical for ensuring that diagnostic criteria and supports are representative of and accessible to autistic individuals and relevant parties.
  4. When large-scale assessment programs are developed and administered in a particular language, students from other native language backgrounds may experience considerable barriers to appropriate measurement of the targeted knowledge and skills. Empirical work is needed to determine if one of the most commonly applied accommodations to address language barriers, namely extended test time limits, corresponds to score comparability for students who use it. Prior work has examined score comparability for English learners (ELs) eligible to use extended time on tests in the United States, but not specifically for those who show evidence of using the accommodation. NAEP process data were used to explore score comparability for two groups of ELs eligible for extended time: those who used extended time and those who did not. Analysis of differential item functioning (DIF) was applied to examine potential item bias for these groups when compared to a reference group of native English speakers. Items showing significant and large DIF were identified in both comparisons, with slightly more DIF items identified for the comparison involving ELs who used extended time. Item location and word counts were examined for those items displaying DIF, with results showing some alignment with the notion that language-related barriers may be present for ELs even when extended time is used. Overall, results point to a need for ongoing consideration of the unique needs of ELs during large-scale testing, and to the opportunities test process data offer for more comprehensive analyses of accommodation use and effectiveness.
  5. Extensive recent research has shown that it is surprisingly easy to infer Amazon Alexa voice commands from their network traffic data. To prevent these traffic analytics (TA)-based inference attacks, smart home owners are considering deploying virtual private networks (VPNs) to safeguard their smart speakers. In this work, we design a new machine learning-powered attack framework, VoiceAttack, that can still accurately fingerprint voice commands in VPN-encrypted voice speaker network traffic. We evaluate VoiceAttack under 5 different real-world settings using Amazon Alexa and Google Home. Our results show that VoiceAttack can correctly infer voice command sentences with a Matthews Correlation Coefficient (MCC) of 0.68 in a closed-world setting and infer voice command categories with an MCC of 0.84 in an open-world setting by eavesdropping on VPN-encrypted network traffic data. This presents a significant risk to user privacy and security, as it suggests that external on-path attackers could still intercept and decipher users’ voice commands despite the VPN encryption. We then examine the sensitivity of voice speaker commands to VoiceAttack and find that 134 voice speaker commands are highly vulnerable. We also present a defense approach, VoiceDefense, which can inject appropriate traffic “noise” into voice speaker traffic. Our evaluation results show that VoiceDefense can effectively mitigate VoiceAttack on Amazon Echo and Google Home.