Title: Intensional Gaps: Relating veridicality, factivity, doxasticity, bouleticity, and neg-raising
We investigate which patterns of lexically triggered doxastic, bouletic, neg(ation)-raising, and veridicality inferences are (un)attested across clause-embedding verbs in English. To carry out this investigation, we use a multiview mixed effects mixture model to discover the inference patterns captured in three lexicon-scale inference judgment datasets: two existing datasets, MegaVeridicality and MegaNegRaising, which capture veridicality and neg-raising inferences across a wide swath of the English clause-embedding lexicon, and a new dataset, MegaIntensionality, which similarly captures doxastic and bouletic inferences. We focus in particular on inference patterns that are correlated with morphosyntactic distribution, as determined by how well those patterns predict the acceptability judgments in the MegaAcceptability dataset. We find that there are 15 such patterns attested. Similarities among these patterns suggest the possibility of underlying lexical semantic components that give rise to them. We use principal component analysis to discover these components and suggest generalizations that can be derived from them.
Award ID(s):
1748969
NSF-PAR ID:
10333437
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Semantics and Linguistic Theory
Volume:
31
ISSN:
2163-5951
Page Range / eLocation ID:
570-605
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate which patterns of lexically triggered doxastic, bouletic, neg(ation)-raising, and veridicality inferences are (un)attested across clause-embedding verbs in English. To carry out this investigation, we use a multiview mixed effects mixture model to discover the inference patterns captured in three lexicon-scale inference judgment datasets: two existing datasets, MegaVeridicality and MegaNegRaising, which capture veridicality and neg-raising inferences across a wide swath of the English clause-embedding lexicon, and a new dataset, MegaIntensionality, which similarly captures doxastic and bouletic inferences. We focus in particular on inference patterns that are correlated with morphosyntactic distribution, as determined by how well those patterns predict the acceptability judgments in the MegaAcceptability dataset. We find that there are 15 such patterns attested. Similarities among these patterns suggest the possibility of underlying lexical semantic components that give rise to them. We use principal component analysis to discover these components and suggest generalizations that can be derived from them.
  2. We investigate neg(ation)-raising inferences, wherein negation on a predicate can be interpreted as though in that predicate’s subordinate clause. To do this, we collect a large-scale dataset of neg-raising judgments for effectively all English clause-embedding verbs and develop a model to jointly induce the semantic types of verbs and their subordinate clauses and the relationship of these types to neg-raising inferences. We find that some neg-raising inferences are attributable to properties of particular predicates, while others are attributable to subordinate clause structure.
  3. We investigate neural models’ ability to capture lexicosyntactic inferences: inferences triggered by the interaction of lexical and syntactic information. We take the task of event factuality prediction as a case study and build a factuality judgment dataset for all English clause-embedding verbs in various syntactic contexts. We use this dataset, which we make publicly available, to probe the behavior of current state-of-the-art neural systems, showing that these systems make certain systematic errors that are clearly visible through the lens of factuality prediction. 
  4. Background: Understanding how study design and monitoring strategies shape inference within, and synthesis across, studies is critical across biological disciplines. Many biological and field studies are short term and limited in scope. Monitoring studies are critical for informing public health about potential vectors of concern, such as Ixodes scapularis (black-legged ticks). Black-legged ticks are a taxon of ecological and human health concern due to their status as primary vectors of Borrelia burgdorferi, the bacterium that causes Lyme disease. However, variation in black-legged tick monitoring, and gaps in data, are currently considered major barriers to understanding population trends and, in turn, predicting Lyme disease risk. To understand how variable methodology in black-legged tick studies may influence which population patterns researchers find, we conducted a data synthesis experiment. Materials and Methods: We searched for publicly available black-legged tick abundance datasets that had at least 9 years of data, using keywords about ticks in internet search engines, literature databases, data repositories and public health websites. Our analysis included 289 datasets from seven surveys from locations in the US, ranging in length from 9 to 24 years. We used a moving window analysis, a non-random resampling approach, to investigate the temporal stability of black-legged tick population trajectories across the US. We then used t-tests to assess differences in stability time across different study parameters. Results: All of our sampled datasets required 4 or more years to reach stability. We also found that several study factors affect the likelihood that a study reaches stability and the likelihood that its data lead to misleading results if it does not. Specifically, datasets collected via dragging reached stability significantly faster than data collected via opportunistic sampling. Datasets that sampled larvae reached stability significantly later than those that sampled adults or nymphs. Additionally, datasets collected at the broadest spatial scale (county) reached stability fastest. Conclusion: We used 289 datasets from seven long-term black-legged tick studies to conduct a non-random data resampling experiment, revealing that sampling design does shape inferences in black-legged tick population trajectories and how many years it takes to find stable patterns. Specifically, our results show the importance of study length, sampling technique, life stage, and geographic scope in understanding black-legged tick populations, in the absence of standardized surveillance methods. Current public health efforts based on existing black-legged tick datasets must take monitoring study parameters into account, to better understand if and how to use monitoring data to inform decision-making. We also advocate that potential future forecasting initiatives consider these parameters when projecting future black-legged tick population trends.
  5. The world is facing a crisis of language loss that rivals, or exceeds, the rate of loss of biodiversity. There is an increasing urgency to understand the drivers of language change in order to try and stem the catastrophic rate of language loss globally and to improve language vitality. Here we present a unique case study of language shift in an endangered Indigenous language, with a dataset of unprecedented scale. We employ a novel multidimensional analysis, which allows the strength of a quantitative approach without sacrificing the detail of individual speakers and specific language variables, to identify social, cultural, and demographic factors that influence language shift in this community. We develop the concept of the ‘linguatype’, a sample of an individual’s language variants, analogous to the geneticists’ concept of ‘genotype’ as a sample of an individual’s genetic variants. We use multidimensional clustering to show that while family and household have significant effects on language patterns, peer group is the most significant factor for predicting language variation. Generalized linear models demonstrate that the strongest factor promoting individual use of the Indigenous language is living with members of the older generation who speak the heritage language fluently. Wright–Fisher analysis indicates that production of heritage language is lost at a significantly faster rate than perception, but there is no significant difference in rate of loss of verbs vs nouns, or lexicon vs grammar. Notably, we show that formal education has a negative relationship with Indigenous language retention in this community, with decreased use of the Indigenous language significantly associated with more years of monolingual schooling in English. These results suggest practical strategies for strengthening Indigenous language retention and demonstrate a new analytical approach to identifying risk factors for language loss in Indigenous communities that may be applicable to many languages globally.