Title: Montague Grammar Induction
We propose a computational model for inducing full-fledged combinatory categorial grammars from behavioral data. This model contrasts with prior computational models of selection in representing syntactic and semantic types as structured (rather than atomic) objects, enabling direct interpretation of the modeling results relative to standard formal frameworks. We investigate the grammar our model induces when fit to a lexicon-scale acceptability judgment dataset, MegaAcceptability, focusing in particular on the types our model assigns to clausal complements and the predicates that select them.
Award ID(s):
1748969
PAR ID:
10264975
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Semantics and Linguistic Theory
Volume:
30
ISSN:
2163-5951
Page Range / eLocation ID:
227-251
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a computational modeling framework for inducing combinatory categorial grammars from arbitrary behavioral data. This framework provides the analyst fine-grained control over the assumptions that the induced grammar should conform to: (i) what the primitive types are; (ii) how complex types are constructed; (iii) what set of combinators can be used to combine types; and (iv) whether (and to what) the types of some lexical items should be fixed. In a proof-of-concept experiment, we deploy our framework for use in distributional analysis. We focus on the relationship between s(emantic)-selection and c(ategory)-selection, using as input a lexicon-scale acceptability judgment dataset focused on English verbs’ syntactic distribution (the MegaAcceptability dataset) and enforcing standard assumptions from the semantics literature on the induced grammar.
  2. Monte Carlo (MC) methods have been widely used in uncertainty analysis and parameter identification for hydrological models. The main challenge with these approaches is, however, the prohibitive number of model runs required to acquire an adequate sample size, which may take from days to months, especially when the simulations are run in distributed mode. In the past, emulators have been used to minimize the computational burden of the MC simulation through direct estimation of the residual-based response surfaces. Here, we apply emulators of an MC simulation in parameter identification for a distributed conceptual hydrological model using two likelihood measures, i.e. the absolute bias of model predictions (Score) and another based on the time-relaxed limits of acceptability concept (pLoA). Three machine-learning models (MLMs) were built using model parameter sets and response surfaces with a limited number of model realizations (4000). The developed MLMs were applied to predict pLoA and Score for a large set of model parameters (95 000). The behavioural parameter sets were identified using a time-relaxed limits of acceptability approach, based on the predicted pLoA values, and applied to estimate the quantile streamflow predictions weighted by their respective Score. The three MLMs were able to adequately mimic the response surfaces directly estimated from MC simulations with an R² value of 0.7 to 0.92. Similarly, the models identified using the coupled machine-learning (ML) emulators and limits of acceptability approach have performed very well in reproducing the median streamflow prediction during the calibration and validation periods, with an average Nash–Sutcliffe efficiency value of 0.89 and 0.83, respectively.
  3. The aim of this paper is to study the optimal investment problem by using coherent acceptability indices (CAIs) as a tool to measure the portfolio performance. We call this problem the acceptability maximization. First, we study the one-period (static) case, and propose a numerical algorithm that approximates the original problem by a sequence of risk minimization problems. The results are applied to several important CAIs, such as the gain-to-loss ratio, the risk-adjusted return on capital and the tail-value-at-risk based CAI. In the second part of the paper we investigate the acceptability maximization in a discrete time dynamic setup. Using robust representations of CAIs in terms of a family of dynamic coherent risk measures (DCRMs), we establish an intriguing dichotomy: if the corresponding family of DCRMs is recursive (i.e. strongly time consistent) and assuming some recursive structure of the market model, then the acceptability maximization problem reduces to just a one period problem and the maximal acceptability is constant across all states and times. On the other hand, if the family of DCRMs is not recursive, which is often the case, then the acceptability maximization problem ordinarily is a time-inconsistent stochastic control problem, similar to the classical mean-variance criteria. To overcome this form of time-inconsistency, we adapt to our setup the set-valued Bellman's principle recently proposed in [23] applied to two particular dynamic CAIs - the dynamic risk-adjusted return on capital and the dynamic gain-to-loss ratio. The obtained theoretical results are illustrated via numerical examples that include, in particular, the computation of the intermediate mean-risk efficient frontiers. 
  4. Mobile devices use language models to suggest words and phrases for use in text entry. Traditional language models are based on contextual word frequency in a static corpus of text. However, certain types of phrases, when offered to writers as suggestions, may be systematically chosen more often than their frequency would predict. In this paper, we propose the task of generating suggestions that writers accept, a related but distinct task to making accurate predictions. Although this task is fundamentally interactive, we propose a counterfactual setting that permits offline training and evaluation. We find that even a simple language model can capture text characteristics that improve acceptability. 
  5. This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence. We introduce the Corpus of Linguistic Acceptability (CoLA), a set of 10,657 English sentences drawn from published linguistics literature and labeled as grammatical or ungrammatical. As baselines, we train several recurrent neural network models on acceptability classification, and find that our models outperform unsupervised models by Lau et al. (2016) on CoLA. Error analysis on specific grammatical phenomena reveals that both Lau et al.’s models and ours learn systematic generalizations like subject-verb-object order. However, all models we test perform far below human level on a wide range of grammatical constructions.