skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Moving Definition Variables in Quantified Boolean Formulas
Augmenting problem variables in a quantified Boolean formula with definition variables enables a compact representation in clausal form. Generally these definition variables are placed in the innermost quantifier level. To restore some structural information, we introduce a preprocessing technique that moves definition variables to the quantifier level closest to the variables that define them. We express the movement in the QRAT proof system to allow verification by independent proof checkers. We evaluated definition variable movement on the QBFEVAL’20 competition benchmarks. Movement significantly improved performance for the competition’s top solvers. Combining variable movement with the preprocessor Bloqqer improves solver performance compared to using Bloqqer alone.  more » « less
Award ID(s):
2108521
PAR ID:
10319949
Author(s) / Creator(s):
; ;
Editor(s):
Fisman, D.; Rosu, G.
Date Published:
Journal Name:
Book cover International Conference on Tools and Algorithms for the Construction and Analysis of Systems
Volume:
13243
ISSN:
1611-3349
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Quantifier Raising leaves no overt marking to indicate movement has occurred, making the task of identifying when raising has occurred extremely difficult for the parser. Beyond this challenge, evidence from interpretation and judgement studies suggests that raising causes difficulty in processing. These two aspects taken together have led some to suggest that human sentence processor employs a strategy in which the construction of raised structures is avoided, commonly called processing scope economy. This contrasts to the traditional notion of grammatical scope economy, where Quantifier Raising is restricted in the grammar. In this paper we discuss the properties of these two theories. We conclude that the two approaches make different predictions about when raising should occur online, with processing scope economy predicting that the parser avoids raising whenever possible and grammatical scope economy predicting that the parser raises regularly and sometimes produces ungrammatical structures in the process. We then present an experiment which examines complex scope structures in verb phrase ellipsis to observe when penalties related to Quantifier Raising are observed online. We find that penalties appear in configurations where Quantifier Raising would be ungrammatical under grammatical scope economy, suggesting the parser attempts Quantifier Raising in these configurations. This evidence indicates that the parser’s behavior matches the predictions of grammatical scope economy rather than those of processing scope economy. 
    more » « less
  2. Abstract International large-scale assessments (ILSAs) play an important role in educational research and policy making. They collect valuable data on education quality and performance development across many education systems, giving countries the opportunity to share techniques, organisational structures, and policies that have proven efficient and successful. To gain insights from ILSA data, we identify non-cognitive variables associated with students’ academic performance. This problem has three analytical challenges: (a) academic performance is measured by cognitive items under a matrix sampling design; (b) there are many missing values in the non-cognitive variables; and (c) multiple comparisons due to a large number of non-cognitive variables. We consider an application to the Programme for International Student Assessment, aiming to identify non-cognitive variables associated with students’ performance in science. We formulate it as a variable selection problem under a general latent variable model framework and further propose a knockoff method that conducts variable selection with a controlled error rate for false selections. 
    more » « less
  3. Pollard, Tom J. (Ed.)
    Modern predictive models require large amounts of data for training and evaluation, absence of which may result in models that are specific to certain locations, populations in them and clinical practices. Yet, best practices for clinical risk prediction models have not yet considered such challenges to generalizability. Here we ask whether population- and group-level performance of mortality prediction models vary significantly when applied to hospitals or geographies different from the ones in which they are developed. Further, what characteristics of the datasets explain the performance variation? In this multi-center cross-sectional study, we analyzed electronic health records from 179 hospitals across the US with 70,126 hospitalizations from 2014 to 2015. Generalization gap, defined as difference between model performance metrics across hospitals, is computed for area under the receiver operating characteristic curve (AUC) and calibration slope. To assess model performance by the race variable, we report differences in false negative rates across groups. Data were also analyzed using a causal discovery algorithm “Fast Causal Inference” that infers paths of causal influence while identifying potential influences associated with unmeasured variables. When transferring models across hospitals, AUC at the test hospital ranged from 0.777 to 0.832 (1st-3rd quartile or IQR; median 0.801); calibration slope from 0.725 to 0.983 (IQR; median 0.853); and disparity in false negative rates from 0.046 to 0.168 (IQR; median 0.092). Distribution of all variable types (demography, vitals, and labs) differed significantly across hospitals and regions. The race variable also mediated differences in the relationship between clinical variables and mortality, by hospital/region. In conclusion, group-level performance should be assessed during generalizability checks to identify potential harms to the groups. Moreover, for developing methods to improve model performance in new environments, a better understanding and documentation of provenance of data and health processes are needed to identify and mitigate sources of variation. 
    more » « less
  4. Abstract Ponds are often identified by their small size and shallow depths, but the lack of a universal evidence-based definition hampers science and weakens legal protection. Here, we compile existing pond definitions, compare ecosystem metrics (e.g., metabolism, nutrient concentrations, and gas fluxes) among ponds, wetlands, and lakes, and propose an evidence-based pond definition. Compiled definitions often mentioned surface area and depth, but were largely qualitative and variable. Government legislation rarely defined ponds, despite commonly using the term. Ponds, as defined in published studies, varied in origin and hydroperiod and were often distinct from lakes and wetlands in water chemistry. We also compared how ecosystem metrics related to three variables often seen in waterbody definitions: waterbody size, maximum depth, and emergent vegetation cover. Most ecosystem metrics (e.g., water chemistry, gas fluxes, and metabolism) exhibited nonlinear relationships with these variables, with average threshold changes at 3.7 ± 1.8 ha (median: 1.5 ha) in surface area, 5.8 ± 2.5 m (median: 5.2 m) in depth, and 13.4 ± 6.3% (median: 8.2%) emergent vegetation cover. We use this evidence and prior definitions to define ponds as waterbodies that are small (< 5 ha), shallow (< 5 m), with < 30% emergent vegetation and we highlight areas for further study near these boundaries. This definition will inform the science, policy, and management of globally abundant and ecologically significant pond ecosystems. 
    more » « less
  5. Mycielski graphs are a family of triangle-free graphs 𝑀_𝑘 with arbitrarily high chromatic number. 𝑀_𝑘 has chromatic number k and there is a short informal proof of this fact, yet finding proofs of it via automated reasoning techniques has proved to be a challenging task. In this paper, we study the complexity of clausal proofs of the uncolorability of 𝑀_𝑘 with 𝑘−1 colors. In particular, we consider variants of the PR (propagation redundancy) proof system that are without new variables, and with or without deletion. These proof systems are of interest due to their potential uses for proof search. As our main result, we present a sublinear-length and constant-width PR proof without new variables or deletion. We also implement a proof generator and verify the correctness of our proof. Furthermore, we consider formulas extended with clauses from the proof until a short resolution proof exists, and investigate the performance of CDCL in finding the short proof. This turns out to be difficult for CDCL with the standard heuristics. Finally, we describe an approach inspired by SAT sweeping to find proofs of these extended formulas. 
    more » « less