skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Structure and Complexity of Bag Consistency
Since the early days of relational databases, it was realized that acyclic hypergraphs give rise to database schemas with desirable structural and algorithmic properties. In a bynow classical paper, Beeri, Fagin, Maier, and Yannakakis established several different equivalent characterizations of acyclicity; in particular, they showed that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency property for relations over that schema holds, which means that every collection of pairwise consistent relations over the schema is globally consistent. Even though real-life databases consist of bags (multisets), there has not been a study of the interplay between local consistency and global consistency for bags. We embark on such a study here and we first show that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency property for bags over that schema holds. After this, we explore algorithmic aspects of global consistency for bags by analyzing the computational complexity of the global consistency problem for bags: given a collection of bags, are these bags globally consistent? We show that this problem is in NP, even when the schema is part of the input. We then establish the following dichotomy theorem for fixed schemas: if the schema is acyclic, then the global consistency problem for bags is solvable in polynomial time, while if the schema is cyclic, then the global consistency problem for bags is NP-complete. The latter result contrasts sharply with the state of affairs for relations, where, for each fixed schema, the global consistency problem for relations is solvable in polynomial time.  more » « less
Award ID(s):
1814152
PAR ID:
10358313
Author(s) / Creator(s):
;
Date Published:
Journal Name:
ACM SIGMOD Record
Volume:
51
Issue:
1
ISSN:
0163-5808
Page Range / eLocation ID:
78 to 85
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The interplay between local consistency and global consistency has been the object of study in several different areas, including probability theory, relational databases, and quantum information. For relational databases, Beeri, Fagin, Maier, and Yannakakis showed that a database schema is acyclic if and only if it has the local-to-global consistency property for relations, which means that every collection of pairwise consistent relations over the schema is globally consistent. More recently, the same result has been shown under bag semantics. In this paper, we carry out a systematic study of local vs. global consistency for relations over positive commutative monoids, which is a common generalization of ordinary relations and bags. Let K be an arbitrary positive commutative monoid. We begin by showing that acyclicity of the schema is a necessary condition for the local-to-global consistency property for K-relations to hold. Unlike the case of ordinary relations and bags, however, we show that acyclicity is not always sufficient. After this, we characterize the positive commutative monoids for which acyclicity is both necessary and sufficient for the local-to-global consistency property to hold; this characterization involves a combinatorial property of monoids, which we call the transportation property. We then identify several different classes of monoids that possess the transportation property. As our final contribution, we introduce a modified notion of local consistency of K-relations, which we call pairwise consistency up to the free cover. We prove that, for all positive commutative monoids K, even those without the transportation property, acyclicity is both necessary and sufficient for every family of K-relations that is pairwise consistent up to the free cover to be globally consistent. 
    more » « less
  2. For graphs G and H, we say that G is H-free if it does not contain H as an induced subgraph. Already in the early 1980s Alekseev observed that if H is connected, then the Max Weight Independent Set problem (MWIS) remains NP-hard in H-free graphs, unless H is a path or a subdivided claw, i.e., a graph obtained from the three-leaf star by subdividing each edge some number of times (possibly zero). Since then determining the complexity of MWIS in these remaining cases is one of the most important problems in algorithmic graph theory. A general belief is that the problem is polynomial-time solvable, which is witnessed by algorithmic results for graphs excluding some small paths or subdivided claws. A more conclusive evidence was given by the recent breakthrough result by Gartland and Lokshtanov [FOCS 2020]: They proved that MWIS can be solved in quasipolynomial time in H-free graphs, where H is any fixed path. If H is an arbitrary subdivided claw, we know much less: The problem admits a QPTAS and a subexponential-time algorithm [Chudnovsky et al., SODA 2019]. In this paper we make an important step towards solving the problem by showing that for any subdivided claw H, MWIS is polynomial-time solvable in H-free graphs of bounded degree. 
    more » « less
  3. Graph Isomorphism (GI) is one of a small number of natural algorithmic problems with unsettled complexity status in the P / NP theory: not expected to be NP-complete, yet not known to be solvable in polynomial time. Arguably, the GI problem boils down to filling the gap between symmetry and regularity, the former being defined in terms of automorphisms, the latter in terms of equations satisfied by numerical parameters. Recent progress on the complexity of GI relies on a combination of the asymptotic theory of permutation groups and asymptotic properties of highly regular combinatorial structures called coherent configurations. Group theory provides the tools to infer either global symmetry or global irregularity from local information, eliminating the symmetry/regularity gap in the relevant scenario; the resulting global structure is the subject of combinatorial analysis. These structural studies are melded in a divide-and-conquer algorithmic framework pioneered in the GI context by Eugene M. Luks (1980). 
    more » « less
  4. In this work we advance the understanding of the fundamental limits of computation for Binary Polynomial Optimization (BPO), which is the problem of maximizing a given polynomial function over all binary points. In our main result we provide a novel class of BPO that can be solved efficiently both from a theoretical and computational perspective. In fact, we give a strongly polynomial-time algorithm for instances whose corresponding hypergraph is β-acyclic. We note that the β-acyclicity assumption is natural in several applications including relational database schemes and the lifted multicut problem on trees. Due to the novelty of our proving technique, we obtain an algorithm which is interesting also from a practical viewpoint. This is because our algorithm is very simple to implement and the running time is a polynomial of very low degree in the number of nodes and edges of the hypergraph. Our result completely settles the computational complexity of BPO over acyclic hypergraphs, since the problem is NP-hard on α-acyclic instances. Our algorithm can also be applied to any general BPO problem that contains β-cycles. For these problems, the algorithm returns a smaller instance together with a rule to extend any optimal solution of the smaller instance to an optimal solution of the original instance. 
    more » « less
  5. Event schemas are a form of world knowledge about the typical progression of events. Recent methods for event schema induction use information extraction systems to construct a large number of event graph instances from documents, and then learn to generalize the schema from such instances. In contrast, we propose to treat event schemas as a form of commonsense knowledge that can be derived from large language models (LLMs). This new paradigm greatly simplifies the schema induction process and allows us to handle both hierarchical relations and temporal relations between events in a straightforward way. Since event schemas have complex graph structures, we design an incremental prompting and verification method INCPROMPT to break down the construction of a complex event graph into three stages: event skeleton construction, event expansion, and event-event relation verification. Compared to directly using LLMs to generate a linearized graph, INCPROMPT can generate large and complex schemas with 7.2% F1 improvement in temporal relations and 31.0% F1 improvement in hierarchical relations. In addition, compared to the previous state-of-the-art closed-domain schema induction model, human assessors were able to cover ∼10% more events when translating the schemas into coherent stories and rated our schemas 1.3 points higher (on a 5-point scale) in terms of readability. 
    more » « less