Title: Ordered Tree Decomposition for HRG Rule Extraction
We present algorithms for extracting Hyperedge Replacement Grammar (HRG) rules from a graph along with a vertex order. Our algorithms are based on finding a tree decomposition of smallest width, relative to the vertex order, and then extracting one rule for each node in this structure. The assumption of a fixed order for the vertices of the input graph makes it possible to solve the problem in polynomial time, in contrast to the fact that the problem of finding optimal tree decompositions for a graph is NP-hard. We also present polynomial-time algorithms for parsing based on our HRGs, where the input is a vertex sequence and the output is a graph structure. The intended application of our algorithms is grammar extraction and parsing for semantic representation of natural language. We apply our algorithms to data annotated with Abstract Meaning Representations and report on the characteristics of the resulting grammars.
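To make concrete why fixing a vertex order helps, here is a minimal Python sketch of the textbook elimination-ordering construction of a tree decomposition. It is not the algorithm from the article, whose notion of a decomposition "relative to the vertex order" is defined in the paper itself. The function name `decomposition_from_order` and its return values are hypothetical, and the sketch assumes a simple undirected graph whose vertices all appear in `order`.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Vertex = Hashable


def decomposition_from_order(
    edges: List[Tuple[Vertex, Vertex]], order: List[Vertex]
) -> Tuple[Dict[Vertex, Set[Vertex]], Dict[Vertex, Optional[Vertex]], int]:
    """Standard elimination-order construction (illustrative, not the paper's method).

    Returns one bag per vertex, a parent pointer per bag, and the width.
    Assumes a simple undirected graph with every vertex listed in `order`.
    """
    adj: Dict[Vertex, Set[Vertex]] = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    position = {v: i for i, v in enumerate(order)}

    bags: Dict[Vertex, Set[Vertex]] = {}
    parent: Dict[Vertex, Optional[Vertex]] = {}
    for v in order:
        remaining = adj[v]  # neighbours that have not been eliminated yet
        bags[v] = {v} | remaining
        # Eliminating v removes it and makes its remaining neighbours a clique.
        for a in remaining:
            adj[a].discard(v)
            adj[a] |= remaining - {a}
        # The bag of v hangs below the bag of its earliest remaining neighbour.
        parent[v] = min(remaining, key=position.__getitem__) if remaining else None
        del adj[v]

    width = max((len(bag) for bag in bags.values()), default=0) - 1
    return bags, parent, width


if __name__ == "__main__":
    path = [("a", "b"), ("b", "c")]
    # The same graph gives different widths under different vertex orders.
    print(decomposition_from_order(path, ["a", "b", "c"])[2])
    print(decomposition_from_order(path, ["b", "a", "c"])[2])
```

With the order given, the construction is a single polynomial-time pass. The hard part of the unrestricted problem is searching over all orders for the one that minimizes the width.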
Award ID(s):
1813823
NSF-PAR ID:
10228304
Journal Name:
Computational Linguistics
Volume:
45
Issue:
2
ISSN:
0891-2017
Page Range / eLocation ID:
339 to 379
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Vehicle routing problems are a broad class of combinatorial optimization problems that can be formulated as the problem of finding a tour in a weighted graph that optimizes some function of the visited vertices. For instance, a canonical and extensively studied vehicle routing problem is the orienteering problem where the goal is to find a tour that maximizes the number of vertices visited by a given deadline. In this paper, we consider the computational tractability of a well-known generalization of the orienteering problem called the Orient-MTW problem. The input to Orient-MTW consists of a weighted graph G(V, E) where for each vertex v ∊ V we are given a set of time instants Tv ⊆ [T], and a source vertex s. A tour starting at s is said to visit a vertex v if it transits through v at any time in the set Tv. The goal is to find a tour starting at the source vertex that maximizes the number of vertices visited. It is known that this problem admits a quasi-polynomial time O(log OPT)-approximation ratio where OPT is the optimal solution value but until now no hardness better than an APX-hardness was known for this problem. Our main result is an -hardness for this problem that holds even when the underlying graph G is an undirected tree. This is the first super-constant hardness result for the Orient-MTW problem. The starting point for our result is the hardness of the SetCover problem which is known to hold on instances with a special structure. We exploit this special structure of the hard SetCover instances to first obtain a new proof of the APX-hardness result for Orient-MTW that holds even on trees of depth 2. We then recursively amplify this constant factor hardness to an -hardness, while keeping the resulting topology to be a tree. 
Our amplified hardness proof crucially utilizes a delicate concavity property which shows that in our encoding of SetCover instances as instances of the Orient-MTW problem, whenever the optimal cost for the SetCover instance is large, any tour, no matter how it allocates its time across different sub-trees, cannot visit too many vertices overall. We believe that this reduction template may also prove useful in showing hardness of other vehicle routing problems. 
  2. Discovering the underlying structures present in large real world graphs is a fundamental scientific problem. Recent work at the intersection of formal language theory and graph theory has found that a Probabilistic Hyperedge Replacement Grammar (PHRG) can be extracted from a tree decomposition of any graph. However, because the extracted PHRG is directly dependent on the shape and contents of the tree decomposition, rather than on the dynamics of the graph, it is unlikely that informative graph-processes are actually being captured with the PHRG extraction algorithm. To address this problem, the current work adapts a related formalism called Probabilistic Synchronous HRG (PSHRG) that learns synchronous graph production rules from temporal graphs. We introduce the PSHRG model and describe a method to extract growth rules from the graph. We find that SHRG rules capture growth patterns found in temporal graphs and can be used to predict the future evolution of a temporal graph. We perform a brief evaluation on small synthetic networks that demonstrates the prediction accuracy of PSHRG versus baseline and state of the art models. Ultimately, we find that PSHRGs seem to be very good at modelling dynamics of a temporal graph; however, our prediction algorithm, which is based on string parsing and generation algorithms, does not scale to practically useful graph sizes. 
  3.
    In supersingular isogeny-based cryptography, the path-finding problem reduces to the endomorphism ring problem. Can path-finding be reduced to knowing just one endomorphism? It is known that a small degree endomorphism enables polynomial-time path-finding and endomorphism ring computation (in: Love and Boneh, ANTS XIV-Proceedings of the Fourteenth Algorithmic Number Theory Symposium, volume 4 of Open Book Ser. Math. Sci. Publ., Berkeley, 2020). An endomorphism gives an explicit orientation of a supersingular elliptic curve. In this paper, we use the volcano structure of the oriented supersingular isogeny graph to take ascending/descending/horizontal steps on the graph and deduce path-finding algorithms to an initial curve. Each altitude of the volcano corresponds to a unique quadratic order, called the primitive order. We introduce a new hard problem of computing the primitive order given an arbitrary endomorphism on the curve, and we also provide a sub-exponential quantum algorithm for solving it. In concurrent work (in: Wesolowski, Advances in cryptology-EUROCRYPT 2022, volume 13277 of Lecture Notes in Computer Science. Springer, Cham, 2022), it was shown that the endomorphism ring problem in the presence of one endomorphism with known primitive order reduces to a vectorization problem, implying path-finding algorithms. Our path-finding algorithms are more general in the sense that we don’t assume the knowledge of the primitive order associated with the endomorphism.

     
  4. Nowadays, large-scale graph data is being generated in a variety of real-world applications, from social networks to co-authorship networks, from protein-protein interaction networks to road traffic networks. Many existing works on graph mining focus on the vertices and edges, with the first-order Markov chain as the underlying model. They fail to explore the high-order network structures, which are of key importance in many high impact domains. For example, in bank customer personally identifiable information (PII) networks, the star structures often correspond to a set of synthetic identities; in financial transaction networks, the loop structures may indicate the existence of money laundering. In this paper, we focus on mining user-specified high-order network structures and aim to find a structure-rich subgraph which does not break many such structures by separating the subgraph from the rest. A key challenge associated with finding a structure-rich subgraph is the prohibitive computational cost. To address this problem, inspired by the family of local graph clustering algorithms for efficiently identifying a low-conductance cut without exploring the entire graph, we propose to generalize the key idea to model high-order network structures. In particular, we start with a generic definition of high-order conductance, and define the high-order diffusion core, which is based on a high-order random walk induced by user-specified high-order network structure. Then we propose a novel High-Order Structure-Preserving LOcal Cut (HOSPLOC) algorithm, which runs in polylogarithmic time with respect to the number of edges in the graph. It starts with a seed vertex and iteratively explores its neighborhood until a subgraph with a small high-order conductance is found. Furthermore, we analyze its performance in terms of both effectiveness and efficiency. 
The experimental results on both synthetic graphs and real graphs demonstrate the effectiveness and efficiency of our proposed HOSPLOC algorithm. 
  5. One of the principal goals of graph modeling is to capture the building blocks of network data in order to study various physical and natural phenomena. Recent work at the intersection of formal language theory and graph theory has explored the use of graph grammars for graph modeling. However, existing graph grammar formalisms, like Hyperedge Replacement Grammars, can only operate on small tree-like graphs. The present work relaxes this restriction by revising a different graph grammar formalism called Vertex Replacement Grammars (VRGs). We show that a variant of the VRG called Clustering-based Node Replacement Grammar (CNRG) can be efficiently extracted from many hierarchical clusterings of a graph. We show that CNRGs encode a succinct model of the graph, yet faithfully preserve the structure of the original graph. In experiments on large real-world datasets, we show that graphs generated from the CNRG model exhibit a diverse range of properties that are similar to those found in the original networks. 