Title: Representing Joint Hierarchies with Box Embeddings
Learning representations for hierarchical and multi-relational knowledge has emerged as an active area of research. Box Embeddings [Vilnis et al., 2018, Li et al., 2019] represent concepts with hyperrectangles in n-dimensional space and are shown to be capable of modeling tree-like structures efficiently by training on a large subset of the transitive closure of the WordNet hypernym graph. In this work, we evaluate the capability of box embeddings to learn the transitive closure of a tree-like hierarchical relation graph with far fewer edges from the transitive closure. Box embeddings are not restricted to tree-like structures, however, and we demonstrate this by modeling the WordNet meronym graph, where nodes may have multiple parents. We further propose a method for modeling multiple relations jointly in a single embedding space using box embeddings. In all cases, our proposed method outperforms or is on par with all other embedding methods.
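As a rough illustration of the idea in the abstract, the sketch below represents concepts as axis-aligned boxes and scores "child is a kind of parent" by the fraction of the child's volume that lies inside the parent. This volume-ratio score follows the general formulation used in the cited box-embedding work, but the code is a hard-edged simplification under stated assumptions: the `Box` class, the function names, and the hand-placed 2-D coordinates are all hypothetical, and nothing here is trained or taken from the paper's implementation.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Box:
    """Axis-aligned hyperrectangle given by its per-dimension min and max corners."""
    lo: tuple
    hi: tuple

    def volume(self) -> float:
        # Product of side lengths; a negative side means an empty box, so clamp to 0.
        return prod(max(h - l, 0.0) for l, h in zip(self.lo, self.hi))


def intersect(a: Box, b: Box) -> Box:
    """Intersection of two boxes (may be empty, in which case its volume is 0)."""
    return Box(
        tuple(max(x, y) for x, y in zip(a.lo, b.lo)),
        tuple(min(x, y) for x, y in zip(a.hi, b.hi)),
    )


def containment_score(child: Box, parent: Box) -> float:
    """Vol(child ∩ parent) / Vol(child): 1.0 when child lies entirely inside parent."""
    v = child.volume()
    return intersect(child, parent).volume() / v if v > 0 else 0.0


# Hypothetical hand-placed boxes (illustration only, not learned embeddings).
animal = Box((0.0, 0.0), (1.0, 1.0))
mammal = Box((0.1, 0.1), (0.6, 0.6))
dog = Box((0.2, 0.2), (0.4, 0.4))
bird = Box((0.7, 0.7), (0.9, 0.9))

# Only dog -> mammal and mammal -> animal were "placed" as direct edges,
# yet dog -> animal also scores 1.0 because box containment is transitive.
print(containment_score(dog, animal))
print(containment_score(dog, bird))
```

Because a box inside a box inside a third box is necessarily inside the third, nested boxes encode transitive edges without listing them explicitly, which is one intuition for why a model of this kind might need fewer edges from the transitive closure during training.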
Authors:
Award ID(s): 1763618
Publication Date:
NSF-PAR ID: 10188900
Journal Name: Automated Knowledge Base Construction
Sponsoring Org: National Science Foundation
More Like this
  1. Low-dimensional node embeddings play a key role in analyzing graph datasets. However, little work studies exactly what information is encoded by popular embedding methods, and how this information correlates with performance in downstream machine learning tasks. We tackle this question by studying whether embeddings can be inverted to (approximately) recover the graph used to generate them. Focusing on a variant of the popular DeepWalk method (Perozzi et al., 2014; Qiu et al., 2018), we present algorithms for accurate embedding inversion, i.e., from the low-dimensional embedding of a graph G, we can find a graph H with a very similar embedding. We perform numerous experiments on real-world networks, observing that significant information about G, such as specific edges and bulk properties like triangle density, is often lost in H. However, community structure is often preserved or even enhanced. Our findings are a step towards a more rigorous understanding of exactly what information embeddings encode about the input graph, and why this information is useful for learning tasks.
  2. Hierarchical relations are prevalent and indispensable for organizing human knowledge captured by a knowledge graph (KG). The key property of hierarchical relations is that they induce a partial ordering over the entities, which needs to be modeled in order to allow for hierarchical reasoning. However, current KG embeddings can model only a single global hierarchy (single global partial ordering) and fail to model multiple heterogeneous hierarchies that exist in a single KG. Here we present ConE (Cone Embedding), a KG embedding model that is able to simultaneously model multiple hierarchical as well as non-hierarchical relations in a knowledge graph. ConE embeds entities into hyperbolic cones and models relations as transformations between the cones. In particular, ConE uses cone containment constraints in different subspaces of the hyperbolic embedding space to capture multiple heterogeneous hierarchies. Experiments on standard knowledge graph benchmarks show that ConE obtains state-of-the-art performance on hierarchical reasoning tasks as well as on the knowledge graph completion task on hierarchical graphs. In particular, our approach yields new state-of-the-art Hits@1 of 45.3% on WN18RR and 16.1% on DDB14 (0.231 MRR). As for the hierarchical reasoning task, our approach outperforms previous best results by an average of 20% across the three datasets.
  3. Word embeddings are commonly used to measure word-level semantic similarity in text, especially in direct word-to-word comparisons. However, the relationships between words in the embedding space are often viewed as approximately linear and concepts comprised of multiple words are a sort of linear combination. In this paper, we demonstrate that this is not generally true and show how the relationships can be better captured by leveraging the topology of the embedding space. We propose a technique for directly computing new vectors representing multiple words in a way that naturally combines them into a new, more consistent space where distance better correlates to similarity. We show that this technique works well for natural language, even when it comprises multiple words, on a simple task derived from WordNet synset descriptions and examples of words. Thus, the generated vectors better represent complex concepts in the word embedding space.
  4. This paper first proposes a method of formulating model interpretability in visual understanding tasks based on the idea of unfolding latent structures. It then presents a case study in object detection using popular two-stage region-based convolutional neural network (i.e., R-CNN) detection systems. The proposed method focuses on weakly-supervised extractive rationale generation, that is, learning to unfold latent discriminative part configurations of object instances automatically and simultaneously in detection without using any supervision for part configurations. It utilizes a top-down hierarchical and compositional grammar model embedded in a directed acyclic AND-OR Graph (AOG) to explore and unfold the space of latent part configurations of regions of interest (RoIs). It presents an AOGParsing operator that seamlessly integrates with the RoIPooling/RoIAlign operator widely used in R-CNN and is trained end-to-end. In object detection, a bounding box is interpreted by the best parse tree derived from the AOG on-the-fly, which is treated as the qualitatively extractive rationale generated for interpreting detection. In experiments, Faster R-CNN is used to test the proposed method on the PASCAL VOC 2007 and the COCO 2017 object detection datasets. The experimental results show that the proposed method can compute promising latent structures without hurting the performance.
  5. The transitive closure of a graph is a new graph where every vertex is directly connected to all vertices to which it had a path in the original graph. Transitive closures are useful for reachability and relationship querying. Finding the transitive closure can be computationally expensive and requires a large memory footprint as the output is typically larger than the input. Some of the original research on transitive closures assumed that graphs were dense and used dense adjacency matrices. We have since learned that many real-world networks are extremely sparse, and the existing methods do not scale. In this work, we introduce a new algorithm called Anti-section Transitive Closure (ATC) for finding the transitive closure of a graph. We present a new parallel edges operation – anti-sections – for finding new edges to reachable vertices. ATC scales to massively multithreaded systems such as NVIDIA’s GPU with tens of thousands of threads. We show that the anti-section operation shares some traits with the triangle counting intersection operation in graph analysis. Lastly, we view the transitive closure problem as a dynamic graph problem requiring edge insertions. By doing this, our memory footprint is smaller. We also show a method for creating the batches in parallel using two different techniques: dual-round and hash. Using these techniques and the Hornet dynamic graph data structure, we show our new algorithm on an NVIDIA Titan V GPU. We compare with other packages such as NetworkX, SEI-GBTL, SuiteSparse, and cuSparse.
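The definition of transitive closure given in item 5 can be made concrete with a small sketch. The code below is a plain breadth-first search from every vertex on a sparse adjacency map; it is not the ATC algorithm described in that abstract, and the function name and the toy edge list are hypothetical choices for illustration.

```python
from collections import deque


def transitive_closure(edges):
    """Return the set of (u, v) pairs such that v is reachable from u by a path."""
    # Build a sparse adjacency map; every endpoint gets an entry.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())

    closure = set()
    for start in adj:
        seen = set()
        queue = deque(adj[start])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            # Every vertex reached from `start` becomes a direct edge in the closure.
            closure.add((start, node))
            queue.extend(adj[node])
    return closure


# Toy hypernym-style chain (hypothetical data): dog -> mammal -> animal.
example_edges = [("dog", "mammal"), ("mammal", "animal")]
example_closure = transitive_closure(example_edges)
print(sorted(example_closure))
```

On the two-edge chain above, the closure contains three pairs: the two original edges plus the implied pair `("dog", "animal")`. The output growing faster than the input is the memory concern the abstract raises.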