A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings

Jiang, Song; Yao, Qiyue; Wang, Qifan; Sun, Yizhou

doi:10.1145/3543507.3583310

This content will become publicly available on April 30, 2024

Taxonomies, which organize knowledge hierarchically, support various practical web applications such as product navigation in online shopping and user profile tagging on social platforms. Given the continued and rapid emergence of new entities, maintaining a comprehensive taxonomy in a timely manner through human annotation is prohibitively expensive. Therefore, expanding a taxonomy automatically with new entities is essential. Most existing methods for expanding taxonomies encode entities into vector embeddings (i.e., single points). However, we argue that vectors are insufficient to model the "is-a" hierarchy in taxonomy (asymmetrical relation), because two points can only represent pairwise similarity (symmetrical relation). To this end, we propose to project taxonomy entities into boxes (i.e., hyperrectangles). Two boxes can be "contained", "disjoint", or "intersecting", thus naturally representing an asymmetrical taxonomic hierarchy. Upon box embeddings, we propose a novel model BoxTaxo for taxonomy expansion. The core of BoxTaxo is to learn boxes for entities to capture their child-parent hierarchies. To achieve this, BoxTaxo optimizes the box embeddings from a joint view of geometry and probability. BoxTaxo also offers an easy and natural way for inference: examine whether the box of a given new entity is fully enclosed inside the box of a candidate parent from the existing taxonomy. Extensive experiments on two benchmarks demonstrate the effectiveness of BoxTaxo compared to vector-based models.
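
The containment test described in the abstract can be illustrated with a minimal sketch. This is not the BoxTaxo implementation: the `Box` class, the helper names, the toy 2-D coordinates, and the choice to rank enclosing boxes by volume are all assumptions made for illustration, and the learned boxes and scoring used in the paper are not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned hyperrectangle given by its lower and upper corners (hypothetical helper)."""
    lower: Sequence[float]
    upper: Sequence[float]


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies fully inside `outer` in every dimension."""
    return all(
        o_lo <= i_lo and i_hi <= o_hi
        for o_lo, i_lo, i_hi, o_hi in zip(
            outer.lower, inner.lower, inner.upper, outer.upper
        )
    )


def volume(box: Box) -> float:
    """Product of side lengths; a degenerate side contributes zero."""
    result = 1.0
    for lo, hi in zip(box.lower, box.upper):
        result *= max(hi - lo, 0.0)
    return result


def candidate_parents(query: Box, taxonomy: Dict[str, Box]) -> List[str]:
    """Names of existing entities whose box encloses `query`.

    Sorting by volume (tightest enclosing box first) is an assumption of
    this sketch, not a ranking rule taken from the paper.
    """
    hits = [name for name, box in taxonomy.items() if contains(box, query)]
    return sorted(hits, key=lambda name: volume(taxonomy[name]))


# Toy 2-D boxes invented for illustration; real boxes would be learned.
taxonomy = {
    "animal": Box((0.0, 0.0), (10.0, 10.0)),
    "dog": Box((1.0, 1.0), (4.0, 4.0)),
    "cat": Box((5.0, 5.0), (8.0, 8.0)),
}
query = Box((2.0, 2.0), (3.0, 3.0))  # hypothetical new entity

print(candidate_parents(query, taxonomy))  # ['dog', 'animal']
```

Containment is asymmetric, which is the property the abstract contrasts with point similarity: in this toy data `contains(taxonomy["animal"], taxonomy["dog"])` is true while the reverse is false.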

Award ID(s): 2211557, 1937599

NSF-PAR ID: 10464388

Author(s) / Creator(s): Jiang, Song; Yao, Qiyue; Wang, Qifan; Sun, Yizhou

Date Published: 2023-04-30

Journal Name: Proceedings of the ACM Web Conference 2023 (WWW '23)

Page Range / eLocation ID: 2467 to 2476

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1145/3543507.3583310
