Billion-scale Detection of Isomorphic Nodes

Cappelletti, Luca; Fontana, Tommaso; Reese, Justin; Bader, David A.

doi:10.1109/IPDPSW59300.2023.00046

This paper presents an algorithm for detecting attributed high-degree node isomorphism. High-degree isomorphic nodes seldom arise by chance and often represent duplicated entities or data processing errors. By definition, isomorphic nodes are topologically indistinguishable and can be problematic in graph ML tasks. The algorithm employs a parallel, “degree-bounded” approach that fingerprints each node’s local properties through a hash and constrains the search to nodes within hash-defined buckets, thus minimising the number of comparisons. This method scales to graphs with billions of nodes and edges. Finally, we provide isomorphic node oddities identified in real-world data.
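The bucketing idea in the abstract can be illustrated with a minimal, sequential sketch. It is not the authors' implementation: the function names (`fingerprint`, `find_isomorphic_groups`), the `min_degree` parameter, and the working definition of isomorphism used here (same node type and identical neighbour set) are assumptions made for illustration, and the parallelism described in the paper is omitted.

```python
from collections import defaultdict


def fingerprint(node_type, neighbours):
    """Cheap, order-independent fingerprint of a node's local properties."""
    acc = 0
    for neighbour in neighbours:
        acc ^= hash(neighbour)  # XOR makes the result independent of iteration order
    return hash((node_type, len(neighbours), acc))


def find_isomorphic_groups(adjacency, node_types, min_degree=2):
    """Group nodes that share a type and an identical neighbour set.

    adjacency:  dict mapping node -> set of neighbouring nodes (hypothetical input format)
    node_types: dict mapping node -> attribute used as the node type
    min_degree: degree bound; nodes with fewer neighbours are skipped
    """
    # Step 1: place each candidate node in a bucket keyed by its fingerprint.
    buckets = defaultdict(list)
    for node, neighbours in adjacency.items():
        if len(neighbours) < min_degree:
            continue
        buckets[fingerprint(node_types.get(node), neighbours)].append(node)

    # Step 2: run exact comparisons only inside buckets with several nodes,
    # since equal fingerprints may still be hash collisions.
    groups = []
    for candidates in buckets.values():
        if len(candidates) < 2:
            continue
        exact = defaultdict(list)
        for node in candidates:
            key = (node_types.get(node), frozenset(adjacency[node]))
            exact[key].append(node)
        groups.extend(sorted(group) for group in exact.values() if len(group) > 1)
    return sorted(groups)
```

Because nodes with different fingerprints are never compared, the exact neighbour-set check runs only within each bucket rather than across all node pairs.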

Award ID(s): 2109988

PAR ID: 10477184

Author(s) / Creator(s): Cappelletti, Luca; Fontana, Tommaso; Reese, Justin; Bader, David A.

Publisher / Repository: IEEE

Date Published: 2023-05-01

ISBN: 979-8-3503-1199-0

Page Range / eLocation ID: 230 to 233

Format(s): Medium: X

Location: St. Petersburg, FL, USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/IPDPSW59300.2023.00046
