Title: FASTEN: Fast Sylvester Equation Solver for Graph Mining
The Sylvester equation offers a powerful and unifying primitive for a variety of important graph mining tasks, including network alignment, graph kernels, node similarity, and subgraph matching. A major bottleneck of the Sylvester equation lies in its high computational complexity. Despite tremendous effort, state-of-the-art methods still require a complexity that is at least quadratic in the number of nodes of the graphs, even with approximations. In this paper, we propose a family of Krylov subspace based algorithms (FASTEN) to speed up and scale up the computation of the Sylvester equation for graph mining. The key idea of the proposed methods is to project the original equivalent linear system onto a Kronecker Krylov subspace. We further exploit (1) the implicit representation of the solution matrix as well as the associated computation, and (2) the decomposition of the original Sylvester equation into a set of inter-correlated Sylvester equations of smaller size. The proposed algorithms bear two distinctive features. First, they provide the exact solutions without any approximation error. Second, they significantly reduce the time and space complexity of solving the Sylvester equation, with two of the proposed algorithms having a linear complexity in both time and space. Experimental evaluations on a diverse set of real networks demonstrate that our methods (1) are up to 10,000x faster than the Conjugate Gradient method, the best known competitor that outputs the exact solution, and (2) scale up to million-node graphs.
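As a rough illustration of the "equivalent linear system" and "implicit computation" mentioned in the abstract, the following sketch assumes a Sylvester equation of the form X - A X B = C; the abstract does not state the exact form, so this choice, along with all names below, is hypothetical. It checks the standard identity vec(A X B) = (B^T ⊗ A) vec(X) on tiny matrices and then solves the equation by a plain fixed-point iteration that never forms the Kronecker product. The iteration is a simple stand-in and is not the Krylov subspace method the paper proposes.

```python
# Sketch only: tiny dense matrices, pure Python, column-stacking vec().
# Assumed equation form: X - A X B = C (not stated in the abstract).

def matmul(P, Q):
    """Dense matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def kron(P, Q):
    """Dense Kronecker product; its size is the product of the input sizes."""
    return [[p * q for p in prow for q in qrow] for prow in P for qrow in Q]

def vec(X):
    """Stack the columns of X into one flat list."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve_fixed_point(A, B, C, iters=60):
    """Iterate X <- C + A X B without forming any Kronecker product.

    Converges when the spectral radii of A and B multiply to less than 1.
    This is an illustrative stand-in, not a Krylov subspace method.
    """
    X = [row[:] for row in C]
    for _ in range(iters):
        AXB = matmul(matmul(A, X), B)
        X = [[c + t for c, t in zip(crow, trow)] for crow, trow in zip(C, AXB)]
    return X

A = [[0.1, 0.2], [0.0, 0.3]]
B = [[0.2, 0.1], [0.1, 0.4]]
C = [[1.0, 2.0], [3.0, 4.0]]

# Explicit route: build the n^2-by-n^2 Kronecker matrix, then multiply.
explicit = matvec(kron(transpose(B), A), vec(C))
# Implicit route: two small matrix products give the same vector.
implicit = vec(matmul(matmul(A, C), B))
identity_gap = max(abs(e - i) for e, i in zip(explicit, implicit))

# Solve X - A X B = C and measure the largest entry of the residual.
X = solve_fixed_point(A, B, C)
AXB = matmul(matmul(A, X), B)
residual = max(abs(X[i][j] - AXB[i][j] - C[i][j])
               for i in range(2) for j in range(2))
```

The explicit route stores a matrix whose size is quadratic in the number of entries of X, while the implicit route only ever touches matrices the size of A, B, and X. That gap is the reason an implicit representation matters for large graphs.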
Award ID(s):
1651203 1715385 1947135 2003924
Author(s) / Creator(s):
Date Published:
Journal Name:
KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Page Range / eLocation ID:
1339 to 1347
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Attributed subgraph matching is a powerful tool for explorative mining of large attributed networks. In many applications (e.g., network science of teams, intelligence analysis, finance informatics), users might not know exactly what they are looking for, and thus must constantly revise the initial query graph based on what they find in the current matching results. A major bottleneck in such an interactive matching scenario is efficiency, as simply rerunning the matching algorithm on the revised query graph is computationally prohibitive. In this paper, we propose a family of effective and efficient algorithms (FIRST) to support interactive attributed subgraph matching. There are two key ideas behind the proposed methods. The first is to recast the attributed subgraph matching problem as a cross-network node similarity problem, whose major computation lies in solving a Sylvester equation for the query graph and the underlying data graph. The second is to exploit the smoothness between the initial and revised queries, which allows us to solve the updated Sylvester equation incrementally, without re-solving it from scratch. Experimental results show that our method (1) achieves up to 16x speed-up when applied to networks with 6M+ nodes; (2) preserves more than 90% accuracy compared with existing methods; and (3) scales linearly with respect to the size of the data graph.
  2. How can we identify the same or similar users from a collection of social network platforms (e.g., Facebook, Twitter, LinkedIn)? Which restaurant shall we recommend to a given user at the right time and the right location? Given a disease, which genes and drugs are most relevant? Multi-way association, which identifies strongly correlated node sets from multiple input networks, is the key to answering these questions. Despite its importance, very few multi-way association methods exist due to its high complexity. In this paper, we formulate multi-way association as a convex optimization problem, whose optimal solution can be obtained by solving a Sylvester tensor equation. Furthermore, we propose two fast algorithms to solve the Sylvester tensor equation, with linear time and space complexity. We further provide a theoretical analysis of the sensitivity of the Sylvester tensor equation solution. Empirical evaluations demonstrate the efficacy of the proposed method.
  3. The past decades have witnessed the prosperity of graph mining, with a multitude of sophisticated models and algorithms designed for various mining tasks, such as ranking, classification, clustering, and anomaly detection. Generally speaking, the vast majority of existing works aim to answer the following question: given a graph, what is the best way to mine it? In this paper, we introduce the graph sanitation problem to answer an orthogonal question: given a mining task and an initial graph, what is the best way to improve the initially provided graph? Learning a better graph as part of the input of the mining model is expected to benefit graph mining in a variety of settings, ranging from denoising and imputation to defense. We formulate the graph sanitation problem as a bilevel optimization problem, and further instantiate it with semi-supervised node classification, together with an effective solver named GaSoliNe. Extensive experimental results demonstrate that the proposed method is (1) broadly applicable with respect to various graph neural network models and flexible graph modification strategies, and (2) effective in improving the node classification accuracy on both the original and contaminated graphs in various perturbation scenarios. In particular, it brings up to 25% performance improvement over the existing robust graph neural network methods.
  4. Multi-sourced networks naturally appear in many application domains, ranging from bioinformatics, social networks, and neuroscience to management. Although the state of the art offers rich models and algorithms to find various patterns when input networks are given, how vulnerable the mining results are to adversarial attacks remains largely unexplored. In this paper, we address the problem of attacking multi-network mining by deliberately perturbing the networks to alter the mining results. The key idea of the proposed method (ADMIRING) is to use effective influence functions on the Sylvester equation defined over the input networks, which plays a central and unifying role in various multi-network mining tasks. The proposed algorithms bear two main advantages: (1) effectiveness, being able to accurately quantify the rate of change of the mining results in response to attacks; and (2) generality, being applicable to a variety of multi-network mining tasks (e.g., graph kernel, network alignment, cross-network node similarity) with different attacking strategies (e.g., edge/node removal, attribute alteration).
  5. In this paper, we propose a new spatial temperature aware transient EM induced stress analysis method. The new method consists of two contributions. First, we propose, for the first time, a TM-aware void saturation volume estimation method for fast immortality checks in the post-voiding phase. We derive an analytic formula to estimate the void saturation in the presence of spatial temperature gradients due to Joule heating. Second, we develop a fast numerical solution for EM-induced stress analysis of multi-segment interconnect trees considering the TM effect. The new method first transforms the coupled EM-TM partial differential equations into linear time-invariant ordinary differential equations (ODEs). Then, an extended Krylov subspace-based reduction technique is employed to reduce the size of the original system matrices so that they can be efficiently simulated in the time domain. The proposed method can perform the simulation process for both the void nucleation and void growth phases under time-varying input currents and position-dependent temperatures. The numerical results show that, compared to the recently proposed semi-analytic EM-TM method, the proposed method achieves about 28x speedup on average for interconnects with up to 1000 branches, for both the void nucleation and growth phases, with negligible errors.