<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>ADMIRING: Adversarial Multi-network Mining</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>11/01/2019</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10159293</idno>
					<idno type="doi">10.1109/ICDM.2019.00201</idno>
					<title level='j'>ICDM</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Qinghai Zhou</author><author>Liangyue Li</author><author>Nan Cao</author><author>Lei Ying</author><author>Hanghang Tong</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Multi-sourced networks naturally appear in manyapplication domains, ranging from bioinformatics, social networks,neuroscience to management. Although state-of-the-artoffers rich models and algorithms to find various patterns wheninput networks are given, it has largely remained nascent on howvulnerable the mining results are due to the adversarial attacks.In this paper, we address the problem of attacking multi-networkmining through the way of deliberately perturbing the networksto alter the mining results. The key idea of the proposed method(ADMIRING) is effective influence functions on the Sylvesterequation defined over the input networks, which plays a centraland unifying role in various multi-network mining tasks. Theproposed algorithms bear two main advantages, including (1)effectiveness, being able to accurately quantify the rate of changeof the mining results in response to attacks; and (2) generality,being applicable to a variety of multi-network mining tasks( e.g., graph kernel, network alignment, cross-network nodesimilarity) with different attacking strategies (e.g., edge/noderemoval, attribute alteration).]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Multi-sourced networks naturally appear in many highimpact application domains, ranging from bioinformatics, social networks, neuroscience to management. For example, for protein function prediction, a classic method is to assign similarity to protein pairs by applying graph kernel over multiple protein networks <ref type="bibr">[1]</ref>. For team management, <ref type="bibr">[2]</ref> proposes to replace the unavailable individual in the team by recommending the best candidate who maximizes the similarity of the team networks before and after the replacement (i.e., teamcontext aware similarity). For financial fraud detection, <ref type="bibr">[3]</ref> resorts to interactive subgraph matching to identify complex fraud schema, e.g., syntheticIDs, money laundry, etc.</p><p>To date, many sophisticated multi-network mining models and algorithms have been proposed (See Section V for a review). Although these methods are quite effective in identifying various patterns when the input networks are given, less is known on how the mining results would be affected by the perturbation of the underlying networks, due to either random noise (i.e., sensitivity analysis) or malicious attacks (i.e., adversarial learning). For example, although graph kernel is effective in predicting the function (i.e., labels) of a protein, it is not clear how sensitive the prediction result is due to the measurement error for certain molecule-molecule interactions (i.e., edge error).</p><p>In this paper, we address the problem of attacking multinetwork mining to alter its results, which we formulate as an optimization problem. We propose a family of algorithms 3 Remove edge <ref type="bibr">(2,</ref><ref type="bibr">4)</ref> Fig. <ref type="figure">1</ref>: Attacking graph kernel: given three networks G1, G2 and G3, G1 is much more similar to G2. By removing edge <ref type="bibr">(2,</ref><ref type="bibr">4)</ref>, the new network G 1 becomes more similar to G3 than it is to G2.</p><p>The key idea behind our method is to quantitatively characterize how the mining result will change if we deliberately perturb the networks, e.g., editing the network topology by removing edges/nodes or modifying attributes. To be specific, given the central and unifying role of the Sylvester equation defined over the input networks in a variety of multinetwork mining tasks (e.g., graph kernel, network alignment, etc.) <ref type="bibr">[4]</ref>, <ref type="bibr">[5]</ref>, we measure the influence of network elements (i.e., edge, node and attribute) as the rate of change of the mining results induced by the Sylvester equation. We summarize the main contributions of this paper as follows,</p><p>&#8226; Problem Formulation. We formally define the adversarial multi-network mining problem and formulate it as an optimization problem. The key idea is to measure the influence of network elements as the rate of change of the mining results induced by the underlying Sylvester equation. &#8226; Algorithms and Analysis. We propose a family of algorithms (ADMIRING) to effectively solve the adversarial multi-network mining problem, which are applicable to a variety of multi-network mining tasks. &#8226; Empirical Evaluations. We perform extensive experimental evaluations on real-world datasets to test the efficacy of our proposed algorithm. Our evaluations demonstrate that the algorithms can significantly alter the similarity between networks, the accuracy of network classification. The rest of the paper is organized as follows. In Section II, we define the problem of adversarial multi-network mining. Section III introduces our proposed algorithms. We present experimental results in Section IV, and review related work in Section V. We conclude the paper in Section VI.</p><p>II. PROBLEM DEFINITIONS In this section, we formally define adversarial multi-network mining problem, after we introduce notations and preliminaries on multi-network mining as well as influence function.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Notations</head><p>Table <ref type="table">I</ref> summarizes the main symbols and notations used in this paper. We use bold uppercase letters for matrices (e.g., A), bold lowercase letters for vectors (e.g., q) and lowercase letters for scalars (e.g., c). For matrix indexing, we use A (i, j) to represent the entry at the i th row and the j th column of matrix A, A (i, :) to denote the i th row of A and A (:, j) to denote the j th column of A.</p><p>In this paper, we focus on a pair of node attributed networks, represented as G 1 = {A 1 , N 1 } and G 2 = {A 2 , N 2 }, where A t (t= 1, 2) represents the adjacency matrices of the input networks. N j t = diag(N t (:, j))(t = 1, 2 and j = 1, . . . , d) represents the strength of all nodes having the j th attribute in network G t . The uppercase bold letter N &#215; is the combined node attribute matrix of the two networks</p><p>For simplicity, we assume the input networks are (a) unweighted, (b) undirected and (c) of the same size. The generalization of the proposed method to weighted and/or directed networks of different sizes is straightforward.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Preliminaries</head><p>We briefly review (1) Sylvester equation for multi-network mining tasks, and (2) influence function for machine learning. 1) -Sylvester equation for multi-network mining. A unifying cornerstone behind many multi-network mining tasks can be attributed to the Sylvester equation defined over the input networks. In detail, given two node-attributed networks G 1 and G 2 , we have the following generalized Sylvester equation</p><p>where c is a regularization parameter,</p><p>1 A 1 , and B &#8712; R n&#215;n encodes the prior knowledge of the mining tasks. For instance, in network alignment <ref type="bibr">[5]</ref>, B is the preference matrix to encode anchor links; and in random walk graph kernel <ref type="bibr">[6]</ref>, B represents the initial probability distribution of the random walks on the direct product matrix. In Eq. (1), X &#8712; R n&#215;n is the solution matrix. A numerical solution of the Sylvester equation usually costs at least O(n 3 ) in time complexity <ref type="bibr">[7]</ref>, and some recent works <ref type="bibr">[8]</ref>, <ref type="bibr">[9]</ref> are able to reduce the time complexity to be linear w.r.t. the input network size. By the Kronecker product properties, we have the following equivalent linear equation of Eq. ( <ref type="formula">1</ref>),</p><p>where x = vec(X), b = vec(B) and</p><p>Remarks. For clarity of the description of the proposed algorithms and analysis, we will mainly focus on a pair of nodeattributed networks. Nonetheless, the proposed algorithms can be naturally generalized to handle other scenarios. For example, both Eq. (1) and Eq. ( <ref type="formula">2</ref>) can be generalized to handle edge attributes as well <ref type="bibr">[5]</ref>, <ref type="bibr">[9]</ref>. If the node and edge attribute information is absent, Eq. ( <ref type="formula">1</ref>) degenerates to X = cA 2 XA 1 +B. If there are multiple (more than two) input networks, we can organize them into a combined network by matrix direct sum G = {A, N}, where A and N are block diagonal matrices and each diagonal block represents one input network. Then Eq. ( <ref type="formula">1</ref>) is defined w.r.t. the combined network</p><p>It turns out the solution matrix X and its vectorization x encodes rich information of the input networks, and therefore have been used for a variety of mining tasks. For example, the random walk based graph kernel <ref type="bibr">[6]</ref> is essentially a summation of all the entries of X linearly weighted by the stopping probability distribution q &#215; of random walks on the direct product matrix; the solution matrix X indicates the soft nodealignment between the input networks and the specific entries indicates the similarity or proximity between two nodes across networks (i.e., cross-network node proximity) <ref type="bibr">[5]</ref>; the solution matrix X defined over a query network and a data network can be further fed into a goodness function <ref type="bibr">[3]</ref> for subgraph matching. Conceptually, we can represent all the above multinetwork mining results induced by the solution matrix X by a function f . For example, f (X) = q &#215; vec(X) for random walk graph kernel <ref type="bibr">[4]</ref>; f (X) = X for (soft) network alignment <ref type="bibr">[5]</ref>. Table <ref type="table">II</ref> presents a summary for different choices of f (&#8226;) for multi-network minignt tasks.</p><p>2) -Influence function for machine learning. Influence function is a powerful analytical tool from robust statistics to evaluate the dependence of the estimator on the value of the data points <ref type="bibr">[10]</ref>. The seminal work by Pang et al. <ref type="bibr">[11]</ref> proposes to leverage influence function to assess the effect of each training example on the performance of the machine learning system, as a key step towards explainable machine learning. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Problem Definition</head><p>Generally speaking, adversarial learning aims to maximally alter the learning results by manipulating a small number of the input data points. In the case of multi-network mining, it translates to a small number of network elements (e.g., edges/nodes/attributes). Given the unifying role of the solution matrix X of Eq. ( <ref type="formula">1</ref>), we formally define the adversarial multinetwork mining problem as follows, Problem 1. Adversarial Multi-network Mining Given: (1) two input attributed networks G 1 and G 2 , (2) the vectorization of the solution matrix X of Eq. ( <ref type="formula">1</ref>), (3) a function f (X) in Table <ref type="table">II</ref> underlying the corresponding mining task, (4) an integer budget k, and (5) the specific network element type (i.e., edge vs. node vs. attribute); Find: a set of k most influential network elements of the specified type so that f (X) will change most if we attack (e.g., remove or alter) those elements.</p><p>Multi-network Mining Tasks Function f (&#8226;)</p><p>Cross-network node similarity <ref type="bibr">[5]</ref> f (X) = X(s, t) </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. ALGORITHMS AND ANALYSIS</head><p>In this section, we first formally formulate Problem 1 from the optimization perspective. Next, we derive the influence functions with respect to various network elements (e.g., edges, nodes and attributes) for multi-network mining tasks. Based on that, we propose effective and efficient algorithms which leverage such influence functions to attack multinetwork mining results, together with some analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. ADMIRING Formulation</head><p>The intuition behind adversarial multi-network mining is to find a set of key network elements (e.g., edges, nodes, attributes) whose perturbation (e.g., removal, alteration) would cause the largest change of the function f (&#8226;) underlying a given mining task. For example, for random walk graph kernel, the goal of an adversarial attack is to significantly change the similarity (i.e., graph kernel) between two input networks by deliberately perturbing a small set of influential network elements. To be specific, let X be the original solution of Eq. ( <ref type="formula">1</ref>) for the input networks G 1 , G 2 and X P be the new solution matrix for networks G 1P and G 2 after we perturb the network elements in set P. The corresponding mining results are f (X) and f (X P ), respectively. We formally formulate Problem 1 as the following optimization problem, argmax</p><p>Note that for network alignment, the squared loss in the above formulation will be replaced by the L 2 norm if f (X) = vec(X) or the Frobenius norm if f (X) = X. In order to solve the above optimization problem, there are two crucial questions that need to be answered: (Q1) how to quantitatively evaluate the influence of a specific network element w.r.t. the function f (&#8226;) over the solution X of the Sylvester equation; and (Q2) how to leverage the influence function to identify a set of network elements to attack various multi-network mining tasks. In the next two subsections, we present our solutions to Q1 and Q2 according to the specific type of network elements (i.e., edges, nodes and attributes), respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Network Element Influence</head><p>In order to achieve effective attacks to multi-network mining tasks, it is desirable that the change to the network structure would significantly impact the mining results (i.e., f (X) in Table <ref type="table">II</ref>). For simplicity and clarity, we only consider three types of attacks, including edge removal, node removal and node attribute alteration and we assume the attack always happens on the first network G 1 . In order to quantify how f (X) changes (i.e., &#8710;f in Eq. ( <ref type="formula">3</ref>)) if we perturb a specific network element, we propose to use the influence function w.r.t. the corresponding multi-network mining tasks. Definition 1. Edge Influence in Multi-network Mining. For a given multi-network mining task f (X), the influence of a specific edge (e.g., A 1 (i, j) in G 1 ) w.r.t. the mining result is defined as the derivative of f (X) w.r.t. this edge. Formally, the edge influence is defined as</p><p>&#8706;A1(i,j) . Definition 2. Node Influence in Multi-network Mining. The node influence is defined as the summation of influences of the incident edges, i.e., I (N 1 (i)) = j|A1(i,j)=1 I (A 1 (i, j)) Definition 3. Node Attribute Influence in Multi-network Mining. For node-attributed networks, the influence of the l th attribute of node i (i.e., N l 1 (i, i)) in network G 1 is defined as the derivate of f (X) w.r.t. this specific node attribute, i.e.,</p><p>&#8706;N l 1 (i,i) . Next, we present details on how to compute the influence of network elements (e.g., edges, nodes and attributes) w.r.t. the mining results in the various tasks (e.g., random walk graph kernel, network alignment and cross-network node similarity in Table <ref type="table">II</ref>). For clarity, we will mainly use random walk graph kernel as an example to illustrate the mathematical details to compute different network element influences. 1) Edge influence. We first give Lemma 1 to compute the edge influence for random walk graph kernel. Lemma 1. (Edge Influence for Random Walk Graph Kernel.) Given random walk graph kernel between two input networks: f (X) = q &#215; (I -cN &#215; A &#215; )</p><p>-1 N &#215; p &#215; , the influence of a specific edge (e.g., A 1 (i, j) in G 1 ) w.r.t. random walk graph kernel can be calculated as follows,</p><p>where</p><p>, S i,j is a single-entry matrix of the same size as A 1 , with 1 at the (i, j) th position and 0 elsewhere. Proof. According to <ref type="bibr">[4]</ref>, the random walk graph kernel for node-attributed networks is,</p><p>where vec(X)</p><p>Following Definition 1, we take the partial derivative of f (X) w.r.t. a specific edge (e.g., A 1 (i, j) in network G 1 ),</p><p>According to the first row of Table <ref type="table">II</ref>, we have that &#8706;f (X) &#8706;x = q &#215; . For the second partial derivative (i.e., &#8706;x &#8706;A1(i,j) ), by taking the derivative of Eq. ( <ref type="formula">2</ref>), we have that</p><p>By the property of the matrix derivative [12, <ref type="bibr">Page 8]</ref>, we further have that</p><p>Recall the closed-form solution x = (I -</p><p>Putting everything together, we obtain the solution for calculating the influence of edge A 1 (i, j) w.r.t. random walk graph kernel as follows,</p><p>which completes the proof.</p><p>2) Node influence. Based on the edge influence (Eq. ( <ref type="formula">4</ref>)), it is straight-forward to compute the node influence, which is summarized in the following proposition. Proposition 1. (Node Influence in Random Walk Graph Kernel.) Given random walk graph kernel between two input networks: f (X) = q &#215; (I -cN &#215; A &#215; ) -1 N &#215; p &#215; , the influence of a specific node (e.g., N 1 (i) in G 1 ) w.r.t. random walk graph kernel can be calculated as,</p><p>Proof. It directly follows Lemma 1. Omitted for brevity.</p><p>3) Node attribute influence. Finally, we give Lemma 2 to compute the node attribute influence. Lemma 2. (Node Attribute Influence for Random Walk Graph Kernel.) Given random walk graph kernel between two input networks: f (X) = q &#215; (I -cN &#215; A &#215; )</p><p>-1 N &#215; p &#215; , the influence of a specific node attribute (e.g., the l th dimension of the attribute vector of node N 1 (i) in G 1 , i.e., N l 1 (i, i)) w.r.t. random walk graph kernel can be calculated as follows,</p><p>-1 , S i,i is a single-entry matrix of the same size as A 1 , with 1 at the (i, i) th position and 0 elsewhere. N l 2 represents the strength of all the nodes having the l th attribute from the network G 2 . Proof. Omitted for brevity. C. Proposed Algorithms 1) A generic algorithm for adversarial multi-network mining. Based on Lemma 1, 2 and proposition 1, we propose Algorithm 1 to identify the most influential network elements (i.e., edges, nodes, attributes) for random walk graph kernel. Note that Algorithm 1 provides a family of attacking algorithms based on the specific network element. We use different suffix to differentiate different attacking scenarios, i.e., ADMIRING-E, ADMIRING-N, ADMIRING-A for attacking edges, nodes and node attributes respectively.</p><p>The key idea of the proposed ADMIRING algorithm is that we iteratively attack one network element with the highest influence value <ref type="bibr">9,</ref><ref type="bibr">16)</ref> Calculate influence for all edges using Eq. ( <ref type="formula">4</ref>)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5:</head><p>Add edge (i, j) = argmax (i,j) I (A 1 (i, j)) to P;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>6:</head><p>Remove edge (i, j) and (j, i) from G 1 ; 7:</p><p>else if element type is node then</p><p>Compute influence for all nodes in G 1 by Eq. ( <ref type="formula">9</ref>); Set the value of damping factor &#945; &#8712; (0, 1);</p><p>for node i in G 1 do 14:</p><p>Compute influence of each attribute by Eq. ( <ref type="formula">10</ref>); Add node attribute argmax</p><p>Reduce the selected attribute by a ratio &#945; in G 1 ;</p><p>18:</p><p>return Error 20:</p><p>end if 21: end while 22: return P and G 1 P . network (Step-6, 10, 17) and recompute the influence functions for the remaining network elements. In Algorithm 1, the two column vectors p &#215; and q &#215; are set as uniform unit vectors, i.e., p &#215; = q &#215; = 1 &#215; 1 n 2 , and c is chosen as a small positive number to ensure convergence of the Sylvester Eq. ( <ref type="formula">1</ref>) with fixed-point methods (e.g., c = 1/(max(eigenvalue(N 1 A 1 )) &#215; max(eigenvalue(N 2 A 2 )) + 1)).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. EXPERIMENTAL EVALUATION</head><p>In this section, we conduct experiments to evaluate the proposed algorithms in terms of the following question: 1) Datasets. We use three real-world datasets which are publicly available. Table <ref type="table">III</ref> summarizes the statistics of the datasets. Detailed descriptions of these datasets are as follows.</p><p>&#8226; MUTAGENICITY is a dataset of 4,337 networks of molecular structures and it is divided into two classes mutagen (2,401 networks) and nonmutagen (1,936 networks) according to whether they have a property of mutagenicity <ref type="bibr">[13]</ref>.</p><p>&#8226; PROTEINS is a dataset of 1,113 protein structures <ref type="bibr">[1]</ref>. Each protein is represented by a network. The task is to classify the protein structures into enzymes vs. non-enzymes. &#8226; IMDB-BINARY is a movie collaboration dataset which collects cast and genre information of two types of movies, romance and action on IMDB. For each network, nodes represent actors/actresses and there is an edge between them if they appear in the same movie, and the task is to predict the genre of the movie the network represents <ref type="bibr">[14]</ref>. 2) Evaluation Metrics. We quantify the effectiveness of the proposed algorithms by measuring the following two aspects, including (1) the relative change of f (&#8226;) (i.e., &#8710;f f ), and (2) the accuracy of a multi-network mining task (e.g., network classification) before and after performing adversarial attacks.</p><p>We perform the evaluations on random walk graph kernel, (1) on each selected dataset, we randomly selected 100 pairs of networks of the same class, then for every pair of networks, we apply ADMIRING or comparison methods to attack one of them, and re-compute the graph kernel to compare the relative change; (2) we also trained a SVM classifier with graph kernel for each of the three datasets; for every network (i.e., G 1 ) in the testing set, we attacked it with ADMIRING or comparison methods to reduce its similarity with one random training network (i.e., G 2 ) . The new classification result is from applying the trained SVM classifier on the attacked test networks. We report the average classification accuracy by repeating the above attacking strategy for 10 times.</p><p>3) Comparison Methods. We evaluate the proposed ADMIR-ING method with different attacking strategies for attacking edges, nodes and node attributes respectively. We also evaluate a batch-mode variant of ADMIRING. That is, in Algorithm 1, instead of selecting one network element at each iteration, we select the top-k elements with the highest influence scores in one iteration. We use a suffix 'v' to denote such a variant. For comparison, we use the following methods to select the topk influential network elements, including (1) Random, which randomly select k network elements in the first network to attack; (2) Bruteforce, which re-computes f after attacking each element and selects the one with the highest &#8710;f in each iteration; (3) Bruteforce-v, which uses Bruteforce in the batch mode; (4) PageRank, which measures the importance of nodes in a given network <ref type="bibr">[15]</ref> and here we leverage the PageRank ranking results to select top-k influential network elements. In G 1 , we define the PageRank value of an edge A 1 (i, j) as, v(A 1 (i, j)) = (r(i) + r(j)) &#215; max </p><p>Recall that Q is an n 2 &#215; n 2 matrix and is closely correlated to the solution of the multi-network mining problems as shown in Table <ref type="table">II</ref>. We can use a block-matrix representation (i.e., Q = [W ij ] i,j=1,...,n , where W ij is at the (i, j) th position of Q with size n &#215; n). In edge attack, the aggregation of the entries in block matrix W ij can be considered as the contribution of the edge A 1 (i, j) to the mining result (i.e., f (X)). We can compute the influence of edge A 1 (i, j) by the aggregation of the entries in W ij and select the top-k edges.</p><p>In node attack, we calculate the influence of node i in G 1 by summing up all blocks W ij | A1(i,j)=1 and select the top-k nodes. In node attribute attack, we compute the influence of N l 1 (i, i) as n j=1 W ij (l, :) and select the top-k attributes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Effectiveness Results</head><p>We compare the proposed ADMIRING with comparison methods in the task of attacking random walk graph kernel with different types of network elements, i.e., edges, nodes and attributes. The results of &#8710;f f are summarized in Figures <ref type="figure">2,</ref><ref type="figure">3</ref>, and 4, respectively. We can see that the proposed ADMIRING and its batch-mode variant ADMIRINGv (two red bars) achieve a very similar performance as their bruteforce counterparts (two yellow bars). As we will show in the next subsection, the bruteforce methods are much more expensive in terms of computation. In the meanwhile, the proposed methods (two red bars) consistently outperform all the remaining methods by a large margin in the three attacking scenarios across all datasets. For example, on IMDB-BINARY dataset, the proposed method ADMIRING-E is 4.31 times better than the best competitor method (i.e., ADMIRING-E vs. Q-Matrix) by attacking edges; on MUTAGENICITY dataset, by attacking nodes, the proposed ADMIRING-N is 269% better than Q-Matrix, and for node attribute attack, our proposed method ADMIRING-A is 2.36 times better than Q-Matrix on PROTEINS dataset. For the reported results in Figures <ref type="figure">2, 3</ref> and<ref type="figure">4</ref>, the budget k is set to be 10, and the damping factor &#945; is 0.2 for result in Figure <ref type="figure">4</ref>.</p><p>Second, we evaluate and compare the impact of different attacking methods on the network classification results, summarized in Figure <ref type="figure">5</ref>. We can see both the ADMIRING method and the bruteforce method can significantly impact the classification results. For example, on PROTEINS dataset, our proposed method can reduce the classification accuracy by 8.82%, 10.52%, and 6.98% by attacking edges, nodes and attributes, respectively. On the other hand, the attacking effect by other methods is quite marginal. For example, the best competitor method (Q-Matrix), can only achieve a reduction of 3.35% in classification accuracy by attacking nodes.</p><p>V.   come or reveal patterns . Graph kernel is a family of methods that measure the similarity between two input networks based on walks <ref type="bibr">[4]</ref>, <ref type="bibr">[6]</ref> or limited-sized subgraphs <ref type="bibr">[16]</ref>. Network alignment aims to identify node correspondence across different networks. Some recent work suggests that the alignment can be enhanced by augmenting the topology consistency with the node and edge attributes <ref type="bibr">[5]</ref>, <ref type="bibr">[8]</ref>. Multiple networks are manifested as a network of networks, where each node of the main network itself is another domain network <ref type="bibr">[17]</ref>, <ref type="bibr">[18]</ref>. The main network can contextualize the mining tasks in each domain-specific network by providing the consistency constraints across networks for both ranking <ref type="bibr">[17]</ref> and clustering <ref type="bibr">[19]</ref>. With the increased complexity of the networked system, some efforts have been towards explainable network mining <ref type="bibr">[20]</ref>- <ref type="bibr">[22]</ref>. Adversarial Learning studies the learning process in the adversarial setting, where an adversary can attack the learning model <ref type="bibr">[23]</ref>. In white-box evasion attack, the adversary has the full knowledge of the learning model (i.e., parameters and hyperparameters). The adversarial example is crafted by perturbing the most salient features. More recently, adversarial attacks on network mining algorithms start to receive attentions, where perturbations to the network attributes and structure can be constructed to mislead the graph convolution models <ref type="bibr">[24]</ref>, <ref type="bibr">[25]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. CONCLUSION</head><p>In this paper, we study the problem of adversarial multinetwork mining and formulate it as an optimization problem. The key idea is to effectively characterize the change of mining results w.r.t. the perturbations to the network. We propose a family of algorithms (ADMIRING) that are able to measure the influence of network elements to the mining results. The empirical evaluations on real-world datasets demonstrate the efficacy of the proposed algorithms. Future work includes generalizing the current method beyond Sylvester equation based solution in the black-box model setting as well as designing effective defensive strategies.</p></div></body>
		</text>
</TEI>
