<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>TenGAN: adversarially generating multiplex tensor graphs</title></titleStmt>
			<publicationStmt>
				<publisher>Springer</publisher>
				<date>01/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10509445</idno>
					<idno type="doi">10.1007/s10618-023-00947-3</idno>
					<title level='j'>Data Mining and Knowledge Discovery</title>
<idno>1384-5810</idno>
<biblScope unit="volume">38</biblScope>
<biblScope unit="issue">1</biblScope>					

					<author>William Shiao</author><author>Benjamin A Miller</author><author>Kevin Chan</author><author>Paul Yu</author><author>Tina Eliassi-Rad</author><author>Evangelos E Papalexakis</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>In this work, we explore multiplex graph (networks with different types of edges) generation with deep generative models. We discuss some of the challenges associated with multiplex graph generation that make it a more difficult problem than traditional graph generation. We propose T<sc>en</sc>GAN, the first neural network for multiplex graph generation, which greatly reduces the number of parameters required for multiplex graph generation. We also propose 3 different criteria for evaluating the quality of generated graphs: a graph-attribute-based, a classifier-based, and a tensor-based method. We evaluate its performance on 4 datasets and show that it generally performs better than other existing statistical multiplex graph generative models. We also adapt HGEN, an existing deep generative model for heterogeneous information networks, to work for multiplex graphs and show that our method generally performs better.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Graphs are used to represent many different types of data, from protein interactions <ref type="bibr">(Pavlopoulos et al. 2011)</ref> to social networks <ref type="bibr">(Newman et al. 2002)</ref>. There are a similarly large number of useful graph-related tasks, like link prediction and node classification. One of these tasks is graph generation, which is the focus of this work. Graph generation models can generally be split into two types: statistical attribute-based generative models and deep generative models. Statistical generative models like the <ref type="bibr">Erd&#246;s and R&#233;nyi (1959)</ref>, <ref type="bibr">Barab&#225;si and Albert (1999)</ref>, and stochastic block models <ref type="bibr">(Holland et al. 1983)</ref> have explicitly defined parameters like the attachment rate or the number of communities.</p><p>Responsible editor: Charalampos Tsourakakis.</p><p>In contrast, many deep generative models can learn directly from one or more input graphs. This ability allows them to mimic attributes of the input dataset without defining them explicitly. The majority of these models are based on either RNNs (Recurrent Neural Networks), GANs (Generative Adversarial Networks) <ref type="bibr">(Goodfellow et al. 2020)</ref>, or VAEs (Variational Autoencoders) <ref type="bibr">(Kingma and Welling 2013)</ref>. Examples of these include GraphRNN <ref type="bibr">(You et al. 2018)</ref>, NetGAN <ref type="bibr">(Bojchevski et al. 2018)</ref>, GraphVAE <ref type="bibr">(Kipf and Welling 2016)</ref>, and LGGAN <ref type="bibr">(Fan and Huang 2019)</ref>.</p><p>However, sometimes a simple graph structure may not be sufficient to represent a dataset accurately. One example of this is the Enron dataset <ref type="bibr">(Klimt and Yang 2004)</ref>, where each node is a person and each edge is an email between them. 
Representing this as a standard graph would only show that two people communicated with each other, without any information on when they communicated. For example, the two people may have exchanged an email once, daily, or once a month, each of which would indicate a very different relationship. Representing this data as a multiplex graph allows us to fully represent this information.</p><p>Another use-case for a multiplex graph is to capture different types of social media interactions across the same users. Each node would represent a person and each edge type could represent a different form of interaction. An example of this could be a graph representing Twitter interactions, where each edge could represent a retweet, a mention, or a follow. Using only one of these would unnecessarily limit the information captured in the graph.</p><p>Traditional graph generation models and multiplex graph generation models are useful in many of the same ways. For example, they can be used to anonymize private data <ref type="bibr">(Wang and Wu 2013)</ref> in order to enhance the reproducibility of models trained on private datasets. This would allow the user to preserve interesting structures in the graph without leaking private user information.</p><p>However, all of the current multiplex network generation models are statistical generative models. BINBALL <ref type="bibr">(Basu et al. 2015)</ref> adapts ideas from BA and ER models and proposes new multiplex preferential attachment rules. StarGen <ref type="bibr">(F&#252;genschuh et al. 2018)</ref> further builds upon BINBALL by separating the parameters controlling the global and local degree of nodes, increasing the diversity of individual layers. ANGEL <ref type="bibr">(F&#252;genschuh et al. 2020)</ref> uses a hub-and-spoke-based model to generate multiplex graphs, allowing it to better mimic certain structures. These existing works have some major limitations, namely:</p><p>1. Explicit parameterization. 
The existing models declare a set of parameters that affect the output of the graph. These parameters are inflexible and may lead to overfitting to one graph attribute while neglecting another. 2. Limited datasets. All three of the methods focus on Airline Transportation Networks (ATNs), specifically on the EU airline dataset <ref type="bibr">(Cardillo et al. 2013)</ref>. It is difficult to determine whether the methods will work for different datasets. In this work, we explore the task of multiplex graph generation on datasets from widely varying domains. 3. Limited evaluation criteria. It is difficult to quickly compare the performance of the models on different datasets. The performance evaluation of the models is primarily done visually by comparing the distributions of various topological properties. This also ignores some other potentially interesting characteristics of the generated graphs, like whether or not a classifier can distinguish between real and generated samples, which is especially useful for tasks like graph anonymization.</p><p>The first two problems can be resolved by using a neural network to learn directly from a set of input graphs. However, extending a traditional graph generation network is not straightforward, since the layers are often correlated and many more parameters are required. Furthermore, it is difficult to evaluate the quality of the generated graphs. We further investigate these issues in Sect. 3 below.</p><p>In this work, we tackle these issues and propose TENGAN, a tensor-based GAN, to generate multi-view graphs. With minor modifications, our approach readily generalizes to other data sources that can be modelled well with tensor decompositions. For example, tensor decompositions have been shown to work well on a variety of data, including fMRI <ref type="bibr">(Noroozi and Rezghi 2020)</ref> and EEG data <ref type="bibr">(Cong et al. 
2015)</ref>, NBA game data <ref type="bibr">(Papalexakis and Pelechrinis 2018)</ref>, network traffic data <ref type="bibr">(Baskaran et al. 2019)</ref>, and spatio-temporal urban computing data <ref type="bibr">(Wang et al. 2014)</ref>. However, in this work, we focus on the domain of multiplex graphs and reserve exploring other domains for future work.</p><p>Our contributions include:</p><p>&#8226; Novel method: We propose a novel GAN-based method to generate multiplex graphs that uses tensor decomposition to reduce the number of parameters required.</p><p>&#8226; Evaluation criteria: We propose 3 different evaluation metrics for multiplex graph generation and evaluate their effectiveness.</p><p>&#8226; Thorough experimentation: We conduct thorough experiments on 4 different datasets across 2 different models to evaluate the performance of our method. We also modify an existing method for heterogeneous graphs to work with multiplex graphs and compare our method against it.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Background</head><p>We first enumerate important background information and notation for this work.</p><p>We briefly define and describe multiplex graphs, tensors, tensor decompositions, and generative adversarial networks. A table of symbols can be found in Table <ref type="table">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Multiplex graphs</head><p>In this work, we focus on the generation of multiplex graphs. A multiplex graph consists of several views, where each view is a graph. Each of the views contains the same nodes, but with different edges. Formally, a multiplex graph G with k views can be written as G = (V, E_1, ..., E_k), where V is the shared vertex set and E_i is the edge set of the i-th view. It is worth noting that not all of the nodes need to be connected within each view. One example of a multiplex graph is a social network, where each edge in a given view represents a different mode of communication. A time-evolving graph with a constant number of nodes could also be represented as a multiplex graph, with each view representing a different timestamp. Finally, a knowledge base could be seen as a multiplex graph, where each node represents an entity and each view represents a different relation. It is worth noting that this differs from the traditional notion of a multigraph, where nodes can have multiple edges between them but each edge is of the same type. Edges in a multiplex graph can have multiple types.</p></div>
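The view/tensor correspondence described in this section can be sketched in a few lines of plain Python. This is a toy illustration (the node count and edges are invented, not from the paper's datasets):

```python
# Build a k-view multiplex graph over a shared vertex set and materialize it
# as a dense third-order adjacency tensor with T[i][j][view] = 1 iff the
# directed edge (i, j) exists in that view.

def to_adjacency_tensor(n, views):
    """views: list of k edge lists over nodes 0..n-1."""
    k = len(views)
    T = [[[0] * k for _ in range(n)] for _ in range(n)]
    for view_idx, edges in enumerate(views):
        for u, v in edges:
            T[u][v][view_idx] = 1
    return T

# Example: 3 nodes, 2 views (say, "mentions" and "retweets").
views = [[(0, 1), (1, 2)],   # view 0
         [(0, 2)]]           # view 1
T = to_adjacency_tensor(3, views)
```

Each frontal slice `T[:][:][view]` is then exactly the adjacency matrix of one view.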
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Tensors &amp; decompositions</head><p>Networked data (e.g., a social network) can be represented in many different formats. One of the most common formats is to use an adjacency matrix. In a similar manner, a multiplex graph can be viewed as a third-order tensor where T_{i,j,k} = w if there is an edge of weight w (or 1 in the case of an unweighted graph) between node i and node j in view k. A visual example of this is shown below in Fig. <ref type="figure">1</ref>. One advantage of storing multiplex graphs in this tensor format is that it allows us to easily apply tensor decomposition methods. One of the most common tensor decomposition methods is the CANDECOMP/PARAFAC or Canonical Polyadic Decomposition (CPD) <ref type="bibr">(Hitchcock 1927;</ref><ref type="bibr">Kolda and Bader 2009)</ref>. Given an integer r, the CPD decomposes a tensor T into the sum of r outer products of vectors. While the CPD can be applied to a tensor of any order, we focus on the third-order case in this work. Thus, the CPD can be written as T &#8776; [[A, B, C]] = &#931;_{i=1}^{r} a_i &#8728; b_i &#8728; c_i, where a_i, b_i, c_i are the factor vectors. It is convention to write the factors as matrices A, B, C, which consist of the corresponding vectors horizontally stacked. For example, the i-th column of A would be a_i.</p><table><head>Table 1 Table of symbols and their descriptions</head><row role="label"><cell>Notation</cell><cell>Meaning</cell></row><row><cell>X, X, x, x</cell><cell>Tensor, matrix, vector, scalar</cell></row><row><cell>[[A, B, C]]</cell><cell>The tensor constructed from the factor matrices A, B, C</cell></row><row><cell>&#8214;X&#8214;_F</cell><cell>Frobenius norm</cell></row><row><cell>g^(i)</cell><cell>The i-th view of multiplex graph g</cell></row><row><cell>T_{i,j,k}</cell><cell>The value in the i-th, j-th, and k-th entry along the 1st, 2nd, and 3rd mode of the tensor T</cell></row><row><cell>T_{:,:,k}</cell><cell>The k-th frontal slice of a tensor T; that is, T_{i,j,k} &#8704;i &#8704;j</cell></row><row><cell>NEIGH(u)</cell><cell>The set of neighbors of the node u</cell></row></table><p>Another tensor decomposition method that has been shown to work well in the domain of knowledge bases and other multiplex graphs is RESCAL <ref type="bibr">(Nickel et al. 2011)</ref>. Given an I &#215; I &#215; K tensor T (adjacency tensors have square slices), RESCAL factors each frontal slice as T_{:,:,k} &#8776; A R_k A^T, where A is an I &#215; r factor matrix shared across all slices and each R_k is an r &#215; r matrix that models the interactions in the k-th slice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Graph convolutional networks (GCNs)</head><p><ref type="bibr">Kipf and Welling (2017)</ref> propose graph convolutional networks (GCNs), which convolve the features of a given node with the features of its neighbors. Each additional GCN layer incorporates information from another hop in the graph. Formally, each layer of the GCN can be written as h^(k)_u = &#963;(W^(k) &#931;_{v &#8712; NEIGH(u) &#8746; {u}} h^(k-1)_v / c_{u,v}), where h^(k)_u is the representation of u at the k-th layer, W^(k) are the weights of the k-th layer, &#963; is a nonlinearity, c_{u,v} is a normalization constant, and h^(0)_u are the node features of u. Note that we focus on featureless multiplex graphs in this work. As such, we set H^(0) = I, essentially setting the node features of each node equal to its one-hot encoding.</p></div>
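The [[A, B, C]] construction from Sect. 2.2 can be made concrete: reconstructing a tensor from CPD factor matrices is just a sum of r rank-1 outer products. A naive pure-Python sketch (a real implementation would use a tensor library such as Tensorly, as the paper does):

```python
# Reconstruct T ~= [[A, B, C]], i.e. T[i][j][k] = sum_p A[i][p] * B[j][p] * C[k][p],
# where the p-th columns of A, B, C are the rank-1 factor vectors a_p, b_p, c_p.

def cp_reconstruct(A, B, C):
    """A: I x r, B: J x r, C: K x r (nested lists); returns an I x J x K tensor."""
    I, r = len(A), len(A[0])
    J, K = len(B), len(C)
    T = [[[0.0] * K for _ in range(J)] for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                T[i][j][k] = sum(A[i][p] * B[j][p] * C[k][p] for p in range(r))
    return T

# Rank-1 example: every entry is a[i] * b[j] * c[k].
A = [[1.0], [2.0]]
B = [[3.0], [4.0]]
C = [[5.0]]
T = cp_reconstruct(A, B, C)  # T[1][1][0] = 2 * 4 * 5 = 40
```

This triple loop is O(IJKr); library implementations vectorize the same computation.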
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Generative adversarial networks (GANs)</head><p>In this work, we use a GAN <ref type="bibr">(Goodfellow et al. 2020)</ref> architecture which, in its general form, includes a generator network that generates "fake" data and a discriminator network that distinguishes fake data from real data. The two networks are trained in unison and are engaged in a game of outperforming each other, with the end goal being that the generator essentially learns the distribution of the real data. One challenge of this approach is that we need a distribution of inputs to model, rather than just a single sample. We discuss how we address this in Sect. 4.1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Problem formulation</head><p>We consider the following problem: given a source multiplex graph and a set of criteria C, generate multiplex graphs that are similar to the source graph under C. Some examples of criteria in C may include graph attributes (like clustering coefficient and degree distribution) or the correlation between different slices. However, it is not fully clear what these criteria should be; this is one of several challenges in multiplex graph generation, some of which are listed below:</p><p>Challenge 1: Inadequate Evaluation Criteria. Before we can decide on an appropriate model, we must determine our evaluation criteria. This is challenging because we need to consider not only the graph attributes of each view but also the relationships between the views. This means that many of the evaluation metrics commonly used in graph generation are insufficient for multiplex graph generation. In this work, we propose 3 methods of evaluation for multiplex graph generation models, described in Sect. 4.4 below.</p><p>Challenge 2: Sampling from Multiplex Graphs. Many existing multiplex datasets consist of a single graph with multiple views. However, since we are emulating a set of multiplex graphs, we need many smaller multiplex graphs to form a distribution. In the traditional graph setting, there are existing datasets that consist of multiple graphs (e.g., the PPA dataset <ref type="bibr">(Hu et al. 2020)</ref>). However, to the best of our knowledge, there are no such existing datasets for multiplex graphs. Therefore, we need a sampling method that samples smaller multiplex sub-graphs from a larger multiplex graph. We describe our method in Sect. 4.1 below.</p><p>Challenge 3: Large Number of Parameters. If we naively attempt to generate a multiplex graph, the number of parameters required will explode. This is because we need O(kn&#178;) parameters to generate an adjacency tensor for a multiplex graph with k views and n nodes. We propose TENGAN (described in Sect. 
4.2) that generates a compressed tensor-decomposition-based representation to solve this problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Proposed method</head><p>We propose TENGAN, a GAN-based model that first generates the factors of a tensor decomposition, then uses those to generate the adjacency tensor. We propose two variants: TENGAN-CP, which uses the CPD, and TENGAN-R, which uses the RESCAL decomposition. We first sample sub-multiplex graphs from the dataset, as described in Sect. 4.1. Then, we train our GAN on the sampled multiplex graphs. Finally, we sample random graphs from the generator (by passing in different random noise vectors) and evaluate them with the metrics described in Sect. 4.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Sampling</head><p>Many generative models require multiple input samples, rather than a single example. For example, LGGAN (Fan and Huang 2019) is trained on 2-hop and 3-hop egonets extracted from the original source graph. However, it has been shown that this can lead to biased samples that may not necessarily be representative of the original graph, especially in terms of in-degree and community structure <ref type="bibr">(Leskovec and Faloutsos 2006)</ref>. To help avoid this, we perform random-walk sampling across each view. We then use the induced subgraph on the remainder of the views.</p><p>However, one issue with many large multiplex datasets is that most of the nodes may be disconnected in any given view. In extreme examples, like in some knowledge graphs, almost all nodes will be disconnected in each view. Oftentimes, even the union of all edges across all views will still result in a disconnected graph.</p><p>Another issue is that these datasets are often too large or have too many views. For example, the NELL dataset <ref type="bibr">(Carlson et al. 2010;</ref><ref type="bibr">Smith et al. 2017)</ref> has over 2 million views. Random sampling of the dataset would produce extremely sparse (or completely empty) samples. In order to produce better quality samples, we use the sampling method of ParCube <ref type="bibr">(Papalexakis et al. 2012)</ref>. This computes an importance score for each slice along each mode. For an I &#215; J &#215; K tensor T, the importance scores for the slices along the three modes would, respectively, be a_i = &#931;_{j,k} |T_{i,j,k}|, b_j = &#931;_{i,k} |T_{i,j,k}|, and c_k = &#931;_{i,j} |T_{i,j,k}|.</p><p>We then randomly sample indices for each mode, with the probability of a given index being selected proportional to its score. For example, a given index i is selected with probability a_i / &#931;_{x=1}^{I} a_x. 
This results in a denser tensor and therefore a multiplex graph with more connected nodes, reducing the chance of getting empty (or near-empty) tensors as inputs to our model.</p></div>
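The score-proportional sampling in Sect. 4.1 can be sketched in plain Python. The marginal-sum form of the scores and the without-replacement selection loop are our assumptions for illustration; ParCube's full procedure (e.g., repeated sampling and merging) has more machinery:

```python
import random

def mode_scores(T):
    """Marginal importance scores a, b, c for a dense I x J x K tensor (nested lists)."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    a = [sum(abs(T[i][j][k]) for j in range(J) for k in range(K)) for i in range(I)]
    b = [sum(abs(T[i][j][k]) for i in range(I) for k in range(K)) for j in range(J)]
    c = [sum(abs(T[i][j][k]) for i in range(I) for j in range(J)) for k in range(K)]
    return a, b, c

def sample_indices(scores, m, rng):
    """Sample up to m distinct indices, each with probability proportional to its score."""
    idx = list(range(len(scores)))
    picked = []
    for _ in range(min(m, sum(1 for s in scores if s > 0))):
        total = sum(scores[i] for i in idx)
        r = rng.random() * total
        acc = 0.0
        for i in idx:
            acc += scores[i]
            if acc >= r:
                picked.append(i)
                idx.remove(i)
                break
    return sorted(picked)

T = [[[1, 0], [0, 0]], [[0, 0], [0, 3]]]  # 2 x 2 x 2 toy tensor
a, b, c = mode_scores(T)                   # heavier slices get larger scores
```

Sampling the same number of indices per mode and restricting T to them yields the smaller, denser sub-tensor used as a training sample.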
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Architecture</head><p>Our model is a GAN and consists of a generator network and a discriminator network. The generator consists of an MLP, followed by two or three smaller MLPs (depending on the factorization method) that generate the factor matrices/tensors. We further describe the two generator architectures below in Sects. 4.2.1 and 4.2.2. The discriminator uses the max pool of several graph convolutional networks (GCNs) <ref type="bibr">(Kipf and Welling 2017)</ref> (one per view) followed by a fully-connected layer to predict whether a sample is generated or drawn from the original real dataset. A diagram of our architecture is shown in Fig. <ref type="figure">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1">TENGAN-CP architecture</head><p>TENGAN-CP uses a shared feature extractor layer and then splits into separate networks, each of which generates a different factor in the CPD. This can be considered a higher-order extension of the BRGAN-B (Shiao and Papalexakis 2021) architecture, using the tensor CPD instead of the matrix SVD.</p><p>After generating the factors, we calculate the sum of the outer products of the vectors from our factor matrices A, B, and C to obtain the generated adjacency tensor [[A, B, C]]. As shown in Sect. 4.3, this reduces the number of parameters needed to generate a given multiplex graph. We use the loss function from Wei et al. (2018) (which is based on Arjovsky et al. (2017)): L = E_{G&#8242;&#8764;&#8473;_g}[D(G&#8242;)] &#8722; E_{G&#8764;&#8473;_r}[D(G)] + &#955;_1 GP + &#955;_2 CT, where GP is the gradient penalty term from <ref type="bibr">Gulrajani et al. (2017)</ref>, CT is the consistency term from <ref type="bibr">Wei et al. (2018)</ref>, and A, B, C are the outputs from the generator. &#8473;_g denotes the distribution of graphs generated by the generator, and &#8473;_r denotes the real distribution of graphs. D(&#8901;) denotes the output of the discriminator on a given tensor. The goal of this loss function is to minimize the difference between the expected values of the discriminator's output on the generated data and the input (real) data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2">TENGAN-R architecture</head><p>We also propose the TENGAN-R architecture, which is initially similar to the TENGAN-CP architecture, with the distinction that we use the RESCAL decomposition instead of the CPD. This results in more parameters for the same value of r, but performs better on certain datasets (Table <ref type="table">3</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Parameter complexity</head><p>If we attempted to generate an adjacency tensor for a multiplex graph with n nodes and k views directly, we would have to use kn&#178; parameters in the final layer. However, if we generate the CPD factors first, we only need (2n + k)r parameters in the final layer, where r is a hyperparameter that increases the quality of the fit at the cost of more parameters. This offers savings for r &lt; kn&#178;/(2n + k), and we show that our models work well for this case in Table <ref type="table">3</ref>. For the RESCAL-based formulation, we need nr + kr&#178; parameters in the final layer. In this case, we only reduce the number of parameters in the case where r &lt; n. </p></div>
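The parameter argument of Sect. 4.3 is easy to tabulate under the following assumed final-layer output counts (our reconstruction of the factor sizes, not quoted from the paper): kn&#178; for a dense adjacency tensor, (2n + k)r for CPD factors A, B (n &#215; r) and C (k &#215; r), and nr + kr&#178; for RESCAL's shared A plus k slice matrices R_k:

```python
# Final-layer output sizes for n nodes, k views, decomposition rank r.
# These formulas are a reconstruction of the parameter-complexity argument,
# assuming one output unit per generated tensor/factor entry.

def dense_params(n, k):
    return k * n * n          # one logit per adjacency-tensor entry

def cp_params(n, k, r):
    return (2 * n + k) * r    # A (n x r), B (n x r), C (k x r)

def rescal_params(n, k, r):
    return n * r + k * r * r  # A (n x r) plus k matrices R_k (r x r)

n, k, r = 1000, 4, 32         # sizes comparable to the resampled NELL-2 tensor
counts = (dense_params(n, k), cp_params(n, k, r), rescal_params(n, k, r))
```

With these illustrative sizes, both decompositions need orders of magnitude fewer outputs than the dense kn&#178; layer, which is the point of generating factors instead of the raw tensor.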
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Evaluation metrics</head><p>Another difficult task in multiplex graph generation is evaluating the quality of the generated graphs. <ref type="bibr">Gretton et al. (2012)</ref> found that measuring the Maximum Mean Discrepancy (MMD) between distributions of different graph statistics works well for simple graphs. We propose 3 methods for evaluating the structural similarity between generated and input graphs:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.1">MMD-based evaluation</head><p>One method to evaluate the quality of generated multiplex graphs would be to apply the evaluation criteria used for simple graphs to each view. We measure the mean MMD (M-MMD) score between the distributions of different graph attributes. More concretely, for each graph attribute, we take the mean of the MMD between the i-th view of a generated graph G&#8242; and a graph G. We can use the clustering coefficient, degree distribution, and orbit counts of the graphs, similar to <ref type="bibr">You et al. (2018)</ref>.</p><p>The main downside of this approach is that it does not take the relationship between the views into account. For example, consider the case where we generate multiplex graphs with two views. Let the lists of the generated first and second views be V&#8242;_1 and V&#8242;_2. Then, suppose the MMD scores of V&#8242;_1 and V&#8242;_2 across all the graph attributes are 0. Then, the overall M-MMD would be 0. However, swapping V&#8242;_1 and V&#8242;_2 yields the same M-MMD. This is clearly undesirable behavior in any case where the views are correlated with each other. An extreme example of this would be a multiplex graph g where g^(1) has an edge iff g^(2) does not have an edge. Then, it is possible for a generated graph g&#8242; to have g&#8242;^(1) = g&#8242;^(2), but still have a perfect M-MMD of 0 in all the graph attributes. To address this issue, we propose the tensor-based evaluation method below.</p></div>
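The MMD underlying the M-MMD score in Sect. 4.4.1 compares two samples of graph statistics. A minimal pure-Python sketch of a (biased) Gaussian-kernel MMD&#178; estimate between two 1-D samples, e.g. two degree sequences (the kernel choice and bandwidth here are our assumptions, not the paper's exact setup):

```python
import math

def gaussian_mmd(xs, ys, sigma=1.0):
    """Biased MMD^2 estimate between two 1-D samples with a Gaussian kernel."""
    def k(a, b):
        return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))
    kxx = sum(k(a, b) for a in xs for b in xs) / (len(xs) ** 2)
    kyy = sum(k(a, b) for a in ys for b in ys) / (len(ys) ** 2)
    kxy = sum(k(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

# Identical samples give zero; well-separated samples give a larger score.
same = gaussian_mmd([1, 2, 3], [1, 2, 3])
diff = gaussian_mmd([1, 2, 3], [10, 11, 12])
```

Averaging such per-view, per-attribute scores over all views gives the M-MMD described above.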
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.2">Tensor-based evaluation</head><p>Multiplex graphs can be viewed as third-order tensors, where each slice is a graph across the same nodes, and tensor decompositions have been shown to be able to extract structure (like communities) from multiplex graphs <ref type="bibr">(Gujral and Papalexakis 2018;</ref><ref type="bibr">Gauvin et al. 2014;</ref><ref type="bibr">Al-Sharoa et al. 2017;</ref><ref type="bibr">Sheikholeslami and Giannakis 2018)</ref>. We take advantage of this fact by applying the CPD to each multiplex graph or tensor. The normalized reconstruction error of the decomposition for various values of r provides a heuristic for how much structure there is along the three modes of the tensor. We then compare the errors of the generated and original tensors across different ranks to see if they are similar in terms of trilinear structure. It may be possible to produce a similar reconstruction error for a given rank without matching the structure of the real graph, but we argue that it is highly unlikely for this to occur across many different values of r.</p><p>We randomly sample n tensors from the generated and real tensors and compute an error vector e of the errors across different ranks. We then calculate the sum of the Wasserstein metric (a.k.a. the earth mover's distance, EMD) between all n&#178; pairs of error vectors. The lower this score, the more similar the pairs are (on average). While this score works well within a fixed dataset, it is difficult to compare across datasets of different sizes. This is because the number of feasible r values changes with the size of the tensor, and a given dataset may naturally have a wider range of pairwise distances (Fig. <ref type="figure">3</ref>).</p><p>To solve this issue, we normalize the sum of generated-real distances by the sum of pairwise real-real distances. 
More formally, given real error matrix E and generated error matrix E&#8242; (where every row E_i is a vector of the i-th sample's CPD errors): TenScore = (&#931;_{i,j} EMD(E_i, E&#8242;_j)) / (&#931;_{i,j} EMD(E_i, E_j)).</p><p>The lower the TenScore, the more realistic the generated samples are. TenScore also serves as an indicator of graph diversity. If it is near 0, it likely means that the model is suffering from mode collapse, a common problem among GANs. We also propose a modified version of TenScore for knowledge base graphs: TenScore-R, which uses RESCAL's reconstruction error.</p></div>
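The TenScore normalization (generated-real EMDs divided by real-real EMDs) can be sketched directly. This is an illustrative reconstruction: the closed-form 1-D EMD for equal-length samples and the inclusion of the zero-distance i = j real pairs in the denominator are our assumptions:

```python
def emd_1d(u, v):
    """Earth mover's distance between two equal-length 1-D samples
    (for equal-weight samples this is the mean gap between sorted values)."""
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)

def ten_score(real_errs, gen_errs):
    """Sum of real-vs-generated EMDs normalized by the sum of real-vs-real EMDs.
    Each row is one sample's CPD reconstruction errors across the tested ranks."""
    cross = sum(emd_1d(e, f) for e in real_errs for f in gen_errs)
    within = sum(emd_1d(e, f) for e in real_errs for f in real_errs)
    return cross / within if within > 0 else float("inf")

# Toy error vectors over 3 ranks: gen[0] tracks the real curves closely,
# gen[1] does not.
real = [[0.9, 0.6, 0.3], [0.8, 0.5, 0.2]]
gen = [[0.85, 0.55, 0.25], [0.4, 0.2, 0.1]]
score = ten_score(real, gen)
```

Comparing generated samples against the real samples' own spread is what makes the score comparable across datasets of different sizes.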
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.3">Classifier-based evaluation</head><p>We train a classifier on generated and original data, then check to see if it correctly predicts the origin of an example. We calculate the accuracy and F1 score of the resulting model (the closer to 0.5 or 50%, the better). In the model, we calculate a graph2vec <ref type="bibr">(Narayanan et al. 2017)</ref> embedding for each view of the multiplex graph. Then, we split the embeddings into training/test data and train an SVM classifier for each view. Finally, we take the majority vote of the ensemble. These steps are shown in a diagram in Fig. <ref type="figure">4</ref>.</p><p>Fig. <ref type="figure">3</ref> Three plots of pairs of CPD error values that have low, medium, and high EMD scores (CPD error vs. rank on three pairs of tensors). The dashed green lines are the generated results, and the solid blue lines are the original results. We can see that the EMD scores are lower when the lines are more similar and higher when the lines are further apart. This is the intuition behind our tensor-based evaluation method</p></div>
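The final ensemble step of the classifier-based evaluation is a plain majority vote over the per-view predictions. A small sketch (the per-view classifiers themselves, graph2vec + SVM in Sect. 4.4.3, are out of scope here; we assume their 0/1 predictions are already available, and we break ties toward "real"):

```python
# Majority vote over per-view classifier predictions (1 = "generated", 0 = "real").

def majority_vote(per_view_preds):
    """per_view_preds: list of k prediction lists, one per view, aligned by sample."""
    n = len(per_view_preds[0])
    votes = [sum(view[i] for view in per_view_preds) for i in range(n)]
    # Strict majority required; a tie (possible for even k) counts as "real".
    return [1 if v * 2 > len(per_view_preds) else 0 for v in votes]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

preds = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]  # 3 views, 3 samples
vote = majority_vote(preds)                # -> [1, 0, 1]
```

An ensemble accuracy near 0.5 on held-out data means the per-view classifiers cannot separate generated from real samples, which is the desired outcome.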
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5">Implementation details</head><p>We implemented this model in PyTorch <ref type="bibr">(Paszke et al. 2019)</ref> using Python 3.9. We used NetworKit <ref type="bibr">(Staudt et al. 2016)</ref> and NetworkX <ref type="bibr">(Hagberg et al. 2008)</ref> for graph data, and Tensorly <ref type="bibr">(Kossaifi et al. 2016)</ref> for tensor decompositions. We extended portions of the GraphRNN <ref type="bibr">(You et al. 2018)</ref> evaluation code and heavily modified HGEN <ref type="bibr">(Ling et al. 2021)</ref> to work for multiplex graphs (see details in Sect. 5.2). We use the code from <ref type="bibr">Rossetti (2020)</ref> for the BINBALL, StarGen, and ANGEL baselines; it was originally written for undirected networks, so we extend it to work for directed multiplex graphs. The code for our experiments is available here.<ref type="foot">foot_0</ref> </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Experimental evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Datasets</head><p>We used 4 multiplex graph datasets selected to represent a wide variety of data types, including a social media network, a knowledge graph, a computer network communication graph, and a time-evolving network.</p><p>1. Football <ref type="bibr">(Greene and Cunningham 2013)</ref>: 248 English Premier League football players and clubs on Twitter, where each of the 6 views corresponds to a different interaction between the accounts (follows, followed-by, mentions, mentioned-by, retweets, retweeted-by). Note that 3 of the views are essentially transposes of the other 3. 1 3 TENGAN: adversarially generating multiplex tensor graphs</p><p>Table 3 Results of TENGAN on the football, NELL-2, enron, and comm datasets. Lower is better for all of these metrics, including F1/Acc. deg, clust, and orbit are the mean MMD scores of our MMD-based evaluation method (see Sect. 5.3) Bold values indicate the best score for that column (dataset and metric) F1 and Acc are the scores produced by the classifier-based method (see Sect. 4.4.3). Note that a higher F1/Acc score means that the classifier is better able to distinguish between positive/negative samples, implying that the generated samples are less realistic. TenScore is the score produced by the CPD-based tensor evaluation method (see Sect. 5.4). HGEN is unable to run on the comm dataset due to issues mentioned in Sect. 
5.2 Football NELL-2 Model name Deg Clust Orbit F1 Acc TenScore Deg Clust Orbit F1 Acc TenScore TENGAN-CP 0.10 0.45 0.10 0.57 0.70 0.92 0.48 0.70 0.31 0.94 0.94 0.89 TENGAN-R 1.09 1.12 0.75 0.94 0.94 0.77 0.93 0.77 0.91 0.49 0.66 2.64 HGEN 1.03 1.45 0.84 1.00 1.00 5.00 0.98 0.74 0.65 1.00 1.00 5.67 Barab&#225;si-Albert 1.07 1.50 0.60 1.00 1.00 2.93 0.81 0.31 0.30 1.00 1.00 13.16 BINBALL 1.10 1.30 0.99 0.64 0.66 11.64 0.47 0.27 0.24 0.92 0.93 3.33 StarGen 1.06 1.25 0.99 0.63 0.70 11.20 0.77 0.40 0.25 0.91 0.91 2.55 ANGEL 0.15 0.37 0.18 0.48 0.52 4.59 0.78 0.25 0.51 0.94 0.94 6.78 Model name Enron Comm Deg Clust Orbit F1 Acc TenScore Deg Clust Orbit F1 Acc TenScore TENGAN-CP 0.65 0.19 0.53 0.87 0.89 0.86 0.37 0.39 0.38 0.72 0.61 1.50 TENGAN-R 1.77 1.97 1.74 0.99 0.99 4.32 0.34 0.40 0.35 0.69 0.56 1.31 HGEN 0.59 0.02 0.06 1.00 1.00 0.90 ------Barab&#225;si-Albert 0.67 0.03 0.23 1.00 1.00 1.57 1.24 0.001 0.37 1.00 1.00 3.46 BINBALL 0.32 0.02 0.07 0.92 0.92 3.41 0.78 0.46 0.06 1.00 1.00 0.89 StarGen 0.64 0.16 0.07 0.99 0.99 4.61 0.90 0.58 0.07 1.00 1.00 1.06 ANGEL 0.58 0.01 0.19 0.98 0.98 1.34 0.99 0.08 0.20 1.00 1.00 1.14 2. NELL-2 (Carlson et al. 2010): A sampled version of the NELL-2 dataset (from Smith et al. ( <ref type="formula">2017</ref>)) that consists of (entity, relation, entity) tuples. The original size is 12, 092 &#215; 9, 184 &#215; 28, 818 , but we resample it to a 1, 000 &#215; 4 &#215; 1, 000 ten- sor (where 4 is the number of views) for the purpose of evaluation. The sampling method is described in Sect. 4.1. 3. Comm: An enterprise communication network dataset of 1,558,594 computers.</p><p>Each view corresponds to communications between nodes on one of five ports <ref type="bibr">(22, 23, 80, 443, and 445)</ref>, with one view for each port. In the view associated with port p, a directed edge from u to v exists if u initiates a connection to v over port p. 
Since we are unable to provide a copy of this network, descriptive statistics of each view in this network are shown in Table <ref type="table">2</ref>.</p><p>4. Enron <ref type="bibr">(Klimt and Yang 2004)</ref>: A multiplex graph of emails sent between Enron employees, where each view represents a two-month (60-day) time interval and edges represent emails. The original tensor (available from Smith et al. (<ref type="bibr">2017</ref>)) is 6,066 senders &#215; 5,699 recipients &#215; 244,268 words &#215; 1,176 days. We collapse the words dimension and simply add an unweighted edge for each email sent in a given time interval. We also aggregate the slices so that each view represents a 60-day period, reducing the number of views. Finally, we sample 1,000 senders and 1,000 recipients using the methodology described in Sect. 4.1.</p><p>We perform random walk sampling to extract a set of sub-multiplex-graphs from each dataset. We describe this process in more detail in Sect. 4.1.</p></div>
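As a rough illustration of this random-walk sampling step, the following sketch walks over the union of all views and keeps the subtensor induced on the visited nodes. The (node, view, node) tensor layout, the walk length, and the restart-on-dead-end behavior are our assumptions here, not the exact procedure of Sect. 4.1:

```python
import numpy as np

def sample_submultiplex(A, walk_len=500, seed=0):
    """Random-walk sample of a multiplex graph.

    A is an adjacency tensor of shape (n, k, n): A[u, v, w] is nonzero
    if an edge from u to w exists in view v.  We walk over the union
    graph (an edge in any view counts as a neighbor) and return the
    subtensor induced on the visited nodes.
    """
    rng = np.random.default_rng(seed)
    n, k, _ = A.shape
    union = A.sum(axis=1) > 0        # (n, n) union-graph adjacency
    visited = set()
    u = int(rng.integers(n))
    for _ in range(walk_len):
        visited.add(u)
        nbrs = np.flatnonzero(union[u])
        if nbrs.size == 0:           # dead end: restart at a random node
            u = int(rng.integers(n))
            continue
        u = int(rng.choice(nbrs))
    idx = np.array(sorted(visited))
    return A[np.ix_(idx, np.arange(k), idx)]
```

The returned subtensor preserves all k views for the sampled node set, which is what allows per-view statistics to be compared later.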
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Comparison with existing methods</head><p>To the best of our knowledge, no other deep learning models for multiplex graph generation exist. The existing models are statistical and are built to match specific attributes of the underlying graph. We compare our method against BINBALL <ref type="bibr">(Basu et al. 2015)</ref>, StarGen <ref type="bibr">(F&#252;genschuh et al. 2018)</ref>, and ANGEL <ref type="bibr">(F&#252;genschuh et al. 2020)</ref>. We also adapt HGEN <ref type="bibr">(Ling et al. 2021)</ref>, a generative model for heterogeneous graphs, to work for multiplex graphs. A heterogeneous graph consists of nodes of different types and, therefore, edges of different types. A common example of this is a citation graph, where we might have nodes for authors, papers, and conferences. This is in contrast to multiplex graphs, where we have different views of the same nodes. As such, it is difficult to directly compare the two methods; however, we attempt to convert multiplex graphs to heterogeneous graphs and evaluate HGEN's performance on them. The HGEN code<ref type="foot">foot_1</ref> (as provided in the paper) does not support different edge types or a single node belonging to multiple classes, so we encountered the following issues (some of which may affect its performance).</p><p>Lack of multiplex graph support. Let n be the number of nodes and k be the number of views in our original multiplex graph. To work around the lack of support for different edge types, we create k nodes for each of the n nodes in the original graph, each with a different class in the range [1, k]. Then, we link together each of these k nodes in the heterogeneous network. This results in a total of nk nodes across k classes in the resulting heterogeneous network.</p><p>Graph size issues. 
While the actual HGEN model is efficient for generation, it requires HIN node embeddings for each node in the input graph. We chose to use hin2vec <ref type="bibr">(Fu et al. 2017)</ref>, as in the original code. However, the converted HIN has nk nodes, which makes the embeddings prohibitively slow to compute on some datasets (such as the Comm dataset).</p></div>
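The nk-node conversion described above can be sketched as follows. The block layout of the heterogeneous adjacency matrix and the chain pattern used to link the k copies of each node are illustrative assumptions; the text fixes only that the copies are linked and assigned per-view classes:

```python
import numpy as np

def multiplex_to_hin(A):
    """Convert a multiplex adjacency tensor A of shape (n, k, n) into a
    single heterogeneous adjacency matrix of shape (n*k, n*k).

    Node i in view v becomes HIN node v*n + i with class v.  Edges of
    view v are copied into that view's diagonal block, and the k copies
    of each original node are linked so they remain associated.
    """
    n, k, _ = A.shape
    H = np.zeros((n * k, n * k))
    classes = np.repeat(np.arange(k), n)           # class label per HIN node
    for v in range(k):
        H[v*n:(v+1)*n, v*n:(v+1)*n] = A[:, v, :]   # intra-view edges
    for i in range(n):                             # link copies of node i
        for v in range(k - 1):
            H[v*n + i, (v+1)*n + i] = 1
            H[(v+1)*n + i, v*n + i] = 1
    return H, classes
```

The resulting (H, classes) pair is the shape of input a heterogeneous-graph model such as HGEN expects, which is also why the embedding cost grows with nk rather than n.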
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">MMD-based evaluation</head><p>The MMD-based evaluation compares the similarity of different graph attributes for each slice between the real and generated graphs. From Table <ref type="table">3</ref>, we can see that TENGAN-CP and TENGAN-R generally perform fairly well on the Football, NELL-2, and Comm datasets. However, this is a layer-level comparison, and even BA (which treats each layer separately) performs decently well in this comparison. This is why the other evaluation methods are important, especially the tensor-based evaluation, which provides a holistic look at the generated tensors.</p></div>
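For concreteness, a per-layer MMD comparison of this kind can be computed as below. The Gaussian kernel and the use of per-graph attribute histograms (e.g. degree histograms) as inputs are assumptions for illustration; they are not necessarily the exact kernel and features of Sect. 5.3:

```python
import numpy as np

def gaussian_mmd(X, Y, sigma=1.0):
    """Squared MMD between two sample sets under a Gaussian kernel.

    X and Y are arrays of shape (n_samples, dim), e.g. per-graph degree
    histograms for one view of the real and generated multiplex graphs.
    A value near 0 means the two sets of statistics are hard to tell apart.
    """
    def k(a, b):
        # Pairwise squared distances, then the Gaussian kernel matrix.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

Averaging such scores over views yields a single layer-level number per attribute (the Deg, Clust, and Orbit columns of Table 3 are of this kind).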
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4">TenScore evaluation</head><p>TENGAN tends to perform best in terms of TenScore across all the datasets, with the exception of the Comm dataset (where BINBALL performs slightly better). TENGAN-CP has a TenScore of below 1 on the Football, NELL-2, and Enron datasets. This indicates that the mean EMD for all real-generated pairs is lower than the mean EMD for all real-real pairs in the dataset. None of the methods have a very low TenScore, which means that all of the methods exhibit a degree of diversity comparable to that of the original data. However, some of the baseline methods have a very high TenScore, indicating that their generated graphs have a very different amount of trilinear structure from that of the input graphs.</p><p>Surprisingly, Barab&#225;si-Albert outperforms some of the other statistical methods, like BINBALL and StarGen, on the Football and Enron datasets. This is likely because BINBALL and StarGen were designed for airport transportation networks and therefore focus on modelling behavior like hub-and-spoke formations <ref type="bibr">(Basu et al. 2015;</ref><ref type="bibr">F&#252;genschuh et al. 2018)</ref>. These structures are more present in the sparser NELL-2 and Comm datasets than in the denser Football and Enron datasets.</p></div>
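One reading consistent with the description above is that TenScore is the ratio of the mean real-generated EMD to the mean real-real EMD, so values below 1 mean the generated graphs sit inside the spread of the real data. The following sketch assumes each graph is summarized by a 1-D CPD-derived vector (e.g. its factor weights); both the summary and the ratio form are our assumptions, not a verbatim transcription of Sect. 5.4:

```python
import numpy as np
from itertools import combinations
from scipy.stats import wasserstein_distance

def ten_score(real_feats, gen_feats):
    """Ratio of mean real-generated EMD to mean real-real EMD.

    real_feats / gen_feats: lists of 1-D arrays, one per graph, each a
    CPD-derived summary (hypothetical choice: CP factor weights).
    """
    rg = [wasserstein_distance(r, g) for r in real_feats for g in gen_feats]
    rr = [wasserstein_distance(a, b) for a, b in combinations(real_feats, 2)]
    return np.mean(rg) / np.mean(rr)
```

Under this reading, a score near 1 means real-generated pairs are about as far apart as real-real pairs, while a very high score signals a mismatch in trilinear structure.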
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.5">Classifier-based evaluation</head><p>TENGAN-CP performs fairly well on the Football dataset, with an accuracy of 0.70 and an F1 score of 0.57 (recall that lower accuracy is better). TENGAN-R performs well on the NELL-2 and Comm datasets. No model does a very good job of fooling the classifier on Enron, likely due to the higher number of views.</p><p>Most of the baselines do a poor job of fooling the classifier, with many resulting in the classifier reaching 100% accuracy. One reason is that some generators model certain layers extremely well but fail to model others. This causes the classifiers for those layers to have very high accuracy, making the overall classifier very accurate. For example, BINBALL randomly assigns (based on a parameter p) each layer to either a BA model or an ER model. This assignment can mean that the generated results for a given layer differ significantly from those of the original graph, causing the classifier to be very accurate on that layer.</p><p>Another reason for the poor performance of the baselines is that the majority of them are statistical models that rely on general rules (e.g., preferential attachment for BA). While these rules may mimic some attributes, the node and graph embeddings will likely differ greatly (except in the case where the original graphs exhibit simple structure).</p></div>
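The general shape of such a classifier-based evaluation can be sketched as follows; the feature representation (one embedding row per sampled graph) and the choice of logistic regression are illustrative assumptions, not the classifier of Sect. 4.4.3:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

def classifier_eval(real_feats, gen_feats, seed=0):
    """Train a classifier to tell real graphs from generated ones.

    Inputs are (n_graphs, dim) feature matrices, one row per sampled
    graph.  Low accuracy/F1 means the generated graphs are hard to
    distinguish from real ones, i.e. more realistic.
    """
    X = np.vstack([real_feats, gen_feats])
    y = np.r_[np.ones(len(real_feats)), np.zeros(len(gen_feats))]
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                          random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    pred = clf.predict(Xte)
    return f1_score(yte, pred), accuracy_score(yte, pred)
```

This also makes the failure mode above concrete: if even one layer's features separate cleanly, the held-out accuracy is pushed toward 1.0 regardless of the other layers.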
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.6">Summary</head><p>TENGAN performs well on the majority of the datasets across all of the evaluation criteria. It performs especially well in the classifier-based and TenScore evaluations. This is likely because TENGAN learns directly from the data, in contrast to most of the other baselines, which instead rely on explicitly defined parameters. However, TENGAN-R does significantly worse than TENGAN-CP on most datasets, despite requiring more parameters. This is likely because TENGAN-R tends to have a hard time converging on the Football and Enron datasets. A possible reason for this is that the RESCAL decomposition imposes a stricter requirement on the factor matrix A, since it is shared across all layers, making it difficult for the model to learn well. The hyperparameter r also plays an important role in how well the model performs. Generally, r has to be higher for sparser tensors and lower for denser tensors. We do not carefully tune r in this paper; we select a reasonable default (e.g., r = 100) and increase/decrease it until the model converges.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Related work</head><p>There have been several works on the topic of multiplex graph sampling. <ref type="bibr">Interdonato et al. (2020)</ref> found that methods that work on standard graphs, like Metropolis-Hastings random walks, BFS, and forest fire sampling <ref type="bibr">(Leskovec and Faloutsos 2006)</ref>, can also be applied to multiplex graphs. To improve random walk sampling on multiplex graphs, <ref type="bibr">Gjoka et al. (2011)</ref> propose union multigraph sampling, a method that uses the "union multigraph", which consists of all edges across all views in the multiplex graph. Union multigraph sampling then performs a random walk over this multigraph to sample it. While unbiased samples are useful, we sometimes want a biased sample to better sample nodes with special properties. <ref type="bibr">Khadangi et al. (2016)</ref> propose using learning automata to do so.</p><p>There has also been some previous work on multiplex graph generation. For example, <ref type="bibr">Nicosia et al. (2013)</ref> propose a model to grow a multiplex graph based on traditional preferential attachment models like the Barab&#225;si-Albert (BA) model <ref type="bibr">(Barab&#225;si and Albert 1999)</ref>. BinBall <ref type="bibr">(Basu et al. 2015</ref>) also builds upon the BA model and focuses on air transportation networks. StarGen <ref type="bibr">(F&#252;genschuh et al. 2018)</ref> directly improves upon BinBall by using a per-layer edge count distribution and splitting the scaling factor of a new node into global and local factors. <ref type="bibr">Kim and Goh (2013)</ref> also use single-layer preferential attachment models and tune the correlation between layers. ANGEL <ref type="bibr">(F&#252;genschuh et al. 
2020</ref>) specifically tries to emulate the hub-and-spoke structure found in many graphs.</p><p>In recent years, neural networks have also been applied to graph generation. GraphRNN <ref type="bibr">(You et al. 2018</ref>) uses an RNN to model graphs as a sequence of nodes and edges. NetGAN <ref type="bibr">(Bojchevski et al. 2018</ref>) uses an LSTM to learn the distribution of biased random walks and reconstructs graphs from them. GraphVAE <ref type="bibr">(Kipf and Welling 2016</ref>) uses a variational autoencoder to generate graphs. LGGAN <ref type="bibr">(Fan and Huang 2019)</ref> generates the adjacency matrix directly, along with its associated labels. BRGAN <ref type="bibr">(Shiao and Papalexakis 2021)</ref> generates rank-constrained graphs by first generating factor matrices, in a similar manner to TENGAN. There have also been several models for multi-scale graphs. The key difference between a multiplex and a multi-scale graph is that a multiplex graph contains the same nodes with different edges in each view, while a multi-scale graph typically contains representations of the same underlying graph at different resolutions (different numbers of nodes) in each layer. Misc-GAN <ref type="bibr">(Zhou et al. 2019</ref>) generates a multi-scale graph before collapsing it into a standard graph, and DMGNN <ref type="bibr">(Li et al. 2020</ref>) predicts multi-scale graphs from previous ones.</p><p>To the best of our knowledge, there have been no other neural-network-based models for multiplex graph generation. HGEN <ref type="bibr">(Ling et al. 2021</ref>) allows for the deep generation of heterogeneous networks by modelling random walks over the graph with a GAN. However, their approach largely focuses on the modelling of interlayer edges with meta-paths. Heterogeneous networks refer to graphs with different node and edge types, while we focus on the case with shared nodes but different edges/views. 
We elaborate more on the precise definition of a multi-view graph in Sect. 2.1 above.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>In this work, we discuss some of the issues associated with multiplex graph generation, as well as some solutions to those issues. One of these issues is the large number of parameters required to generate a multiplex graph using a neural network. We tackle this by proposing a novel GAN-based method that leverages the CPD and RESCAL decompositions to greatly reduce the number of parameters required.</p><p>Another issue with multiplex graph generation is the lack of evaluation criteria. We address this by proposing 3 different evaluation metrics that assess the realism of the generated graphs along different aspects. We also modify HGEN, a model for heterogeneous networks, to work with multiplex graphs. We run our models on 4 different datasets, compare their results against HGEN and 3 other statistical multiplex generation models, and find that our method performs better on the majority of them.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"><p>https://github.com/willshiao/tengan</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1"><p>https://github.com/lingchen0331/HGEN</p></note>
		</body>
		</text>
</TEI>
