skip to main content


Title: Evaluating Multivariate Network Visualization Techniques Using a Validated Design and Crowdsourcing Approach
Visualizing multivariate networks is challenging because of the trade-offs necessary for effectively encoding network topology and encoding the attributes associated with nodes and edges. A large number of multivariate network visualization techniques exist, yet there is little empirical guidance on their respective strengths and weaknesses. In this paper, we describe a crowdsourced experiment, comparing node-link diagrams with on-node encoding and adjacency matrices with juxtaposed tables. We find that node-link diagrams are best suited for tasks that require close integration between the network topology and a few attributes. Adjacency matrices perform well for tasks related to clusters and when many attributes need to be considered. We also reflect on our method of using validated designs for empirically evaluating complex, interactive visualizations in a crowdsourced setting. We highlight the importance of training, compensation, and provenance tracking.  more » « less
Award ID(s):
1835904
NSF-PAR ID:
10209118
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
Page Range / eLocation ID:
1 to 12
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Attributed network embedding aims to learn lowdimensional vector representations for nodes in a network, where each node contains rich attributes/features describing node content. Because network topology structure and node attributes often exhibit high correlation, incorporating node attribute proximity into network embedding is beneficial for learning good vector representations. In reality, large-scale networks often have incomplete/missing node content or linkages, yet existing attributed network embedding algorithms all operate under the assumption that networks are complete. Thus, their performance is vulnerable to missing data and suffers from poor scalability. In this paper, we propose a Scalable Incomplete Network Embedding (SINE) algorithm for learning node representations from incomplete graphs. SINE formulates a probabilistic learning framework that separately models pairs of node-context and node-attribute relationships. Different from existing attributed network embedding algorithms, SINE provides greater flexibility to make the best of useful information and mitigate negative effects of missing information on representation learning. A stochastic gradient descent based online algorithm is derived to learn node representations, allowing SINE to scale up to large-scale networks with high learning efficiency. We evaluate the effectiveness and efficiency of SINE through extensive experiments on real-world networks. Experimental results confirm that SINE outperforms state-of-the-art baselines in various tasks, including node classification, node clustering, and link prediction, under settings with missing links and node attributes. SINE is also shown to be scalable and efficient on large-scale networks with millions of nodes/edges and high-dimensional node features. 
    more » « less
  2. Graphs are widely found in social network analysis and e-commerce, where Graph Neural Networks (GNNs) are the state-of the-art model. GNNs can be biased due to sensitive attributes and network topology. With existing work that learns a fair node representation or adjacency matrix, achieving a strong guarantee of group fairness while preserving prediction accuracy is still challenging, with the fairness-accuracy trade-off remaining obscure to human decision-makers. We first define and analyze a novel upper bound of group fairness to optimize the adjacency matrix for fairness without significantly h arming prediction accuracy. To understand the nuance of fairness-accuracy tradeoff, we further propose macroscopic and microscopic explanation methods to reveal the trade-offs and the space that one can exploit. The macroscopic explanation method is based on stratified sampling and linear programming to deterministically explain the dynamics of the group fairness and prediction accuracy. Driving down to the microscopic level, we propose a path-based explanation that reveals how network topology leads to the tradeoff. On seven graph datasets, we demonstrate the novel upper bound can achieve more efficient fairness-accuracy trade-offs and the intuitiveness of the explanation methods can clearly pinpoint where the trade-off is improved. 
    more » « less
  3. null (Ed.)
    Efficient provisioning of 5G network slices is a major challenge for 5G network slicing technology. Previous slice provisioning methods have only considered network resource attributes and ignored network topology attributes. These methods may result in a decrease in the slice acceptance ratio and the slice provisioning revenue. To address these issues, we propose a two-stage heuristic slice provisioning algorithm, called RT-CSP, for the 5G core network by jointly considering network resource attributes and topology attributes in this paper. The first stage of our method is called the slice node provisioning stage, in which we propose an approach to scoring and ranking nodes using network resource attributes (i.e., CPU capacity and bandwidth) and topology attributes (i.e., degree centrality and closeness centrality). Slice nodes are then provisioned according to the node ranking results. In the second stage, called the slice link provisioning stage, the k-shortest path algorithm is implemented to provision slice links. To further improve the performance of RT-CSP, we propose RT-CSP+, which uses our designed strategy, called minMaxBWUtilHops, to select the best physical path to host the slice link. The strategy minimizes the product of the maximum link bandwidth utilization of the candidate physical path and the number of hops in it to avoid creating bottlenecks in the physical path and reduce the bandwidth cost. Using extensive simulations, we compared our results with those of the state-of-the-art algorithms. The experimental results show that our algorithms increase slice acceptance ratio and improve the provisioning revenue-to-cost ratio. 
    more » « less
  4. We develop algorithms for online topology inference from streaming nodal observations and partial connectivity information; i.e., a priori knowledge on the presence or absence of a few edges may be available as in the link prediction problem. The observations are modeled as stationary graph signals generated by local diffusion dynamics on the unknown network. Said stationarity assumption implies the simultaneous diagonalization of the observations' covariance matrix and the so-called graph shift operator (GSO), here the adjacency matrix of the sought graph. When the GSO eigenvectors are perfectly obtained from the ensemble covariance, we examine the structure of the feasible set of adjacency matrices and its dependency on the prior connectivity information available. In practice one can only form an empirical estimate of the covariance matrix, so we develop an alternating algorithm to find a sparse GSO given its imperfectly estimated eigenvectors. Upon sensing new diffused observations in the streaming setting, we efficiently update eigenvectors and perform only one (or a few) online iteration(s) of the proposed algorithm until a new datum is observed. Numerical tests showcase the effectiveness of the novel batch and online algorithms in recovering real-world graphs. 
    more » « less
  5. Proc. 2023 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (Ed.)
    Representation learning on networks aims to derive a meaningful vector representation for each node, thereby facilitating downstream tasks such as link prediction, node classification, and node clustering. In heterogeneous text-rich networks, this task is more challenging due to (1) presence or absence of text: Some nodes are associated with rich textual information, while others are not; (2) diversity of types: Nodes and edges of multiple types form a heterogeneous network structure. As pretrained language models (PLMs) have demonstrated their effectiveness in obtaining widely generalizable text representations, a substantial amount of effort has been made to incorporate PLMs into representation learning on text-rich networks. However, few of them can jointly consider heterogeneous structure (network) information as well as rich textual semantic information of each node effectively. In this paper, we propose Heterformer, a Heterogeneous Network-Empowered Transformer that performs contextualized text encoding and heterogeneous structure encoding in a unified model. Specifically, we inject heterogeneous structure information into each Transformer layer when encoding node texts. Meanwhile, Heterformer is capable of characterizing node/edge type heterogeneity and encoding nodes with or without texts. We conduct comprehensive experiments on three tasks (i.e., link prediction, node classification, and node clustering) on three large-scale datasets from different domains, where Heterformer outperforms competitive baselines significantly and consistently. 
    more » « less