This content will become publicly available on July 1, 2024

Title: Privacy and anonymity for multilayer networks: A reflection
Abstract: Data privacy and anonymization for various kinds of analysis have been addressed in the context of tabular transactional data, which was the mainstream data model. With the advent of the Internet and social networks, there is growing emphasis on using different kinds of graphs for modeling and analysis. In addition to single graphs, MultiLayer Networks (MLNs) are becoming popular for modeling and analyzing complex data with multiple types of entities and relationships. They provide a better understanding of the data as well as flexibility and efficiency of analysis. In this article, we examine the provenance of data privacy and some of the thinking on extending it to graph data models. We focus on data privacy issues for models that differ from traditional data models and discuss alternatives. We also consider privacy from a visualization perspective, as we have developed a community dashboard for MLN generation, analysis, and visualization based on our research.
Award ID(s):
2120393
NSF-PAR ID:
10447239
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEEBigDataService
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Human mobility analysis plays a crucial role in urban analysis, city planning, epidemic modeling, and even understanding neighborhood effects on individuals’ health. Often, these studies model human mobility in the form of co-location networks. We have recently seen the tremendous success of network representation learning models on several machine learning tasks on graphs. To the best of our knowledge, limited attention has been paid to identifying communities using network representation learning methods specifically for co-location networks. We attempt to address this problem and study user mobility behavior through the communities identified with latent node representations. Specifically, we select several diverse network representation learning models to identify communities from a real-world co-location network. We include both general-purpose representation models that make no assumptions on network modality as well as approaches designed specifically for human mobility analysis. We evaluate these different methods on data collected in the Adolescent Health and Development in Context study. Our experimental analysis reveals that a recently proposed method (LocationTrails) offers a competitive advantage over other methods with respect to its ability to represent and reflect community assignment that is consistent with extant findings regarding neighborhood racial and socio-economic differences in mobility patterns. We also compare the learned activity profiles of individuals by factoring in their residential neighborhoods. Our analysis reveals a significant contrast in the activity profiles of individuals residing in white-dominated versus black-dominated neighborhoods and advantaged versus disadvantaged neighborhoods in a major metropolitan city of the United States. We provide a clear rationale for this contrastive pattern through insights from the sociological literature.

     
  2. In graph machine learning, data collection, sharing, and analysis often involve multiple parties, each of which may require varying levels of data security and privacy. To this end, preserving privacy is of great importance in protecting sensitive information. In the era of big data, the relationships among data entities have become unprecedentedly complex, and more applications utilize advanced data structures (i.e., graphs) that can support network structures and relevant attribute information. To date, many graph-based AI models have been proposed (e.g., graph neural networks) for various domain tasks, like computer vision and natural language processing. In this paper, we focus on reviewing privacy-preserving techniques of graph machine learning. We systematically review related works from the data to the computational aspects. We first review methods for generating privacy-preserving graph data. Then we describe methods for transmitting privacy-preserved information (e.g., graph model parameters) to realize the optimization-based computation when data sharing among multiple parties is risky or impossible. In addition to discussing relevant theoretical methodology and software tools, we also discuss current challenges and highlight several possible future research opportunities for privacy-preserving graph machine learning. Finally, we envision a unified and comprehensive secure graph machine learning system.
  3. Situational awareness provides the decision-making capability to identify, process, and comprehend big data. In our approach, situational awareness is achieved by integrating and analyzing multiple aspects of data using stacked bar graphs and geographic representations of the data. We provide a data visualization tool to represent COVID pandemic data on top of the geographical information. The combination of geospatial and temporal data provides the information needed to conduct situational analysis for the COVID-19 pandemic. By providing interactivity, geographical maps can be viewed from different perspectives and offer insight into the dynamical aspects of the COVID-19 pandemic for the fifty states in the USA. We have overlaid dynamic information on top of a geographical representation in an intuitive way for decision making. We describe how modeling and simulation of data increase situational awareness, especially when coupled with immersive virtual reality interaction. This paper presents an immersive virtual reality (VR) environment and a mobile environment for data visualization using an Oculus Rift head-mounted display and smartphones. This work combines neural network predictions with human-centric situational awareness and data analytics to provide accurate, timely, and scientific strategies in combatting and mitigating the spread of the coronavirus pandemic. Testing and evaluation of the data visualization tool have been done with a real-time feed of a COVID pandemic data set in immersive, non-immersive, and mobile environments.
  4. With the increasing availability of GPS trajectory data, map construction algorithms have been developed that automatically construct road maps from this data. In order to assess the quality of such (constructed) road maps, the need for meaningful road map comparison algorithms becomes increasingly important. Indeed, different approaches for map comparison have been recently proposed; however, most of these approaches assume that the road maps are modeled as undirected embedded planar graphs. In this paper, we study map comparison algorithms for more realistic models of road maps: directed roads as well as weighted roads. In particular, we address two main questions: how close are the graphs to each other, and how close is the information presented by the graphs (i.e., traffic times, trajectories, and road type)? We propose new road network comparisons and give illustrative examples. Furthermore, our approaches do not only apply to road maps but can be used to compare other kinds of graphs as well. 
  5. Abstract Motivation

    The use of drug combinations, termed polypharmacy, is common to treat patients with complex diseases or co-existing conditions. However, a major consequence of polypharmacy is a much higher risk of adverse side effects for the patient. Polypharmacy side effects emerge because of drug–drug interactions, in which activity of one drug may change, favorably or unfavorably, if taken with another drug. The knowledge of drug interactions is often limited because these complex relationships are rare, and are usually not observed in relatively small clinical testing. Discovering polypharmacy side effects thus remains an important challenge with significant implications for patient mortality and morbidity.

    Results

    Here, we present Decagon, an approach for modeling polypharmacy side effects. The approach constructs a multimodal graph of protein–protein interactions, drug–protein target interactions and the polypharmacy side effects, which are represented as drug–drug interactions, where each side effect is an edge of a different type. Decagon is developed specifically to handle such multimodal graphs with a large number of edge types. Our approach develops a new graph convolutional neural network for multirelational link prediction in multimodal networks. Unlike approaches limited to predicting simple drug–drug interaction values, Decagon can predict the exact side effect, if any, through which a given drug combination manifests clinically. Decagon accurately predicts polypharmacy side effects, outperforming baselines by up to 69%. We find that it automatically learns representations of side effects indicative of co-occurrence of polypharmacy in patients. Furthermore, Decagon models particularly well polypharmacy side effects that have a strong molecular basis, while on predominantly non-molecular side effects, it achieves good performance because of effective sharing of model parameters across edge types. Decagon opens up opportunities to use large pharmacogenomic and patient population data to flag and prioritize polypharmacy side effects for follow-up analysis via formal pharmacological studies.

    Availability and implementation

    Source code and preprocessed datasets are at: http://snap.stanford.edu/decagon.

     