- Award ID(s):
- 1763452
- PAR ID:
- 10098996
- Date Published:
- Journal Name:
- Proc. of the 18th IEEE International Conference on Data Mining (ICDM-2018)
- Page Range / eLocation ID:
- 737 to 746
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Traditional network embedding primarily focuses on learning a continuous vector representation for each node, preserving network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily applied to the vector-format node representations for network analysis. However, the learned continuous vector representations are inefficient for large-scale similarity search, which often involves finding nearest neighbors measured by distance or similarity in a continuous vector space. In this article, we propose a search-efficient binary network embedding algorithm called BinaryNE to learn a binary code for each node, by simultaneously modeling node context relations and node attribute relations through a three-layer neural network. BinaryNE learns binary node representations using a stochastic gradient descent-based online learning algorithm. The learned binary encoding not only reduces the memory needed to represent each node, but also allows fast bit-wise comparisons that support faster node similarity search than Euclidean or other distance measures. Extensive experiments and comparisons demonstrate that BinaryNE not only delivers more than 25 times faster search speed, but also provides comparable or better search quality than traditional continuous-vector-based network embedding methods. The binary codes learned by BinaryNE also render competitive performance on node classification and node clustering tasks. The source code of the BinaryNE algorithm is available at https://github.com/daokunzhang/BinaryNE.
- Node embedding is the task of extracting concise and informative representations of certain entities that are connected in a network. Various real-world networks include information about both node connectivity and certain node attributes, in the form of features or time-series data. Modern representation learning techniques employ both the connectivity and attribute information of the nodes to produce embeddings in an unsupervised manner. In this context, deriving embeddings that preserve the geometry of the network and the attribute vectors would be highly desirable, as they would reflect both the topological neighborhood structure and proximity in feature space. While this is fairly straightforward to maintain when only observing the connectivity or attribute information of the network, preserving the geometry of both types of information is challenging. This paper proposes a novel tensor factorization approach for node embedding in attributed networks that preserves the distances of both the connections and the attributes. Furthermore, an effective and lightweight algorithm is developed to tackle the learning task, and judicious experiments with multiple state-of-the-art baselines suggest that the proposed algorithm offers significant performance improvements in downstream tasks.
- Networked data involve complex information from multifaceted channels, including topology structures, node content, and/or node labels, etc., where structure and content are often correlated but are not always consistent. A typical scenario is the citation relationships in scholarly publications, where a paper is cited by others not because they have the same content, but because they share one or multiple subject matters. To date, while many network embedding methods exist to take the node content into consideration, they all consider node content as a simple flat word/attribute set, and nodes sharing connections are assumed to have dependency with respect to all words or attributes. In this paper, we argue that considering topic-level semantic interactions between nodes is crucial to learn discriminative node embedding vectors. In order to model pairwise topic relevance between linked text nodes, we propose topical network embedding, where interactions between nodes are built on the shared latent topics. Accordingly, we propose a unified optimization framework to simultaneously learn topic and node representations from the network text contents and structures, respectively. Meanwhile, the structure modeling takes the learned topic representations as conditional context under the principle that two nodes can infer each other contingent on the shared latent topics. Experiments on three real-world datasets demonstrate that our approach can learn significantly better network representations, i.e., a 4.1% improvement over the state-of-the-art methods in terms of Micro-F1 on the Cora dataset. (The source code of the proposed method is available through the GitHub link: https://github.com/codeshareabc/TopicalNE.)
- Attributed network embedding aims to learn low-dimensional node representations by combining both the network's topological structure and node attributes. Most of the existing methods either propagate the attributes over the network structure or learn the node representations by an encoder-decoder framework. However, propagation-based methods tend to prefer network structure to node attributes, whereas encoder-decoder methods tend to ignore the longer connections beyond the immediate neighbors. In order to address these limitations while enjoying the best of both worlds, we design cross fusion layers for unsupervised attributed network embedding. Specifically, we first construct two separate views to handle network structure and node attributes, and then design cross fusion layers to allow flexible information exchange and integration between the two views. The key design goals of the cross fusion layers are three-fold: 1) allowing critical information to be propagated along the network structure, 2) encoding the heterogeneity in the local neighborhood of each node during propagation, and 3) incorporating an additional node attribute channel so that the attribute information will not be overshadowed by the structure view. Extensive experiments on three datasets and three downstream tasks demonstrate the effectiveness of the proposed method.
- Network embedding has attracted a surge of attention in recent years. It aims to learn low-dimensional representations of the nodes in a network, which benefit downstream tasks such as node classification and link prediction. Most of the existing approaches learn node representations based only on the topological structure, yet nodes are often associated with rich attributes in many real-world applications. Thus, it is important and necessary to learn node representations based on both the topological structure and node attributes. In this paper, we propose a novel deep attributed network embedding approach, which can capture the high non-linearity and preserve various proximities in both topological structure and node attributes. At the same time, a novel strategy is proposed to guarantee that the learned node representation encodes the consistent and complementary information from the topological structure and node attributes. Extensive experiments on benchmark datasets have verified the effectiveness of our proposed approach.
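
The BinaryNE abstract in this list says that binary node codes allow fast bit-wise comparisons for similarity search. As a rough illustration of that general idea only (not the BinaryNE implementation, which is not described here), the sketch below packs each 0/1 code into a Python integer and ranks nodes by Hamming distance. The function names `to_code`, `hamming`, and `nearest`, and the 4-bit toy codes, are hypothetical and chosen for this example.

```python
from typing import List, Sequence


def to_code(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 values into a single integer (first bit is most significant)."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two packed codes differ."""
    # XOR leaves a 1 exactly where the codes disagree; count those 1s.
    return bin(a ^ b).count("1")


def nearest(query: int, codes: Sequence[int], k: int) -> List[int]:
    """Indices of the k codes closest to `query` in Hamming distance.

    Ties are broken by the lower index because Python's sort is stable.
    """
    order = sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))
    return order[:k]


# Toy example: three hypothetical 4-bit node codes.
codes = [to_code([0, 0, 1, 1]), to_code([1, 1, 1, 1]), to_code([0, 0, 1, 0])]
top2 = nearest(codes[0], codes, k=2)
```

Each comparison here is one XOR followed by a bit count, with no floating-point arithmetic, which is the kind of operation that makes binary codes cheap to compare. A production system would more likely use fixed-width machine words and a hardware popcount instruction than Python integers.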