skip to main content

Search for: All records

Creators/Authors contains: "Halappanavar, Mahantesh"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 30, 2024
  2. Label Propagation is not only a well-known machine learning algorithm for classification, but it is also an effective method for discovering communities and connected components in networks. We propose a new Direction-Optimizing Label Propagation Algorithm (DOLPA) framework that enhances the performance of the standard Label Propagation Algorithm (LPA), increases its scalability, and extends its versatility and application scope. As a central feature, the DOLPA framework relies on the use of frontiers and alternates between label push and label pull operations to attain high performance. It is formulated in such a way that the same basic algorithm can be used for finding communities or connected components in graphs by only changing the objective function used. Additionally, DOLPA has parameters for tuning the processing order of vertices in a graph to reduce the number of edges visited and improve the quality of solution obtained. We present the design and implementation of the enhanced algorithm as well as our shared-memory parallelization of it using OpenMP. We also present an extensive experimental evaluation of our implementations using the LFR benchmark and real-world networks drawn from various domains. Compared with an implementation of LPA for community detection available in a widely used network analysis software, we achieve at most five times the F-Score while maintaining similar runtime for graphs with overlapping communities. We also compare DOLPA against an implementation of the Louvain method for community detection using the same LFR-graphs and show that DOLPA achieves about three times the F-Score at just 10% of the runtime. For connected component decomposition, our algorithm achieves orders of magnitude speedups over the basic LP-based algorithm on large diameter graphs, up to 13.2 × speedup over the Shiloach-Vishkin algorithm, and up to 1.6 × speedup over Afforest on an Intel Xeon processor using 40 threads. 
    more » « less
  3. Influence maximization aims to select k most-influential vertices or seeds in a network, where influence is defined by a given diffusion process. Although computing optimal seed set is NP-Hard, efficient approximation algorithms exist. However, even state-of-the-art parallel implementations are limited by a sampling step that incurs large memory footprints. This in turn limits the problem size reach and approximation quality. In this work, we study the memory footprint of the sampling process collecting reverse reachability information in the IMM (Influence Maximization via Martingales) algorithm over large real-world social networks. We present a memory-efficient optimization approach (called HBMax) based on Ripples, a state-of-the-art multi-threaded parallel influence maximization solution. Our approach, HBMax, uses a portion of the reverse reachable (RR) sets collected by the algorithm to learn the characteristics of the graph. Then, it compresses the intermediate reverse reachability information with Huffman coding or bitmap coding, and queries on the partially decoded data, or directly on the compressed data to preserve the memory savings obtained through compression. Considering a NUMA architecture, we scale up our solution on 64 CPU cores and reduce the memory footprint by up to 82.1% with average 6.3% speedup (encoding overhead is offset by performance gain from memory reduction) without loss of accuracy. For the largest tested graph Twitter7 (with 1.4 billion edges), HBMax achieves 5.9× compression ratio and 2.2× speedup. 
    more » « less
  4. We study the problem of designing scalable algorithms to find effective intervention strategies for controlling stochastic epidemic processes on networks. This is a common problem arising in agent based models for epidemic spread.Previous approaches to this problem focus on either heuristics with no guarantees or approximation algorithms that scale only to networks corresponding to county-sized populations, typically, with less than a million nodes. In particular, the mathematical-programming based approaches need to solve the Linear Program (LP) relaxation of the problem using an LP solver, which restricts the scalability of this approach. In this work, we overcome this restriction by designing an algorithm that adapts the multiplicative weights update (MWU) framework, along with the sample average approximation (SAA) technique, to approximately solve the linear program (LP) relaxation for the problem. To scale this approach further, we provide a memory-efficient algorithm that enables scaling to large networks, corresponding to country-size populations, with over 300 million nodes and 30 billion edges. Furthermore, we show that this approach provides near-optimal solutions to the LP in practice.

    more » « less