Abstract Generating high-quality synthetic networks with realistic community structure is vital to effectively evaluate community detection algorithms. In this study, we propose a new synthetic network generator called the Edge-Connected Stochastic Block Model (EC-SBM). The goal of EC-SBM is to take a given clustered real-world network and produce a synthetic network that resembles the clustered real-world network with respect to both network and community-specific criteria. In particular, we focus on simulating the internal edge connectivity of the clusters in the reference clustered network. Our performance study on large real-world networks shows that EC-SBM is generally more accurate with respect to network and community criteria than currently used approaches for this problem. Furthermore, we demonstrate that EC-SBM can complete analyses on several real-world networks with millions of nodes.
Stochastic blockmodeling for learning the structure of optimization problems
Abstract Decomposition‐based solution algorithms for optimization problems depend on the underlying latent block structure of the problem. Methods for detecting this structure are currently lacking. In this article, we propose stochastic blockmodeling (SBM) as a systematic framework for learning the underlying block structure in generic optimization problems. SBM is a generative graph model in which nodes belong to some blocks and the interconnections among the nodes are stochastically dependent on their block affiliations. Hence, through parametric statistical inference, the interconnection patterns underlying optimization problems can be estimated. For benchmark optimization problems, we show that SBM can reveal the underlying block structure and that the estimated blocks can be used as the basis for decomposition‐based solution algorithms which can reach an optimum or bound estimates in reduced computational time. Finally, we present a general software platform for automated block structure detection and decomposition‐based solution following distributed and hierarchical optimization approaches.
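The generative model described in the abstract can be illustrated with a short sketch. This is a minimal, generic SBM sampler, not the authors' software platform: the function name `sample_sbm` and its parameters `block_sizes`, `block_probs`, and `seed` are hypothetical, and the graph is assumed to be undirected with no self-loops.

```python
import numpy as np

def sample_sbm(block_sizes, block_probs, seed=0):
    """Sample an undirected SBM graph (illustrative sketch, hypothetical API).

    block_sizes: number of nodes in each block.
    block_probs: symmetric matrix; entry [r, s] is the probability of an
                 edge between a node in block r and a node in block s.
    Returns the adjacency matrix and the block label of every node.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    probs = np.asarray(block_probs, dtype=float)
    n = labels.size
    # Edge probability for every node pair depends only on block affiliations.
    pair_probs = probs[labels[:, None], labels[None, :]]
    # Draw each pair once (strict upper triangle), then mirror for symmetry.
    upper = np.triu(rng.random((n, n)) < pair_probs, k=1)
    adjacency = (upper | upper.T).astype(int)
    return adjacency, labels

if __name__ == "__main__":
    A, z = sample_sbm([4, 3], [[0.9, 0.1], [0.1, 0.8]])
    print(A)
    print(z)
```

Inference runs in the opposite direction: given an observed graph (for an optimization problem, one built from its variables and constraints), the block labels and probabilities are estimated from the data.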
- Award ID(s):
- 1926303
- PAR ID:
- 10370165
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- AIChE Journal
- Volume:
- 68
- Issue:
- 6
- ISSN:
- 0001-1541
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Partitioning networks into communities of densely connected nodes is an important tool used widely across different applications, with numerous methods and software packages available for community detection. Modularity-based methods require parameters to be selected (or assume defaults) to control the resolution and, in multilayer networks, interlayer coupling. Meanwhile, most useful algorithms are heuristics yielding different near-optimal results upon repeated runs (even at the same parameters). To address these difficulties, we combine recent developments into a simple-to-use framework for pruning a set of partitions to a subset that are self-consistent by an equivalence with the objective function for inference of a degree-corrected planted partition stochastic block model (SBM). Importantly, this combined framework reduces some of the problems associated with the stochasticity that is inherent in the use of heuristics for optimizing modularity. In our examples, the pruning typically highlights only a small number of partitions that are fixed points of the corresponding map on the set of somewhere-optimal partitions in the parameter space. We also derive resolution parameter upper bounds for fitting a constrained SBM of K blocks and demonstrate that these bounds hold in practice, further guiding parameter space regions to consider. With publicly available code (http://github.com/ragibson/ModularityPruning), our pruning procedure provides a new baseline for using modularity-based community detection in practice.
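To make the role of the resolution parameter concrete, the sketch below evaluates the standard single-layer modularity with resolution, Q = (1/2m) Σ_ij (A_ij − γ k_i k_j / 2m) δ(c_i, c_j), for a fixed partition. It is an illustration only and is not taken from the ModularityPruning package; the function name `modularity` and its arguments are hypothetical, and the graph is assumed to be undirected with a symmetric adjacency matrix.

```python
import numpy as np

def modularity(adjacency, labels, resolution=1.0):
    """Modularity of a partition at a given resolution (illustrative sketch)."""
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    # Indicator of node pairs assigned to the same community.
    same = labels[:, None] == labels[None, :]
    # Expected edge weight under the degree-preserving null model,
    # scaled by the resolution parameter gamma.
    null = resolution * np.outer(degrees, degrees) / two_m
    return float(((A - null) * same).sum() / two_m)

if __name__ == "__main__":
    # Two disconnected edges: 0-1 and 2-3.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]])
    for gamma in (0.5, 1.0, 2.0):
        print(gamma, modularity(A, [0, 0, 1, 1], resolution=gamma))
```

The same partition scores differently as the resolution changes, which is why a given partition can be optimal only over part of the parameter space.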
-
Abstract The reduction of a large‐scale symmetric linear discrete ill‐posed problem with multiple right‐hand sides to a smaller problem with a symmetric block tridiagonal matrix can easily be carried out by the application of a small number of steps of the symmetric block Lanczos method. We show that the subdiagonal blocks of the reduced problem converge to zero fairly rapidly with increasing block number. This quick convergence indicates that there is little advantage in expressing the solutions of discrete ill‐posed problems in terms of eigenvectors of the coefficient matrix when compared with using a basis of block Lanczos vectors, which are simpler and cheaper to compute. Similarly, for nonsymmetric linear discrete ill‐posed problems with multiple right‐hand sides, we show that the solution subspace defined by a few steps of the block Golub–Kahan bidiagonalization method usually can be applied instead of the solution subspace determined by the singular value decomposition of the coefficient matrix without significant, if any, reduction of the quality of the computed solution.
-
Abstract Process control and optimization have been widely used to solve decision-making problems in chemical engineering applications. However, identifying and tuning the best solution algorithm is challenging and time-consuming. Machine learning tools can be used to automate these steps by learning the behavior of a numerical solver from data. In this paper, we discuss recent advances in (i) the representation of decision-making problems for machine learning tasks, (ii) algorithm selection, and (iii) algorithm configuration for monolithic and decomposition-based algorithms. Finally, we discuss open problems related to the application of machine learning for accelerating process optimization and control.
-
Community detection in the human connectome: Method types, differences and their impact on inference
Abstract Community structure is a fundamental topological characteristic of optimally organized brain networks. Currently, there is no clear standard or systematic approach for selecting the most appropriate community detection method. Furthermore, the impact of method choice on the accuracy and robustness of estimated communities (and network modularity), as well as method‐dependent relationships between network communities and cognitive and other individual measures, are not well understood. This study analyzed large datasets of real brain networks (estimated from resting‐state fMRI from 5251 pre/early adolescents in the Adolescent Brain Cognitive Development [ABCD] study) and 5338 synthetic networks with heterogeneous, data‐inspired topologies, with the goal of investigating and comparing three classes of community detection methods: (i) modularity maximization‐based (Newman and Louvain), (ii) probabilistic (Bayesian inference within the framework of stochastic block modeling (SBM)), and (iii) geometric (based on graph Ricci flow). Extensive comparisons between methods and their individual accuracy (relative to the ground truth in synthetic networks) and reliability (when applied to multiple fMRI runs from the same brains) suggest that the underlying brain network topology plays a critical role in the accuracy, reliability and agreement of community detection methods. Consistent method (dis)similarities, and their correlations with topological properties, were estimated across fMRI runs. Based on synthetic graphs, most methods performed similarly and had comparably high accuracy only in some topological regimes, specifically those corresponding to developed connectomes with at least quasi‐optimal community organization. In contrast, in densely and/or weakly connected networks with difficult‐to‐detect communities, the methods yielded highly dissimilar results, with Bayesian inference within SBM having significantly higher accuracy than all others. Associations between method‐specific modularity and demographic, anthropometric, physiological and cognitive parameters showed mostly method invariance but some method dependence as well. Although method sensitivity to different levels of community structure may in part explain method‐dependent associations between modularity estimates and parameters of interest, method dependence also highlights potential issues of reliability and reproducibility. These findings suggest that a probabilistic approach, such as Bayesian inference in the framework of SBM, may provide consistently reliable estimates of community structure across network topologies. In addition, to maximize robustness of biological inferences, identified network communities and their cognitive, behavioral and other correlates should be confirmed with multiple reliable detection methods.
