-
Understanding information flow in the brain can be facilitated by arranging neurons in the fly connectome to form a maximally “feedforward” structure. This task is naturally formulated as the Minimum Feedback Arc Set (MFAS) problem—a well-known NP-hard problem that is especially challenging for large-scale graphs. To address this, we propose the Rocket-Crane algorithm, an efficient two-phase method for solving MFAS. In the first phase, we develop a continuous-space optimization method that rapidly generates excellent solutions. In the second phase, we refine these solutions through advanced exploration techniques that integrate randomized and heuristic strategies to effectively escape local minima. Extensive experiments demonstrate that Rocket-Crane outperforms state-of-the-art methods in terms of solution quality, scalability, and computational efficiency. On the primary benchmark—the fly connectome—our method achieved a feedforward arc set with a total forward weight of 35,459,266 (about 85%), the highest among all competing methods. The algorithm is open-source and available on GitHub.

Free, publicly-accessible full text available December 1, 2026.
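
The quantity being optimized is easy to state. Below is a minimal Python sketch (an illustration of the objective only, not of Rocket-Crane): given a vertex ordering, score it by the total weight of arcs that point forward along the ordering; the remaining backward arcs form the feedback arc set.

```python
# A minimal sketch of the MFAS objective (an illustration, not Rocket-Crane
# itself): arcs pointing "forward" along a vertex ordering are kept, and the
# remaining backward arcs form the feedback arc set.

def forward_weight(arcs, order):
    """arcs: iterable of (u, v, weight); order: list of vertices."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(w for u, v, w in arcs if pos[u] < pos[v])

arcs = [("a", "b", 3), ("b", "c", 2), ("c", "a", 1)]  # a toy 3-cycle
# 5 of the 6 weight units run forward; the arc ("c", "a") is the feedback set.
print(forward_weight(arcs, ["a", "b", "c"]))  # 5
```

Maximizing this forward weight over all orderings is the objective that both phases of the algorithm pursue.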
-
Counting and listing triangles in graphs is a fundamental task in network analysis, supporting applications such as community detection, clustering coefficient computation, k-truss decomposition, and triangle centrality. We introduce the cover-edge set, a novel concept that eliminates unnecessary edges during triangle enumeration, thereby improving efficiency. This compact cover-edge set is rapidly constructed using a breadth-first search (BFS) strategy. Using this concept, we develop both sequential and parallel triangle-counting algorithms and conduct comprehensive comparisons with state-of-the-art methods. We also design a benchmarking framework to evaluate our sequential and parallel algorithms in a systematic and reproducible manner. Extensive experiments on the latest Intel Xeon 8480+ processor reveal clear performance differences among algorithms, demonstrate the benefits of various optimization strategies, and show how graph characteristics, such as diameter and degree distribution, affect algorithm performance. Our source code is available on GitHub.

Free, publicly-accessible full text available November 1, 2026.
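
For context, the sketch below shows a standard edge-iterator baseline of the sort the cover-edge approach improves upon (the cover-edge construction itself is not reproduced here): each triangle is found once by intersecting the neighbor sets of an edge's endpoints.

```python
# A standard edge-iterator triangle-counting baseline (not the paper's
# cover-edge method): visit each undirected edge (u, v) once and count common
# neighbors w with w > v so that every triangle is counted exactly once.

from collections import defaultdict

def triangle_count(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # each undirected edge handled once
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count

print(triangle_count([(0, 1), (1, 2), (0, 2), (2, 3)]))  # 1 triangle: 0-1-2
```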
-
Community detection plays a central role in uncovering mesoscale structures in networks. However, existing methods often suffer from disconnected or weakly connected clusters, undermining interpretability and robustness. Well-Connected Clusters (WCC) and Connectivity Modifier (CM) are post-processing techniques that improve the accuracy of many clustering methods; they are, however, computationally prohibitive on massive graphs. In this work, we present optimized parallel implementations of WCC and CM using the HPE Chapel programming language. First, we design fast and efficient parallel algorithms that leverage Chapel’s parallel constructs to achieve substantial performance improvements and scalability on modern multicore architectures. Second, we integrate this software into Arkouda/Arachne, an open-source, high-performance framework for large-scale graph analytics. Our implementations uniquely enable well-connected community detection on massive graphs with more than 2 billion edges, providing a practical solution for connectivity-preserving clustering at web scale. For example, our implementations of WCC and CM enable community detection on the OpenAlex dataset, with over 2 billion edges, in minutes using 128 cores, a result previously infeasible to compute.

Free, publicly-accessible full text available October 1, 2026.
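
A sequential sketch of the underlying idea, assuming a networkx graph (the Chapel implementations are parallel and do considerably more than this): any cluster whose induced subgraph is disconnected is split into its connected components.

```python
# A toy illustration of connectivity-preserving post-processing (not the
# Chapel WCC/CM code): split every cluster into the connected components of
# its induced subgraph, so no reported cluster is internally disconnected.

import networkx as nx

def split_disconnected(G, clusters):
    """clusters: dict of cluster id -> node set; returns a refined dict."""
    refined, next_id = {}, 0
    for nodes in clusters.values():
        for comp in nx.connected_components(G.subgraph(nodes)):
            refined[next_id] = set(comp)
            next_id += 1
    return refined

G = nx.Graph([(0, 1), (2, 3)])
print(split_disconnected(G, {0: {0, 1, 2, 3}}))  # {0: {0, 1}, 1: {2, 3}}
```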
-
In the 1980s, high-performance computing (HPC) became another tool for the open (non-defense) science and engineering research communities. However, HPC came with a high price tag; the first Cray-2 machines, released in 1985, cost between $12 million and $17 million, according to the Computer History Museum, and were largely available only at government research labs or through national supercomputing centers. In the 1990s, with demand for HPC increasing due to vast datasets, more complex modeling, and the growing computational needs of scientific applications, researchers began experimenting with building HPC machines from clusters of servers running the Linux operating system. By the late 1990s, two approaches to Linux-based parallel computing had emerged: the personal computer cluster methodology that became known as Beowulf and the Roadrunner architecture aimed at a more cost-effective supercomputer. Beowulf attracted attention because of its low cost and the greater accessibility that came with it; Roadrunner took a different approach. While still affordable compared to vector processors and other commercially available supercomputers, Roadrunner integrated its commodity components with specialized networking technology. Furthermore, these systems initially served different purposes: Beowulf focused on providing affordable parallel workstations for individual researchers at NASA, whereas Roadrunner set out to provide a multiuser system that could compete with the commercial supercomputers that dominated the market at the time. This paper analyzes the technical decisions, performance implications, and long-term influence of both approaches. Through this analysis, we can start to judge the impact of both Roadrunner and Beowulf on the development of Linux-based supercomputers.

Free, publicly-accessible full text available September 19, 2026.
-
The rise of graph data in various fields calls for efficient and scalable community detection algorithms. In this paper, we present parallel implementations of two widely used algorithms: Label Propagation and Louvain, specifically designed to leverage the capabilities of Arachne, which is a Python-accessible open-source framework for large-scale graph analysis. Our implementations achieve substantial speedups over existing Python-based tools like NetworkX and igraph, which lack efficient parallelization, and are competitive with parallel frameworks such as NetworKit. Experimental results show that Arachne-based methods outperform these baselines, achieving speedups of up to 710x over NetworkX, 75x over igraph, and 12x over NetworKit. Additionally, we analyze the scalability of our implementation under varying thread counts, demonstrating how different phases contribute to overall performance gains of the parallel Louvain algorithm. Arachne, including our community detection implementation, is open-source and available at https://github.com/Bears-R-Us/arkouda-njit.

Free, publicly-accessible full text available September 15, 2026.
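
As a reference point, here is a minimal single-threaded label-propagation sketch (the Arachne implementations are parallel and heavily optimized): each node repeatedly adopts the most common label among its neighbors until no label changes.

```python
# A minimal label-propagation sketch (illustrative only): nodes iteratively
# adopt the most frequent label among their neighbors; densely connected
# regions converge to a shared label that identifies their community.

from collections import Counter, defaultdict

def label_propagation(edges, max_iters=100):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    labels = {v: v for v in adj}  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for v in adj:
            best = Counter(labels[u] for u in adj[v]).most_common(1)[0][0]
            if best != labels[v]:
                labels[v], changed = best, True
        if not changed:
            break
    return labels

print(label_propagation([(0, 1), (1, 2), (3, 4), (4, 5)]))  # two communities
```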
-
Large Language Models (LLMs) are increasingly used to automate software development, yet most prior evaluations focus on functional correctness or high-level languages such as Python. As one of the first systematic explorations of LLM-assisted software performance engineering, we present a comprehensive study of LLMs’ ability to generate efficient C implementations of graph-analysis routines—code that must satisfy stringent runtime and memory constraints. This emerging field of LLM-assisted algorithm engineering holds significant promise, as these models may possess the capability to design novel approaches that improve existing algorithms and their implementations. Eight state-of-the-art models (OpenAI ChatGPT o3 and o4-mini-high, Anthropic Claude 4 Sonnet and Sonnet Extended, Google Gemini 2.5 Flash and Pro, xAI Grok 3-Think, and DeepSeek DeepThink R1) are benchmarked using two distinct approaches. The first approach evaluates the ability of LLMs to generate algorithms that outperform existing benchmarks. The second approach assesses their capability to generate graph algorithms for integration into performance-critical systems. The results show that Claude Sonnet 4 Extended achieves superior performance in ready-to-use code generation and efficiency, outperforming human-written baselines in triangle counting. Although our findings demonstrate that contemporary LLMs excel in optimizing and integrating established algorithms, the potential for these models to eventually invent transformative algorithmic techniques represents a compelling frontier for future research. We provide prompts, generated code, and measurement scripts to promote reproducible research in this rapidly evolving domain. All of the source code is available on GitHub at https://github.com/Bader-Research/LLM-triangle-counting/.

Free, publicly-accessible full text available September 15, 2026.
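
A rough sketch of the kind of timing harness such a comparison relies on (the actual measurement scripts live in the linked repository, and the paper benchmarks C code; this Python version and its names are illustrative only):

```python
# An illustrative best-of-N timing harness (not the paper's measurement
# scripts): run a candidate triangle-counting function several times and
# keep the fastest wall-clock time to reduce noise from system jitter.

import time

def benchmark(count_triangles, edges, repeats=5):
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        result = count_triangles(edges)
        best = min(best, time.perf_counter() - start)
    return result, best  # (triangle count, best wall time in seconds)
```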
-
Subgraph isomorphism algorithms face significant scalability bottlenecks on large-scale property graphs due to inefficient vertex-by-vertex search that requires extensive exploration of early search tree levels where pruning is minimal. We present HiPerMotif, a hybrid parallel algorithm that overcomes these limitations through edge-centric initialization. HiPerMotif first reorders pattern graphs to prioritize high-connectivity vertices, then systematically identifies and validates all possible first-edge mappings before injecting these pre-validated partial states directly at search depth 2. This approach eliminates costly early exploration while enabling natural parallelization over independent edge candidates. Comprehensive evaluation against state-of-the-art baselines (VF2-PS, VF3P, Glasgow) demonstrates up to 66x speedup on real-world networks and successful processing of massive datasets like the 150M-edge H01 human connectome that cause existing methods to fail due to memory constraints. Implemented in the open-source Arkouda/Arachne framework, HiPerMotif enables previously intractable large-scale network analysis in computational neuroscience and related domains.

Free, publicly-accessible full text available September 15, 2026.
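
The core idea of edge-centric initialization can be shown in a few lines (a simplification, not HiPerMotif itself): every data edge that can host the pattern's first edge becomes a pre-validated partial mapping at depth 2, and these seeds can then be explored independently, which is what makes the approach easy to parallelize.

```python
# A simplified illustration of edge-centric seeding (not HiPerMotif itself):
# map the pattern's first (directed) edge onto every data edge, producing
# depth-2 partial states that a search procedure would then extend.

def seed_first_edge(pattern_edges, data_edges):
    p0, p1 = pattern_edges[0]
    return [{p0: u, p1: v} for u, v in data_edges]

pattern = [("a", "b"), ("b", "c")]
data = [(0, 1), (1, 2), (2, 0)]
print(seed_first_edge(pattern, data))  # three independent depth-2 seeds
```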
-
For fast processing of increasingly large graphs, triangle counting, a common building block of graph processing algorithms, is often performed on GPUs. However, applying massive parallelism to triangle counting is challenging due to the algorithm’s inherent irregular access patterns and workload imbalance. In this work, we propose WeTriC, a novel wedge-parallel triangle counting algorithm for GPUs that improves load balancing and efficiency by applying finer-grained parallelism through a lightweight static mapping of wedges to threads. Our theoretical analysis compares different parallelization granularities, while optimizations enhance caching, reduce work per intersection, and minimize overhead. Performance experiments indicate that WeTriC yields 5.63x and 4.69x speedups over optimized vertex-parallel and edge-parallel binary-search triangle counting algorithms, respectively. Furthermore, we show that WeTriC consistently outperforms the state of the art (on average, 2.86x faster than Trust and 2.32x faster than GroupTC).

Free, publicly-accessible full text available August 22, 2026.
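
A sequential sketch of the wedge decomposition that WeTriC parallelizes (in the GPU algorithm, wedges are statically mapped to threads): a triangle is exactly a wedge u-v-w, centered at v, whose endpoints are joined by the closing edge (u, w).

```python
# A sequential illustration of wedge-based triangle counting (the GPU version
# assigns wedges to threads): enumerate wedges at each center vertex and test
# whether the closing edge exists; each triangle closes once per center.

from collections import defaultdict
from itertools import combinations

def count_by_wedges(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    closed = 0
    for v in adj:  # v is the wedge center
        for u, w in combinations(sorted(adj[v]), 2):
            if w in adj[u]:
                closed += 1
    return closed // 3  # every triangle has three possible centers

print(count_by_wedges([(0, 1), (1, 2), (0, 2)]))  # 1
```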
-
This research presents an enhanced Graph Attention Convolutional Neural Network (GAT) tailored for the analysis of open-source package vulnerability remediation. By meticulously examining control flow graphs and implementing node centrality metrics—specifically, degree, norm, and closeness centrality—our methodology identifies and evaluates changes resulting from vulnerability fixes in nodes, thereby predicting the ramifications of dependency upgrades on application workflows. Empirical testing on diverse datasets reveals that our model challenges established paradigms in software security, showcasing its efficacy in delivering comprehensive insights into code vulnerabilities and contributing to advancements in cybersecurity practices. This study delineates a strategic framework for the development of sustainable monitoring systems and the effective remediation of vulnerabilities in open-source software.

Free, publicly-accessible full text available August 14, 2026.
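
A minimal sketch of the feature-extraction step, assuming networkx and using illustrative names (the paper's "norm" centrality is not reproduced here): per-node centrality scores computed on a control flow graph, suitable as input features for a graph attention model.

```python
# An illustrative feature-extraction step (names are hypothetical, not the
# paper's code): compute degree and closeness centrality for each node of a
# control flow graph so they can serve as node features for a GAT model.

import networkx as nx

def centrality_features(cfg):
    deg = nx.degree_centrality(cfg)
    clo = nx.closeness_centrality(cfg)
    return {n: [deg[n], clo[n]] for n in cfg.nodes}

cfg = nx.DiGraph([("entry", "check"), ("check", "patched"),
                  ("check", "exit"), ("patched", "exit")])
print(centrality_features(cfg))
```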
-
Connectomics, a subfield of neuroscience, reconstructs structural and functional brain maps at synapse-level resolution. These complex spatial maps consist of tree-like neurons interconnected by synapses. Motif analysis is a widely used method for identifying recurring subgraph patterns in connectomes; these motifs thus potentially represent fundamental units of information processing. However, existing computational tools often oversimplify neurons as mere nodes in a graph, disregarding their intricate morphologies. In this paper, we introduce MoMo, a novel interactive visualization framework for analyzing neuron morphology-aware motifs in large connectome graphs. First, we propose an advanced graph data structure that integrates both neuronal morphology and synaptic connectivity. This enables highly efficient, parallel subgraph isomorphism searches, allowing for interactive morphological motif queries. Second, we develop a sketch-based interface that facilitates the intuitive exploration of morphology-based motifs within our new data structure. Users can conduct interactive motif searches on state-of-the-art connectomes and visualize results as interactive 3D renderings. We present a detailed goal and task analysis for motif exploration in connectomes, incorporating neuron morphology. Finally, we evaluate MoMo through case studies with four domain experts, who assess the tool’s usefulness and effectiveness in motif exploration and its relevance to real-world neuroscience research. The source code for MoMo is available here.

Free, publicly-accessible full text available July 3, 2026.
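
As a toy illustration of an attribute-aware motif query, the sketch below uses networkx's matcher as a stand-in for MoMo's far richer morphology-aware data structure and parallel search: nodes carry a morphology-derived attribute, and matches must respect it during subgraph isomorphism.

```python
# A toy attribute-aware motif query (a stand-in for MoMo, whose data structure
# encodes full neuron morphology): require subgraph matches to preserve a
# per-node "kind" attribute in addition to the connectivity pattern.

import networkx as nx
from networkx.algorithms.isomorphism import DiGraphMatcher, categorical_node_match

connectome = nx.DiGraph()
connectome.add_nodes_from([(0, {"kind": "axon"}), (1, {"kind": "dendrite"}),
                           (2, {"kind": "axon"}), (3, {"kind": "dendrite"})])
connectome.add_edges_from([(0, 1), (2, 3), (1, 2)])

motif = nx.DiGraph()
motif.add_nodes_from([("a", {"kind": "axon"}), ("d", {"kind": "dendrite"})])
motif.add_edge("a", "d")

matcher = DiGraphMatcher(connectome, motif,
                         node_match=categorical_node_match("kind", None))
print(list(matcher.subgraph_isomorphisms_iter()))  # {0: 'a', 1: 'd'}, {2: 'a', 3: 'd'}
```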
