NSF PAR Search | NSF Public Access Repository

On Multilinear Inequalities of Hölder-Brascamp-Lieb Type for Torsion-Free Discrete Abelian Groups

https://doi.org/10.4115/jla.2024.16.4

Christ, Michael; Demmel, James; Knight, Nicholas; Scanlon, Thomas; Yelick, Katherine (March 2024, Journal of Logic and Analysis)

Hölder-Brascamp-Lieb inequalities provide upper bounds for a class of multilinear expressions, in terms of L^p norms of the functions involved. They have been extensively studied for functions defined on Euclidean spaces. Bennett-Carbery-Christ-Tao have initiated the study of these inequalities for discrete Abelian groups and, in terms of suitable data, have characterized the set of all tuples of exponents for which such an inequality holds for specified data, as the convex polyhedron defined by a particular finite set of affine inequalities. In this paper we advance the theory of such inequalities for torsion-free discrete Abelian groups in three respects. The optimal constant in any such inequality is shown to equal 1 whenever it is finite. An algorithm that computes the admissible polyhedron of exponents is developed. It is shown that nonetheless, existence of an algorithm that computes the full list of inequalities in the Bennett-Carbery-Christ-Tao description of the admissible polyhedron for all data is equivalent to an affirmative solution of Hilbert's Tenth Problem over the rationals. That problem remains open.
Full Text Available
Scalable Irregular Parallelism with GPUs: Getting CPUs Out of the Way

https://doi.org/10.1109/SC41404.2022.00055

Chen, Yuxin; Brock, Benjamin; Porumbescu, Șerban; Buluç, Aydın; Yelick, Katherine; Owens, John D. (November 2022, International Conference for High Performance Computing Networking Storage and Analysis)

We present Atos, a dynamic scheduling framework for multi-node-GPU systems that supports PGAS-style lightweight one-sided memory operations within and between nodes. Atos's lightweight GPU-to-GPU communication enables latency hiding and can smooth the interconnection usage for bisection-limited problems. These benefits are significant for dynamic, irregular applications that often involve fine-grained communication at unpredictable times and without predetermined patterns. Some principles for high performance: (1) do not involve the CPU in the communication control path; (2) allow GPU communication within kernels, addressing memory consistency directly rather than relying on synchronization with the CPU; (3) perform dynamic communication aggregation when interconnections have limited bandwidth. By lowering the overhead of communication and allowing it within GPU kernels, we support large, high-utilization GPU kernels but with more frequent communication. We evaluate Atos on two irregular problems: Breadth-First-Search and PageRank. Atos outperforms the state-of-the-art graph libraries Gunrock, Groute and Galois on both single-node-multi-GPU and multi-node-GPU settings.
Full Text Available
Atos: A Task-Parallel GPU Scheduler for Graph Analytics

https://doi.org/10.1145/3545008.3545056

Chen, Yuxin; Brock, Benjamin; Porumbescu, Serban; Buluc, Aydin; Yelick, Katherine; Owens, John (August 2022, Proceedings of the 51st International Conference on Parallel Processing)

We present Atos, a task-parallel GPU dynamic scheduling framework that is especially suited to dynamic irregular applications. Compared to the dominant Bulk Synchronous Parallel (BSP) frameworks, Atos exposes additional concurrency by supporting task-parallel formulations of applications with relaxed dependencies, achieving higher GPU utilization, which is particularly significant for problems with concurrency bottlenecks. Atos also offers implicit task-parallel load balancing in addition to data-parallel load balancing, providing users the flexibility to balance between them to achieve optimal performance. Finally, Atos allows users to adapt to different use cases by controlling the kernel strategy and task-parallel granularity. We demonstrate that each of these controls is important in practice. We evaluate and analyze the performance of Atos vs. BSP on three applications: breadth-first search, PageRank, and graph coloring. Atos implementations achieve geomean speedups of 3.44x, 2.1x, and 2.77x and peak speedups of 12.8x, 3.2x, and 9.08x across three case studies, compared to a state-of-the-art BSP GPU implementation. Beyond simply quantifying the speedup, we extensively analyze the reasons behind each speedup. This deeper understanding allows us to derive general guidelines for how to select the optimal Atos configuration for different applications. Finally, our analysis provides insights for future dynamic scheduling framework designs.
Full Text Available
Atos: A Task-Parallel GPU Scheduler for Graph Analytics

Chen, Yuxin; Brock, Benjamin; Porumbescu, Serban; Buluç, Aydın; Yelick, Katherine; Owens, John D. (August 2022, Proceedings of the International Conference on Parallel Processing)

Full Text Available
Coassembly and binning of a twenty-year metagenomic time-series from Lake Mendota

https://doi.org/10.1038/s41597-024-03826-8

Oliver, Tiffany; Varghese, Neha; Roux, Simon; Schulz, Frederik; Huntemann, Marcel; Clum, Alicia; Foster, Brian; Foster, Bryce; Riley, Robert; LaButti, Kurt; et al (September 2024, Scientific Data)

The North Temperate Lakes Long-Term Ecological Research (NTL-LTER) program has been extensively used to improve understanding of how aquatic ecosystems respond to environmental stressors, climate fluctuations, and human activities. Here, we report on the metagenomes of samples collected between 2000 and 2019 from Lake Mendota, a freshwater eutrophic lake within the NTL-LTER site. We utilized the distributed metagenome assembler MetaHipMer to coassemble over 10 terabases (Tbp) of data from 471 individual Illumina-sequenced metagenomes. A total of 95,523,664 contigs were assembled and binned to generate 1,894 non-redundant metagenome-assembled genomes (MAGs) with ≥50% completeness and ≤10% contamination. Phylogenomic analysis revealed that the MAGs were nearly exclusively bacterial, dominated by Pseudomonadota (Proteobacteria, N = 623) and Bacteroidota (N = 321). Nine eukaryotic MAGs were identified by eukCC with six assigned to the phylum Chlorophyta. Additionally, 6,350 high-quality viral sequences were identified by geNomad with the majority classified in the phylum Uroviricota. This expansive coassembled metagenomic dataset provides an unprecedented foundation to advance understanding of microbial communities in freshwater ecosystems and explore temporal ecosystem dynamics.
Scaling Generalized N-Body Problems, A Case Study from Genomics

https://doi.org/10.1145/3472456.3472517

Ellis, Marquita; Buluc, Aydin; Yelick, Katherine (August 2021, ICPP 2021: 50th International Conference on Parallel Processing)
Full Text Available
Asynchrony versus bulk-synchrony for a generalized N-body problem from genomics

https://doi.org/10.1145/3437801.3441580

Ellis, Marquita; Buluç, Aydın; Yelick, Katherine (February 2021, PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)
Full Text Available
Accelerating large scale de novo metagenome assembly using GPUs

https://doi.org/10.1145/3458817.3476212

Awan, Muaaz Gul; Hofmeyr, Steven; Egan, Rob; Ding, Nan; Buluc, Aydin; Deslippe, Jack; Oliker, Leonid; Yelick, Katherine (November 2021, The International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’21))

Full Text Available
Distributed-memory parallel algorithms for sparse times tall-skinny-dense matrix multiplication

https://doi.org/10.1145/3447818.3461472

Selvitopi, Oguz; Brock, Benjamin; Nisa, Israt; Tripathy, Alok; Yelick, Katherine; Buluç, Aydın (June 2021, ICS '21: Proceedings of the ACM International Conference on Supercomputing)
Full Text Available
