NSF PAR Search | NSF Public Access Repository

Scaling Generalized N-Body Problems, A Case Study from Genomics

https://doi.org/10.1145/3472456.3472517

Ellis, Marquita; Buluç, Aydın; Yelick, Katherine (August 2021, ICPP 2021: 50th International Conference on Parallel Processing)
Full Text Available
Asynchrony versus bulk-synchrony for a generalized N-body problem from genomics

https://doi.org/10.1145/3437801.3441580

Ellis, Marquita; Buluç, Aydın; Yelick, Katherine (February 2021, PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)
Full Text Available
Distributed-Memory k-mer Counting on GPUs

https://doi.org/10.1109/IPDPS49936.2021.00061

Nisa, Israt; Pandey, Prashant; Ellis, Marquita; Oliker, Leonid; Buluç, Aydın; Yelick, Katherine (May 2021, International Parallel and Distributed Processing Symposium)
Full Text Available
LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment

https://doi.org/10.1109/IPDPS47924.2020.00055

Zeni, Alberto; Guidi, Giulia; Ellis, Marquita; Ding, Nan; Santambrogio, Marco D.; Hofmeyr, Steven; Buluç, Aydın; Oliker, Leonid; Yelick, Katherine (May 2020, 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS))

Pairwise sequence alignment is one of the most computationally intensive kernels in genomic data analysis, accounting for more than 90% of the runtime for key bioinformatics applications. This method is particularly expensive for third-generation sequences due to the high computational cost of analyzing sequences of length between 1 Kb and 1 Mb. Given the quadratic overhead of exact pairwise algorithms for long alignments, the community primarily relies on approximate algorithms that search only for high-quality alignments and stop early when one is not found. In this work, we present the first GPU optimization of the popular X-drop alignment algorithm, which we named LOGAN. Results show that our high-performance multi-GPU implementation achieves up to 181.6 GCUPS and speed-ups of up to 6.6× and 30.7× using 1 and 6 NVIDIA Tesla V100 GPUs, respectively, over the state-of-the-art software running on two IBM Power9 processors using 168 CPU threads, with equivalent accuracy. We also demonstrate a 2.3× LOGAN speed-up versus ksw2, a state-of-the-art vectorized algorithm for sequence alignment implemented in minimap2, a long-read mapping software. To highlight the impact of our work on a real-world application, we couple LOGAN with a many-to-many long-read alignment software called BELLA, and demonstrate that our implementation improves the overall BELLA runtime by up to 10.6×. Finally, we adapt the Roofline model for LOGAN and demonstrate that our implementation is near optimal on the NVIDIA Tesla V100s.
Full Text Available
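
The LOGAN abstract above describes X-drop alignment only in words: extend an alignment, and stop early once no candidate stays within X of the best score seen so far. The sketch below illustrates that pruning rule on a plain CPU dynamic program. It is not LOGAN's GPU implementation; the function name `xdrop_extend`, the linear gap penalty, and the scoring values are assumptions chosen for illustration.

```python
NEG_INF = float("-inf")


def xdrop_extend(query, target, x_drop, match=1, mismatch=-1, gap=-1):
    """Return the best gapped extension score of query vs. target from position 0.

    Illustrative sketch: cells scoring more than x_drop below the best score
    found in earlier rows are pruned, and extension stops when a whole row is pruned.
    """
    best = 0
    n = len(target)

    # Row 0: only gaps in the query; stop filling once the score drops too far.
    prev = [NEG_INF] * (n + 1)
    prev[0] = 0
    for j in range(1, n + 1):
        score = prev[j - 1] + gap
        if score < best - x_drop:
            break
        prev[j] = score

    for i in range(1, len(query) + 1):
        curr = [NEG_INF] * (n + 1)
        row_best = NEG_INF

        score = prev[0] + gap
        if score >= best - x_drop:
            curr[0] = score
            row_best = score

        # For clarity every column is visited; a real implementation would
        # track only the band of live (unpruned) cells.
        for j in range(1, n + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score = max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            if score >= best - x_drop:  # the X-drop test
                curr[j] = score
                row_best = max(row_best, score)

        if row_best == NEG_INF:
            break  # every cell dropped: give up on this extension early
        best = max(best, row_best)
        prev = curr

    return best


print(xdrop_extend("ACGT", "ACGT", x_drop=3))
print(xdrop_extend("AAAAAAAA", "TTTTTTTT", x_drop=2))
```

With these assumed scores, identical sequences are extended to their full length, while two unrelated sequences are abandoned after a few rows, which is the early-termination behavior that lets approximate aligners avoid the quadratic cost of exact alignment on poor candidates.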
The parallelism motifs of genomic data analysis

https://doi.org/10.1098/rsta.2019.0394

Yelick, Katherine; Buluç, Aydın; Awan, Muaaz; Azad, Ariful; Brock, Benjamin; Egan, Rob; Ekanayake, Saliya; Ellis, Marquita; Georganas, Evangelos; Guidi, Giulia; et al (March 2020, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences)

Genomic datasets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share these data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or ‘motifs’ that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
Full Text Available
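
The abstract above names hashing as a key pattern in genomic data analysis. One common instance is k-mer counting, where every length-k substring of each read is inserted into a hash table. The sketch below is a minimal single-node illustration, not code from the article; `count_kmers` and `owner_rank` are hypothetical names, and the CRC32-based partitioning only hints at how keys might be assigned to processes in a distributed hash table.

```python
import zlib
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads using a hash table."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def owner_rank(kmer, num_ranks):
    """Hypothetical partitioning rule: hash the k-mer to pick the owning process."""
    return zlib.crc32(kmer.encode("ascii")) % num_ranks


counts = count_kmers(["ACGTAC", "CGTACG"], k=3)
for kmer in sorted(counts):
    print(kmer, counts[kmer], owner_rank(kmer, num_ranks=4))
```

In a distributed setting, each process would send a k-mer to the rank returned by a rule like `owner_rank`, which produces the irregular, asynchronous updates to shared data structures that the abstract describes.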
