NSF PAR Search | NSF Public Access Repository

Excvate: Spoofing Exceptions and Solving Constraints to Test Exception Handling in Numerical Libraries

https://doi.org/10.1109/ARITH64983.2025.00026

Vanover, Jackson; Demmel, James; Li, Xiaoye Sherry; Rubio-González, Cindy (May 2025, IEEE)

Free, publicly-accessible full text available May 4, 2026
GSOFA: Scalable Sparse Symbolic LU Factorization on GPUs

https://doi.org/10.1109/TPDS.2021.3090316

Gaihre, Anil; Li, Xiaoye S; Liu, Hang (April 2022, IEEE transactions on parallel and distributed systems)

Full Text Available
Proposed Consistent Exception Handling for the BLAS and LAPACK

https://doi.org/10.1109/Correctness56720.2022.00006

Demmel, James; Dongarra, Jack; Gates, Mark; Henry, Greg; Langou, Julien; Li, Xiaoye; Luszczek, Piotr; Pereira, Weslley; Riedy, Jason; Rubio-Gonzalez, Cindy (November 2022, Sixth International Workshop on Software Correctness for HPC Applications (Correctness 2022), a workshop of the ACM/IEEE SC 2022 Conference (SC'22), Dallas, TX, USA, November 13-18, 2022)

Numerical exceptions, which may be caused by overflow, operations like division by 0 or sqrt(−1), or convergence failures, are unavoidable in many cases, in particular when software is used on unforeseen and difficult inputs. As more aspects of society become automated (e.g., self-driving cars, health monitors, and cyber-physical systems more generally), it is becoming increasingly important to design software that is resilient to exceptions and that responds to them in a consistent way. Consistency is needed to allow users to build higher-level software that is also resilient and consistent (and so on recursively). In this paper we explore the design space of consistent exception handling for the widely used BLAS and LAPACK linear algebra libraries, pointing out a variety of instances of inconsistent exception handling in the current versions, and propose a new design that balances consistency, complexity, ease of use, and performance. Some compromises are needed, because there are preexisting inconsistencies that are outside our control, including in or between existing vendor BLAS implementations, different programming languages, and even compilers for the same programming language. User requests from our surveys are also quite diverse. We also propose our design as a possible model for other numerical software, and welcome comments on our design choices.
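As a rough illustration of why consistent handling matters, the sketch below shows how an exceptional value (here a NaN) can silently disappear in a comparison-based reduction. It is a minimal example under stated assumptions, not code from the paper or from BLAS/LAPACK; the function names `naive_max_abs` and `nan_aware_max_abs` are hypothetical.

```python
import math


def naive_max_abs(values):
    """Comparison-based max of |x|.

    Any comparison involving NaN is False, so a NaN input is skipped
    silently and the caller never learns that the data was exceptional.
    """
    best = 0.0
    for x in values:
        if abs(x) > best:
            best = abs(x)
    return best


def nan_aware_max_abs(values):
    """Same reduction, but a NaN input is propagated to the result."""
    best = 0.0
    for x in values:
        if math.isnan(x):
            return math.nan
        if abs(x) > best:
            best = abs(x)
    return best


# Overflow in floating-point multiplication yields Inf; Inf - Inf yields NaN.
overflowed = 1e308 * 10.0
not_a_number = overflowed - overflowed

data = [1.0, not_a_number, 3.0]
print(naive_max_abs(data))      # the NaN is lost
print(nan_aware_max_abs(data))  # the NaN is reported
```

The two functions agree on ordinary inputs and differ only on exceptional ones, which is the kind of behavior a caller cannot rely on unless the library documents it.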
Full Text Available
Dr. Top-k: Delegate-Centric Top-k on GPUs

https://doi.org/10.1145/3458817.3476141

Gaihre, Anil; Zheng, Da; Weitze, Scott; Li, Lingda; Song, Shuaiwen; Ding, Caiwen; Li, Xiaoye; Liu, Hang (November 2021, The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 21))

Full Text Available
TRUST: Triangle Counting Reloaded on GPUs

https://doi.org/10.1109/TPDS.2021.3064892

Pandey, Santosh; Wang, Zhibin; Zhong, Sheng; Tian, Chen; Zheng, Bolong; Li, Xiaoye; Li, Lingda; Hoisie, Adolfy; Ding, Caiwen; Li, Dong; et al (October 2021, IEEE transactions on parallel and distributed systems)

Full Text Available
C-SAW: a framework for graph sampling and random walk on GPUs

Pandey, Santosh; Li, Lingda; Hoisie, Adolfy; Li, Xiaoye S.; Liu, Hang (November 2020, SC '20: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis)
Full Text Available
A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems

https://doi.org/10.1016/j.jpdc.2019.03.004

Sao, Piyush; Li, Xiaoye S.; Vuduc, Richard (September 2019, Journal of Parallel and Distributed Computing)

Full Text Available
A communication-avoiding 3D sparse triangular solver

https://doi.org/10.1145/3330345.3330357

Sao, Piyush; Kannan, Ramakrishnan; Li, Xiaoye Sherry; Vuduc, Richard (January 2019, ACM International Conference on Supercomputing)

Full Text Available
A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices

Sao, Piyush; Li, Xiaoye; Vuduc, Richard (May 2018, Proceedings - IEEE International Parallel and Distributed Processing Symposium)

We propose a new algorithm to improve the strong scalability of right-looking sparse LU factorization on distributed memory systems. Our 3D sparse LU algorithm uses a three-dimensional MPI process grid, aggressively exploits elimination tree parallelism, and trades off increased memory for reduced per-process communication. We also analyze the asymptotic improvements for planar graphs (e.g., from 2D grid or mesh domains) and certain non-planar graphs (specifically for 3D grids and meshes). For planar graphs with n vertices, our algorithm reduces communication volume asymptotically in n by a factor of O(sqrt(log n)) and latency by a factor of O(log n). For non-planar cases, our algorithm can reduce the per-process communication volume by 3× and latency by a factor of O(n^(1/3)). In all cases, the memory needed to achieve these gains is a constant factor. We implemented our algorithm by extending the 2D data structure used in SuperLU_DIST. Our new 3D code achieves speedups up to 27× for planar graphs and up to 3.3× for non-planar graphs over the baseline 2D SuperLU_DIST when run on 24,000 cores of a Cray XC30.
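To make the idea of a three-dimensional process grid concrete, the sketch below maps a flat process rank to (x, y, z) coordinates in a px × py × pz grid and back. The x-fastest ordering and the function names are assumptions chosen for illustration; this is not the layout used by SuperLU_DIST or described in the paper.

```python
def rank_to_coords(rank, px, py, pz):
    """Map a flat rank to (x, y, z) in a px * py * pz grid, x varying fastest."""
    if not 0 <= rank < px * py * pz:
        raise ValueError("rank outside the process grid")
    z, rem = divmod(rank, px * py)  # which 2D layer the rank belongs to
    y, x = divmod(rem, px)          # position inside that 2D layer
    return x, y, z


def coords_to_rank(x, y, z, px, py):
    """Inverse mapping: (x, y, z) back to the flat rank."""
    return z * px * py + y * px + x


# Hypothetical 2 x 3 x 4 grid with 24 processes.
px, py, pz = 2, 3, 4
for rank in (0, 5, 6, 23):
    print(rank, rank_to_coords(rank, px, py, pz))
```

In this picture each value of z selects one 2D layer of px × py processes, which is one way to think about stacking several 2D grids into a 3D one.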
Full Text Available