scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data

Song, Dongyuan; Xi, Nan Miles (ORCID:0000000331268838); Li, Jingyi Jessica (ORCID:0000000292885648); Wang, Lin (ORCID:0000000308886232); Vitek, Olga (ed.)

doi:10.1093/bioinformatics/btac271

Abstract

Summary: The number of cells measured in single-cell transcriptomic data has grown rapidly in recent years. For such large-scale data, subsampling is a powerful and often necessary tool for exploratory data analysis. However, simple random subsampling is not ideal for preserving rare cell types. Therefore, diversity-preserving subsampling is required for fast exploration of cell types in a large-scale dataset. Here, we propose scSampler, an algorithm for fast diversity-preserving subsampling of single-cell transcriptomic data.

Availability and implementation: scSampler is implemented in Python and is published under the MIT license. It can be installed with “pip install scsampler” and used with the Scanpy pipeline. The code is available on GitHub: https://github.com/SONGDONGYUAN1994/scsampler. An R interface is available at: https://github.com/SONGDONGYUAN1994/rscsampler.

Supplementary information: Supplementary data are available at Bioinformatics online.
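Illustration: the sketch below is one way to see why random subsampling tends to miss rare cell types while a diversity-preserving scheme does not. It uses greedy farthest-point (maximin) sampling on synthetic data as a stand-in. This is not scSampler's algorithm, which the abstract does not describe, and the function name, the toy data, and all parameters are assumptions made only for illustration.

```python
import numpy as np


def farthest_point_subsample(X, n_samples, seed=0):
    """Greedy maximin subsampling (hypothetical helper, not scSampler's method).

    Starts from a random point, then repeatedly adds the point that is
    farthest from everything chosen so far.
    """
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(X.shape[0]))]
    # Distance from every point to its nearest already-chosen point.
    min_dist = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)


# Synthetic "cells": 5000 from an abundant type, 20 from a rare, distant type.
rng = np.random.default_rng(1)
abundant = rng.normal(0.0, 1.0, size=(5000, 10))
rare = rng.normal(8.0, 1.0, size=(20, 10))
X = np.vstack([abundant, rare])
is_rare = np.arange(X.shape[0]) >= 5000

# Uniform random subsample versus the diversity-preserving sketch.
random_idx = rng.choice(X.shape[0], size=50, replace=False)
diverse_idx = farthest_point_subsample(X, 50)

print("rare cells kept, random: ", int(is_rare[random_idx].sum()))
print("rare cells kept, diverse:", int(is_rare[diverse_idx].sum()))
```

With only 20 rare cells among 5020, a uniform draw of 50 cells is expected to contain well under one rare cell, whereas the maximin rule reaches the distant rare cluster within its first two picks. The pairwise distance updates here cost O(n) per selected cell, so this is a teaching sketch rather than something tuned for datasets with millions of cells.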

Award ID(s):: 1846216; 2113754

PAR ID:: 10400669

Author(s) / Creator(s):: Song, Dongyuan; Xi, Nan Miles; Li, Jingyi Jessica; Wang, Lin; Vitek, Olga (ed.)

Publisher / Repository:: Oxford University Press

Date Published:: 2022-04-15

Journal Name:: Bioinformatics

Volume:: 38

Issue:: 11

ISSN:: 1367-4803

Format(s):: Medium: X

Size(s):: p. 3126-3127

Sponsoring Org:: National Science Foundation

Journal Article:
https://doi.org/10.1093/bioinformatics/btac271