NSF PAR Search | NSF Public Access Repository

CellMeSH: probabilistic cell-type identification using indexed literature

https://doi.org/10.1093/bioinformatics/btab834

Mao, Shunfu; Zhang, Yue; Seelig, Georg; Kannan, Sreeram (February 2022, Bioinformatics)
Birol, Inanc (Ed.)
Abstract. Motivation: Single-cell RNA sequencing (scRNA-seq) is widely used for analyzing gene expression in multi-cellular systems and provides unprecedented access to cellular heterogeneity. scRNA-seq experiments aim to identify and quantify all cell types present in a sample. Measured single-cell transcriptomes are grouped by similarity and the resulting clusters are mapped to cell types based on cluster-specific gene expression patterns. While the process of generating clusters has become largely automated, annotation remains a laborious ad hoc effort that requires expert biological knowledge. Results: Here, we introduce CellMeSH, a new automated approach to identifying cell types for clusters based on prior literature. CellMeSH combines a database of gene–cell-type associations with a probabilistic method for database querying. The database is constructed by automatically linking gene and cell-type information from millions of publications using existing indexed literature resources. Compared to manually constructed databases, CellMeSH is more comprehensive and is easily updated with new data. The probabilistic query method enables reliable information retrieval even though the gene–cell-type associations extracted from the literature are noisy. CellMeSH is also able to optionally utilize prior knowledge about tissues or cells for further annotation improvement. CellMeSH achieves top-one and top-three accuracies on a number of mouse and human datasets that are consistently better than existing approaches. Availability and implementation: Web server at https://uncurl.cs.washington.edu/db_query and API at https://github.com/shunfumao/cellmesh. Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
Scalable preprocessing for sparse scRNA-seq data exploiting prior knowledge

https://doi.org/10.1093/bioinformatics/bty293

Mukherjee, Sumit; Zhang, Yue; Fan, Joshua; Seelig, Georg; Kannan, Sreeram (June 2018, Bioinformatics)

Abstract. Motivation: Single cell RNA-seq (scRNA-seq) data contains a wealth of information which has to be inferred computationally from the observed sequencing reads. As the ability to sequence more cells improves rapidly, existing computational tools suffer from three problems. (i) The decreased reads-per-cell implies a highly sparse sample of the true cellular transcriptome. (ii) Many tools simply cannot handle the size of the resulting datasets. (iii) Prior biological knowledge such as bulk RNA-seq information of certain cell types or qualitative marker information is not taken into account. Here we present UNCURL, a preprocessing framework based on non-negative matrix factorization for scRNA-seq data, that is able to handle varying sampling distributions, scales to very large cell numbers and can incorporate prior knowledge. Results: We find that preprocessing using UNCURL consistently improves performance of commonly used scRNA-seq tools for clustering, visualization and lineage estimation, both in the absence and presence of prior knowledge. Finally we demonstrate that UNCURL is extremely scalable and parallelizable, and runs faster than other methods on a scRNA-seq dataset containing 1.3 million cells. Availability and implementation: Source code is available at https://github.com/yjzhang/uncurl_python. Supplementary information: Supplementary data are available at Bioinformatics online.
Fundamental Limits of Multi-Sample Flow Graph Decomposition

https://doi.org/10.1109/ISIT50566.2022.9834518

Mazooji, Kayvon; Kannan, Sreeram; Noble, William Stafford; Shomorony, Ilan (July 2022, IEEE International Symposium on Information Theory)

Full Text Available
Interpreting neural networks for biological sequences by learning stochastic masks

https://doi.org/10.1038/s42256-021-00428-6

Linder, Johannes; La Fleur, Alyssa; Chen, Zibo; Ljubetič, Ajasja; Baker, David; Kannan, Sreeram; Seelig, Georg (January 2022, Nature Machine Intelligence)

Full Text Available
A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis

https://doi.org/10.1186/s12859-020-3401-5

Lin, Eugene; Mukherjee, Sudipto; Kannan, Sreeram (December 2020, BMC Bioinformatics)

Full Text Available
RefShannon: A genome-guided transcriptome assembler using sparse flow decomposition

https://doi.org/10.1371/journal.pone.0232946

Mao, Shunfu; Pachter, Lior; Tse, David; Kannan, Sreeram; Chen, Zhong-Hua (June 2020, PLOS ONE)

Full Text Available
LEARN Codes: Inventing Low-Latency Codes via Recurrent Neural Networks

https://doi.org/10.1109/JSAIT.2020.2988577

Jiang, Yihan; Kim, Hyeji; Asnani, Himanshu; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
Deepcode: Feedback Codes via Deep Learning

https://doi.org/10.1109/JSAIT.2020.2986752

Kim, Hyeji; Jiang, Yihan; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
Inferring Causal Gene Regulatory Networks from Coupled Single-Cell Expression Dynamics Using Scribe

https://doi.org/10.1016/j.cels.2020.02.003

Qiu, Xiaojie; Rahimzamani, Arman; Wang, Li; Ren, Bingcheng; Mao, Qi; Durham, Timothy; McFaline-Figueroa, José L.; Saunders, Lauren; Trapnell, Cole; Kannan, Sreeram (March 2020, Cell Systems)

Full Text Available
C-MI-GAN: Estimation of Conditional Mutual Information Using MinMax Formulation

Mondal, Arnab; Bhattacharjee, Arnab; Mukherjee, Sudipto; Asnani, Himanshu; Kannan, Sreeram; Prathosh, AP (January 2020, Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 2020)

Estimation of information-theoretic quantities such as mutual information and its conditional variant has drawn interest in recent times owing to their multifaceted applications. Newly proposed neural estimators for these quantities have overcome severe drawbacks of classical kNN-based estimators in high dimensions. In this work, we focus on conditional mutual information (CMI) estimation by utilizing its formulation as a minmax optimization problem. Such a formulation leads to a joint training procedure similar to that of generative adversarial networks. We find that our proposed estimator provides better estimates than the existing approaches on a variety of simulated datasets comprising linear and non-linear relations between variables. As an application of CMI estimation, we deploy our estimator for conditional independence (CI) testing on real data and obtain better results than state-of-the-art CI testers.
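For readers unfamiliar with how CMI can be posed as a minmax problem, the following is a sketch built from standard identities (the KL form of CMI and the Donsker-Varadhan variational bound). The symbols q (a candidate conditional distribution) and f (a critic function) are notation assumed here for illustration; the abstract does not state the exact objective, so this should not be read as the paper's precise formulation.

```latex
% CMI as a KL divergence between the joint and its conditionally independent counterpart
I(X;Y \mid Z) = D_{\mathrm{KL}}\big( p(x,y,z) \,\|\, p(x,z)\, p(y \mid z) \big)

% Replacing p(y|z) by any conditional q(y|z) can only increase the divergence
D_{\mathrm{KL}}\big( p(x,y,z) \,\|\, p(x,z)\, q(y \mid z) \big)
  = I(X;Y \mid Z)
  + \mathbb{E}_{p(z)}\Big[ D_{\mathrm{KL}}\big( p(y \mid z) \,\|\, q(y \mid z) \big) \Big]
  \;\ge\; I(X;Y \mid Z)

% Donsker-Varadhan representation of a KL divergence
D_{\mathrm{KL}}(P \,\|\, Q) = \sup_{f} \; \mathbb{E}_{P}[f] - \log \mathbb{E}_{Q}\big[ e^{f} \big]

% Combining the two gives a minmax characterization of CMI
I(X;Y \mid Z) = \min_{q} \, \max_{f} \;
  \mathbb{E}_{p(x,y,z)}\big[ f(x,y,z) \big]
  - \log \mathbb{E}_{p(x,z)\, q(y \mid z)}\big[ e^{f(x,y,z)} \big]
```

The inner maximization plays the role of a critic and the outer minimization that of a generator for q(y|z), which is consistent with the GAN-like joint training the abstract describes.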
Full Text Available
