NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

HQAlign: aligning nanopore reads for SV detection using current-level modeling

https://doi.org/10.1093/bioinformatics/btad580

Joshi, Dhaivat; Diggavi, Suhas; Chaisson, Mark J; Kannan, Sreeram (October 2023, Bioinformatics)
Alkan, Can (Ed.)
Abstract MotivationDetection of structural variants (SVs) from the alignment of sample DNA reads to the reference genome is an important problem in understanding human diseases. Long reads that can span repeat regions, along with an accurate alignment of these long reads play an important role in identifying novel SVs. Long-read sequencers, such as nanopore sequencing, can address this problem by providing very long reads but with high error rates, making accurate alignment challenging. Many errors induced by nanopore sequencing have a bias because of the physics of the sequencing process and proper utilization of these error characteristics can play an important role in designing a robust aligner for SV detection problems. In this article, we design and evaluate HQAlign, an aligner for SV detection using nanopore sequenced reads. The key ideas of HQAlign include (i) using base-called nanopore reads along with the nanopore physics to improve alignments for SVs, (ii) incorporating SV-specific changes to the alignment pipeline, and (iii) adapting these into existing state-of-the-art long-read aligner pipeline, minimap2 (v2.24), for efficient alignments. ResultsWe show that HQAlign captures about 4%–6% complementary SVs across different datasets, which are missed by minimap2 alignments while having a standalone performance at par with minimap2 for real nanopore reads data. For the common SV calls between HQAlign and minimap2, HQAlign improves the start and the end breakpoint accuracy by about 10%–50% for SVs across different datasets. Moreover, HQAlign improves the alignment rate to 89.35% from minimap2 85.64% for nanopore reads alignment to recent telomere-to-telomere CHM13 assembly, and it improves to 86.65% from 83.48% for nanopore reads alignment to GRCh37 human genome. Availability and implementationhttps://github.com/joshidhaivat/HQAlign.git.
more » « less
Full Text Available
A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis

https://doi.org/10.1186/s12859-020-3401-5

Lin, Eugene; Mukherjee, Sudipto; Kannan, Sreeram (December 2020, BMC Bioinformatics)

Full Text Available
RefShannon: A genome-guided transcriptome assembler using sparse flow decomposition

https://doi.org/10.1371/journal.pone.0232946

Mao, Shunfu; Pachter, Lior; Tse, David; Kannan, Sreeram; Chen, Zhong-Hua (June 2020, PLOS ONE)

Full Text Available
LEARN Codes: Inventing Low-Latency Codes via Recurrent Neural Networks

https://doi.org/10.1109/JSAIT.2020.2988577

Jiang, Yihan; Kim, Hyeji; Asnani, Himanshu; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
Deepcode: Feedback Codes via Deep Learning

https://doi.org/10.1109/JSAIT.2020.2986752

Kim, Hyeji; Jiang, Yihan; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
Inferring Causal Gene Regulatory Networks from Coupled Single-Cell Expression Dynamics Using Scribe

https://doi.org/10.1016/j.cels.2020.02.003

Qiu, Xiaojie; Rahimzamani, Arman; Wang, Li; Ren, Bingcheng; Mao, Qi; Durham, Timothy; McFaline-Figueroa, José L.; Saunders, Lauren; Trapnell, Cole; Kannan, Sreeram (March 2020, Cell Systems)

Full Text Available
Learning in Gated Neural Networks

Makkuva, Ashok Vardhan; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (January 2020, Proceedings of the 23rdInternational Conference on Artificial Intelligence and Statistics (AISTATS) 2020, Palermo, Italy. PMLR: Volume 108.)

Gating is a key feature in modern neural networks including LSTMs, GRUs and sparselygated deep neural networks. The backbone of such gated networks is a mixture-of-experts layer, where several experts make regression decisions and gating controls how to weigh the decisions in an input-dependent manner. Despite having such a prominent role in both modern and classical machine learning, very little is understood about parameter recovery of mixture-of-experts since gradient descent and EM algorithms are known to be stuck in local optima in such models. In this paper, we perform a careful analysis of the optimization landscape and show that with appropriately designed loss functions, gradient descent can indeed learn the parameters of a MoE accurately. A key idea underpinning our results is the design of two distinct loss functions, one for recovering the expert parameters and another for recovering the gating parameters. We demonstrate the first sample complexity results for parameter recovery in this model for any algorithm and demonstrate significant performance gains over standard loss functions in numerical experiments
more » « less
Full Text Available
C-MI-GAN : Estimation of Conditional Mutual Information Using MinMax Formulation

Mondal, Arnab; Bhattacharjee, Arnab; Mukherjee, Sudipto; Asnani, Himanshu; Kannan, Sreeram; Prathosh, AP (January 2020, Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 2020)

Estimation of information theoretic quantities such as mutual information and its conditional variant has drawn interest in recent times owing to their multifaceted applications. Newly proposed neural estimators for these quantities have overcome severe drawbacks of classical kNN-based estimators in high dimensions. In this work, we focus on conditional mutual information (CMI) estimation by utilizing its formulation as a minmax optimization problem. Such a formulation leads to a joint training procedure similar to that of generative adversarial networks. We find that our proposed estimator provides better estimates than the existing approaches on a variety of simulated datasets comprising linear and non-linear relations between variables. As an application of CMI estimation, we deploy our estimator for conditional independence (CI) testing on real data and obtain better results than state-of-the-art CI testers
more » « less
Full Text Available
ClusterGAN : Latent Space Clustering in Generative Adversarial Networks

Mukherjee, Sudipto; Asnani, Himanshu; Lin, Eugene; Kannan, Sreeram (January 2019, Proceedings of the ... AAAI Conference on Artificial Intelligence)

Full Text Available
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms

Makkuva, Ashok Vardhan; Oh, Sewoong; Kannan, Sreeram; Viswanath, Pramod (January 2019, International Conference on Machine Learning)

Mixture-of-Experts (MoE) is a widely popular model for ensemble learning and is a basic building block of highly successful modern neural networks as well as a component in Gated Recurrent Units (GRU) and Attention networks. However, present algorithms for learning MoE, including the EM algorithm and gradient descent, are known to get stuck in local optima. From a theoretical viewpoint, finding an efficient and provably consistent algorithm to learn the parameters remains a long standing open problem for more than two decades. In this paper, we introduce the first algorithm that learns the true parameters of a MoE model for a wide class of non-linearities with global consistency guarantees. While existing algorithms jointly or iteratively estimate the expert parameters and the gating parameters in the MoE, we propose a novel algorithm that breaks the deadlock and can directly estimate the expert parameters by sensing its echo in a carefully designed cross-moment tensor between the inputs and the output. Once the experts are known, the recovery of gating parameters still requires an EM algorithm; however, we show that the EM algorithm for this simplified problem, unlike the joint EM algorithm, converges to the true parameters. We empirically validate our algorithm on both the synthetic and real data sets in a variety of settings, and show superior performance to standard baselines.
more » « less
Full Text Available

« Prev Next »

Search for: All records