skip to main content

Search for: All records

Creators/Authors contains: "Tang, Xulong"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available October 28, 2024
  2. Abstract Motivation

    The third-generation DNA sequencing technologies, such as Nanopore Sequencing, can operate at very high speeds and produce longer reads, which in turn results in a challenge for the computational analysis of such massive data. Nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data. Call-methylation module of Nanopolish can detect methylation based on Hidden Markov Model (HMM). However, Nanopolish is limited by the long running time of some serial and computationally expensive processes. Among these, Adaptive Banded Event Alignment (ABEA) is the most time-consuming step, and the prior work, f5c, has already parallelized and optimized ABEA on GPU. As a result, the remaining methylation score calculation part, which uses HMM to identify if a given base is methylated or not, has become the new performance bottleneck.


    This article focuses on the call-methylation module that resides in the Nanopolish package. We propose Galaxy-methyl, which parallelizes and optimizes the methylation score calculation step on GPU and then pipelines the four steps of the call-methylation module. Galaxy-methyl increases the execution concurrency across CPUs and GPUs as well as hardware resource utilization for both. The experimental results collected indicate that Galaxy-methyl can achieve 3×–5× speedup compared with Nanopolish, and reduce the total execution time by 35% compared with f5c, on average.

    Availability and implementation

    The source code of Galaxy-methyl is available at

    more » « less
  3. Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction on certain types of DNN layers. In this article, we propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations that are applicable to any type of DNN layer while achieving high accuracy and hardware inference performance. With the flexibility of applying different pruning schemes to different layers enabled by our compiler optimizations, we further probe into the new problem of determining the best-suited pruning scheme considering the different acceleration and accuracy performance of various pruning schemes. Two pruning scheme mapping methods—one -search based and the other is rule based—are proposed to automatically derive the best-suited pruning regularity and block size for each layer of any given DNN. Experimental results demonstrate that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework with up to 2.48 \( \times \) and 1.73 \( \times \) DNN inference acceleration on CIFAR-10 and ImageNet datasets without accuracy loss. 
    more » « less