NSF PAR Search | NSF Public Access Repository

DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing

https://doi.org/10.1038/s41467-023-39784-9

Ni, Peng; Nie, Fan; Zhong, Zeyu; Xu, Jinrui; Huang, Neng; Zhang, Jun; Zhao, Haochen; Zou, You; Huang, Yuanfeng; Li, Jinchen; et al (July 2023, Nature Communications)

Long single-molecular sequencing technologies, such as PacBio circular consensus sequencing (CCS) and nanopore sequencing, are advantageous in detecting DNA 5-methylcytosine in CpGs (5mCpGs), especially in repetitive genomic regions. However, existing methods for detecting 5mCpGs using PacBio CCS are less accurate and robust. Here, we present ccsmeth, a deep-learning method to detect DNA 5mCpGs using CCS reads. We sequence polymerase-chain-reaction treated and M.SssI-methyltransferase treated DNA of one human sample using PacBio CCS for training ccsmeth. Using long (≥10 Kb) CCS reads, ccsmeth achieves 0.90 accuracy and 0.97 Area Under the Curve on 5mCpG detection at single-molecule resolution. At the genome-wide site level, ccsmeth achieves >0.90 correlations with bisulfite sequencing and nanopore sequencing using only 10× reads. Furthermore, we develop a Nextflow pipeline, ccsmethphase, to detect haplotype-aware methylation using CCS reads, and then sequence a Chinese family trio to validate it. ccsmeth and ccsmethphase can be robust and accurate tools for detecting DNA 5-methylcytosines.
Accelerated dynamic data reduction using spatial and temporal properties

https://doi.org/10.1177/10943420231180504

Hickman Fulp, Megan; Fulp, Dakota; Zou, Changfeng; Sanders, Cooper; Biswas, Ayan; Smith, Melissa C.; Calhoun, Jon C. (September 2023, The International Journal of High Performance Computing Applications)

Due to improvements in high-performance computing (HPC) capabilities, many of today’s applications produce petabytes worth of data, causing bottlenecks within the system. Importance-based sampling methods, including our spatio-temporal hybrid data sampling method, are capable of resolving these bottlenecks. While our hybrid method has been shown to outperform existing methods, its effectiveness relies heavily on user parameters, such as histogram bins, error threshold, or number of regions. Moreover, the throughput it demonstrates must be higher to avoid becoming a bottleneck itself. In this article, we resolve both of these issues. First, we assess the effects of several user input parameters and detail techniques to help determine optimal parameters. Next, we detail and implement accelerated versions of our method using OpenMP and CUDA. Upon analyzing our implementations, we find 9.8× to 31.5× throughput improvements. Next, we demonstrate how our method can accept different base sampling algorithms and the effects these different algorithms have. Finally, we compare our sampling methods to the lossy compressor cuSZ in terms of data preservation and data movement.
Full Text Available
Black-box statistical prediction of lossy compression ratios for scientific data

https://doi.org/10.1177/10943420231179417

Underwood, Robert; Bessac, Julie; Krasowska, David; Calhoun, Jon C.; Di, Sheng; Cappello, Franck (June 2023, The International Journal of High Performance Computing Applications)

Lossy compressors are increasingly adopted in scientific research, tackling volumes of data from experiments or parallel numerical simulations and facilitating data storage and movement. In contrast with the notion of entropy in lossless compression, no theoretical or data-based quantification of lossy compressibility exists for scientific data. Users rely on trial and error to assess lossy compression performance. As a strong data-driven effort toward quantifying lossy compressibility of scientific datasets, we provide a statistical framework to predict compression ratios of lossy compressors. Our method is a two-step framework where (i) compressor-agnostic predictors are computed and (ii) statistical prediction models relying on these predictors are trained on observed compression ratios. Proposed predictors exploit spatial correlations and notions of entropy and lossyness via the quantized entropy. We study 8+ compressors on 6 scientific datasets and achieve a median percentage prediction error less than 12%, which is substantially smaller than that of other methods while achieving at least an 8.8× speedup for searching for a specific compression ratio and a 7.8× speedup for determining the best compressor out of a collection.
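As a rough illustration of the kind of "quantized entropy" predictor this abstract mentions, the Python sketch below computes the Shannon entropy of a dataset after uniform quantization. It is not the authors' implementation: the function name `quantized_entropy`, the choice of a bin width equal to twice the absolute error bound, and the use of plain Python lists are all assumptions made for this example.

```python
import math
from collections import Counter


def quantized_entropy(values, error_bound):
    """Shannon entropy (bits per value) after uniform quantization.

    Assumption for this sketch: values are binned with width
    2 * error_bound, mimicking an absolute-error-bounded quantizer.
    """
    if error_bound <= 0:
        raise ValueError("error_bound must be positive")
    bin_width = 2.0 * error_bound
    # Map each value to the index of the quantization bin it falls in.
    codes = [math.floor(v / bin_width) for v in values]
    counts = Counter(codes)
    n = len(codes)
    # Entropy of the empirical distribution of bin indices.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A looser error bound merges more values into the same bin, so the quantized entropy falls. A lower value suggests the data may be easier to compress at that bound, which is the intuition behind using it as a compressor-agnostic predictor.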
Nanogel Degradation at Soft Interfaces and in Bulk: Tracking Shape Changes and Interfacial Spreading

https://doi.org/10.1021/acs.macromol.2c02470

Palkar, Vaibhav; Thakar, Devanshu; Kuksenok, Olga (February 2023, Macromolecules)

Full Text Available
Estimating Potential Error in Sampling Interpolation

https://doi.org/10.1109/BigData55660.2022.10020913

Fulp, Megan Hickman; Fulp, Dakota; Calhoun, Jon C. (December 2022, 2022 IEEE International Conference on Big Data (Big Data))

Full Text Available
Photocontrol of pattern formation and hysteresis loops in polymer gels with host-guest interactions

https://doi.org/10.1016/j.isci.2022.105606

Xiong, Yao; Kuksenok, Olga (December 2022, iScience)

Full Text Available
Exploring Data Reduction Techniques for Additive Manufacturing Analysis

https://doi.org/10.1109/DRBSD56682.2022.00008

Nichols, Coleman; Fulp, Megan Hickman; DeBardeleben, Nathan; Calhoun, Jon C. (November 2022, 2022 8th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-7))

Full Text Available
An Efficient B-Spline Lagrangian/Eulerian Method for Compressible Flow, Shock Waves, and Fracturing Solids

https://doi.org/10.1145/3519595

Cao, Yadi; Chen, Yunuo; Li, Minchen; Yang, Yin; Zhang, Xinxin; Aanjaneya, Mridul; Jiang, Chenfanfu (October 2022, ACM Transactions on Graphics)

This study presents a new method for modeling the interaction between compressible flow, shock waves, and deformable structures, emphasizing destructive dynamics. Extending advances in time-splitting compressible flow and the Material Point Methods (MPM), we develop a hybrid Eulerian and Lagrangian/Eulerian scheme for monolithic flow-structure interactions. We adopt the second-order WENO scheme to advance the continuity equation. To stably resolve deforming boundaries with sub-cell particles, we propose a blending treatment of reflective and passable boundary conditions inspired by the theory of porous media. The strongly coupled velocity-pressure system is discretized with a new mixed-order finite element formulation employing B-spline shape functions. Shock wave propagation, temperature/density-induced buoyancy effects, and topology changes in solids are unitedly captured.
Full Text Available
Stale Data Analysis in Intelligent Transportation Platooning Models

https://doi.org/10.1109/UEMCON54665.2022.9965630

Holt, Cavender; Calhoun, Jon C. (October 2022, 2022 IEEE 13th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON))

Full Text Available
Effect of Fluorophobic Character upon Switching Nanoparticles in Polymer Films from Aggregated to Dispersed States Using Immersion Annealing

https://doi.org/10.1021/acsapm.2c00968

Zhang, Mengxue; Larison, Taylor; Tu, Sidong; Kuksenok, Olga; Stefik, Morgan (October 2022, ACS Applied Polymer Materials)

Full Text Available
