skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Hi-C chromosome conformation capture sequencing of avian genomes using the BGISEQ-500 platform
Abstract Background Hi-C experiments couple DNA-DNA proximity with next-generation sequencing to yield an unbiased description of genome-wide interactions. Previous methods describing Hi-C experiments have focused on the industry-standard Illumina sequencing. With new next-generation sequencing platforms such as BGISEQ-500 becoming more widely available, protocol adaptations to fit platform-specific requirements are useful to give increased choice to researchers who routinely generate sequencing data. Results We describe an in situ Hi-C protocol adapted to be compatible with the BGISEQ-500 high-throughput sequencing platform. Using zebra finch (Taeniopygia guttata) as a biological sample, we demonstrate how Hi-C libraries can be constructed to generate informative data using the BGISEQ-500 platform, following circularization and DNA nanoball generation. Our protocol is a modification of an Illumina-compatible method, based around blunt-end ligations in library construction, using un-barcoded, distally overhanging double-stranded adapters, followed by amplification using indexed primers. The resulting libraries are ready for circularization and subsequent sequencing on the BGISEQ series of platforms and yield data similar to what can be expected using Illumina-compatible approaches. Conclusions Our straightforward modification to an Illumina-compatible in situHi-C protocol enables data generation on the BGISEQ series of platforms, thus expanding the options available for researchers who wish to utilize the powerful Hi-C techniques in their research.  more » « less
Award ID(s):
2019745
PAR ID:
10246721
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
GigaScience
Volume:
9
Issue:
8
ISSN:
2047-217X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Next-generation sequencing technologies, such as Nanopore MinION, Illumina Hiseq and Novaseq, and PacBio Sequel II, hold immense potential for advancing genomic research on non-model organisms, including the vast majority of marine species. However, application of these technologies to marine invertebrate species is often impeded by challenges in extracting and purifying their genomic DNA due to high polysaccharide content and other secondary metabolites. In this study, we help resolve this issue by developing and testing DNA extraction protocols for Kellet’s whelk (Kelletia kelletii), a subtidal gastropod with ecological and commercial importance, by comparing four DNA extraction methods commonly used in marine invertebrate studies. In our comparison of extraction methods, the Salting Out protocol was the least expensive, produced the highest DNA yields, produced consistent high DNA quality, and had low toxicity. We validated the protocol using an independent set of tissue samples, then applied it to extract high-molecular-weight (HMW) DNA from over three thousand Kellet’s whelk tissue samples. The protocol demonstrated scalability and, with added clean-up, suitability for RAD-seq, GT-seq, as well as whole genome sequencing using both long read (ONT MinION) and short read (Illumina NovaSeq) sequencing platforms. Our findings offer a robust and versatile DNA extraction and clean-up protocol for supporting genomic research on non-model marine organisms, to help mediate the under-representation of invertebrates in genomic studies. 
    more » « less
  2. High-throughput short-read sequencing has taken on a central role in research and diagnostics. Hundreds of different assays take advantage of Illumina short-read sequencers, the predominant short-read sequencing technology available today. Although other short-read sequencing technologies exist, the ubiquity of Illumina sequencers in sequencing core facilities and the high capital costs of these technologies have limited their adoption. Among a new generation of sequencing technologies, Oxford Nanopore Technologies (ONT) holds a unique position because the ONT MinION, an error-prone long-read sequencer, is associated with little to no capital cost. Here we show that we can make short-read Illumina libraries compatible with the ONT MinION by using the rolling circle to concatemeric consensus (R2C2) method to circularize and amplify the short library molecules. This results in longer DNA molecules containing tandem repeats of the original short library molecules. This longer DNA is ideally suited for the ONT MinION, and after sequencing, the tandem repeats in the resulting raw reads can be converted into high-accuracy consensus reads with similar error rates to that of the Illumina MiSeq. We highlight this capability by producing and benchmarking RNA-seq, ChIP-seq, and regular and target-enriched Tn5 libraries. We also explore the use of this approach for rapid evaluation of sequencing library metrics by implementing a real-time analysis workflow. 
    more » « less
  3. Abstract In light of the current biodiversity crisis, molecular barcoding has developed into an irreplaceable tool. Barcoding has been considerably simplified by developments in high throughput sequencing technology, but still can be prohibitively expensive and laborious when community samples of thousands of specimens need to be processed. Here, we outline an Illumina amplicon sequencing approach to generate multilocus data from large collections of arthropods. We reduce cost and effort up to 50-fold, by combining multiplex PCRs and DNA extractions from pools of presorted and morphotyped specimens and using two levels of sample indexing. We test our protocol by generating a comprehensive, community wide dataset of barcode sequences for several thousand Hawaiian arthropods from 14 orders, which were collected across the archipelago using various trapping methods. We explore patterns of diversity across the Archipelago and compare the utility of different arthropod trapping methods for biodiversity explorations on Hawaii, highlighting undergrowth beating as highly efficient method. Moreover, we show the effects of barcode marker, taxonomy and relative biomass of the targeted specimens and sequencing coverage on taxon recovery. Our protocol enables rapid and inexpensive explorations of diversity patterns and the generation of multilocus barcode reference libraries across whole ecosystems. 
    more » « less
  4. null (Ed.)
    Abstract Background Modern Next Generation- and Third Generation- Sequencing methods such as Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in Deep Learning have enabled the application of Deep Neural Networks to variant calling, surpassing the accuracy of classical approaches in many settings. DeepVariant, arguably the most popular among such methods, transforms the problem of variant calling into one of image recognition where a Deep Neural Network analyzes sequencing data that is formatted as images, achieving high accuracy. In this paper, we explore an alternative approach to designing Deep Neural Networks for variant calling, where we use meticulously designed Deep Neural Network architectures and customized variant inference functions that account for the underlying nature of sequencing data instead of converting the problem to one of image recognition. Results Results from 27 whole-genome variant calling experiments spanning Illumina, PacBio and hybrid Illumina-PacBio settings suggest that our method allows vastly smaller Deep Neural Networks to outperform the Inception-v3 architecture used in DeepVariant for indel and substitution-type variant calls. For example, our method reduces the number of indel call errors by up to 18%, 55% and 65% for Illumina, PacBio and hybrid Illumina-PacBio variant calling respectively, compared to a similarly trained DeepVariant pipeline. In these cases, our models are between 7 and 14 times smaller. Conclusions We believe that the improved accuracy and problem-specific customization of our models will enable more accurate pipelines and further method development in the field. HELLO is available at https://github.com/anands-repo/hello 
    more » « less
  5. Abstract Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships with respect to their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods are developed and benchmarked on short-read metaHi-C datasets and there exists much room for improvement in terms of more scalable and stable analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to one sheep gut long-read metaHi-C dataset, MetaCC binning module can retrieve 709 high-quality genomes with the largest species diversity using one single sample, including an expansion of five uncultured members from the orderErysipelotrichales, and is the only binner that can recover the genome of one important speciesBacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning is able to capture multi-copy plasmids. 
    more » « less