Title: A Reference Genome Assembly of Simmental Cattle, Bos taurus taurus
Abstract: Genomics research has relied principally on the establishment and curation of a reference genome for the species. However, it is increasingly recognized that a single reference genome cannot fully describe the extent of genetic variation within many widely distributed species. Pangenome representations are based on high-quality genome assemblies of multiple individuals and intended to represent the broadest possible diversity within a species. A Bovine Pangenome Consortium (BPC) has recently been established to begin assembling genomes from more than 600 recognized breeds of cattle, together with other related species to provide information on ancestral alleles and haplotypes. Previously reported de novo genome assemblies for Angus, Brahman, Hereford, and Highland breeds of cattle are part of the initial BPC effort. The present report describes a complete single haplotype assembly at chromosome-scale for a fullblood Simmental cow from an F1 bison–cattle hybrid fetus by trio binning. Simmental cattle, also known as Fleckvieh due to their red and white spots, originated in central Europe in the 1830s as a triple-purpose breed selected for draught, meat, and dairy production. There are over 50 million Simmental cattle in the world, known today for their fast growth and beef yields. This assembly (ARS_Simm1.0) is similar in length to the other bovine assemblies at 2.86 Gb, with a scaffold N50 of 102 Mb (max scaffold 156.8 Mb) and meets or exceeds the continuity of the best Bos taurus reference assemblies to date.
Award ID(s):
1754451
NSF-PAR ID:
10232209
Editor(s):
Koepfli, Klaus-Peter
Journal Name:
Journal of Heredity
Volume:
112
Issue:
2
ISSN:
0022-1503
Page Range / eLocation ID:
184 to 191
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Previous studies of canid population and evolutionary genetics have relied on high-quality domestic dog reference genomes that have been produced primarily for biomedical and trait mapping studies in dog breeds. However, the absence of highly contiguous genomes from other Canis species such as the gray wolf and coyote, which represent additional distinct demographic histories, may bias inferences regarding interspecific genetic diversity and phylogenetic relationships. Here, we present single haplotype de novo genome assemblies for the gray wolf and coyote, generated by applying the trio-binning approach to long sequence reads generated from the genome of a female first-generation hybrid produced from a gray wolf and coyote mating. The assemblies were highly contiguous, with contig N50 sizes of 44.6 and 42.0 Mb for the wolf and coyote, respectively. Genome scaffolding and alignments between the two Canis assemblies and published dog reference genomes showed near complete collinearity, with one exception: a coyote-specific chromosome fission of chromosome 13 and fusion of the proximal portion of that chromosome with chromosome 8, retaining the Canis-typical diploid chromosome number of 2n = 78. We evaluated mapping quality for previous RADseq data from 334 canids and found nearly identical mapping quality and patterns among canid species and regional populations regardless of the genome used for alignment (dog, coyote, or gray wolf). These novel wolf and coyote genome reference assemblies will be important resources for proper and accurate inference of Canis demography, taxonomic evaluation, and conservation genetics.

     
  2. Abstract

    Despite being quite speciose (~10,000 extant species), birds have a fairly uniform genome size and karyotype (including the common occurrence of microchromosomes) relative to other vertebrate lineages. Storks (Family Ciconiidae) are a charismatic and distinct group of large wading birds with nearly worldwide distribution but few genomic resources. Here we present an annotated chromosome-level reference genome and chromosome orthology analysis for the wood stork (Mycteria americana), a species that has been federally protected under the Endangered Species Act since 1984. The annotated chromosome-level reference assembly was produced using the blood of a wild female wood stork chick, has a length of 1.35 Gb, a contig N50 of 37 Mb, a scaffold N50 of 80 Mb, and a BUSCO score of 98.8%. We identified 31 autosomal pairs and two sex chromosomes in the wood stork genome, but failed to identify four additional autosomal microchromosomes previously found via karyotyping. Orthology analyses confirmed reported synapomorphies unique to storks and identified the chromosomes participating in these fusions. This study highlights the difficulty and potential problems associated with delineating microchromosomes in reference genome assemblies. It also provides a foundation for studying karyotype evolution in the core water bird clade that includes penguins, albatrosses, storks, cormorants, herons, and ibises. Finally, our reference genome will allow for numerous genomic studies, such as genome-wide association studies of local adaptation, that will aid in wood stork conservation.

     
  3. Abstract

    The number of reference genomes of snakes lags behind several other vertebrate groups (e.g. birds and mammals). However, in the last two years, a concerted effort by researchers from around the world has produced new genomes of snakes representing members from several new families. Here, we present a high-quality, annotated genome of the central ratsnake (Pantherophis alleghaniensis), a member of the most diverse snake lineage, Colubroidea. Pantherophis alleghaniensis is found in the central part of the Nearctic, east of the Mississippi River. This genome was sequenced using 10X Chromium synthetic long reads and polished using Illumina short reads. The final genome assembly had an N50 of 21.82 Mb and an L50 of 22 scaffolds with a maximum scaffold length of 82.078 Mb. The genome is composed of 49.24% repeat elements dominated by long interspersed elements. We annotated this genome using transcriptome assemblies from 14 tissue types and recovered 28,368 predicted proteins. Finally, we estimated admixture proportions between two species of ratsnakes and discovered that this specimen is an admixed individual containing genomes from the western (Pantherophis obsoletus) and central ratsnakes (P. alleghaniensis). We discuss the importance of considering interspecific admixture in downstream approaches for inferring demography and phylogeny.

     
  4. Abstract

    The blue crab, Callinectes sapidus (Rathbun, 1896) is an economically, culturally, and ecologically important species found across the temperate and tropical North and South American Atlantic coast. A reference genome will enable research for this high-value species. Initial assembly combined 200× coverage Illumina paired-end reads, a 60× 8 kb mate-paired library, and 50× PacBio data using the MaSuRCA assembler, resulting in a 985 Mb assembly with a scaffold N50 of 153 kb. Dovetail Chicago and Hi-C sequencing with the 3D-DNA assembler and Juicebox assembly tools were then used for chromosome scaffolding. The 50 largest scaffolds span 810 Mb, are 1.5–37 Mb long, and have a repeat content of 36%. The 190 Mb of unplaced sequence is in 3921 sequences over 10 kb with a repeat content of 68%. The final assembly N50 is 18.9 Mb for scaffolds and 9317 bases for contigs. Of the arthropod BUSCO genes, ∼88% (888/1013) were complete and single-copy. Using 309 million RNAseq read pairs from 12 different tissues and developmental stages, 25,249 protein-coding genes were predicted. Between the C. sapidus and Portunus trituberculatus genomes, 41 of 50 large scaffolds had high nucleotide identity and protein-coding synteny, but 9 scaffolds in both assemblies were not clear matches. The protein-coding genes included 9423 one-to-one putative orthologs, of which 7165 were syntenic between the two crab species. Overall, the two crab genome assemblies show strong similarities at the nucleotide, protein, and chromosome levels and verify the blue crab genome as an excellent reference for this important seafood species.
  5. Sethuraman, Arun (Ed.)
    Abstract

    Damselflies and dragonflies (Order: Odonata) play important roles in both aquatic and terrestrial food webs and can serve as sentinels of ecosystem health and predictors of population trends in other taxa. The habitat requirements and limited dispersal of lotic damselflies make them especially sensitive to habitat loss and fragmentation. As such, landscape genomic studies of these taxa can help focus conservation efforts on watersheds with high levels of genetic diversity, local adaptation, and even cryptic endemism. Here, as part of the California Conservation Genomics Project (CCGP), we report the first reference genome for the American rubyspot damselfly, Hetaerina americana, a species associated with springs, streams and rivers throughout California. Following the CCGP assembly pipeline, we produced two de novo genome assemblies. The primary assembly includes 1,630,044,487 base pairs, with a contig N50 of 5.4 Mb, a scaffold N50 of 86.2 Mb, and a BUSCO completeness score of 97.6%. This is the seventh Odonata genome to be made publicly available and the first for the subfamily Hetaerininae. This reference genome fills an important phylogenetic gap in our understanding of Odonata genome evolution, and provides a genomic resource for a host of interesting ecological, evolutionary, and conservation questions for which the rubyspot damselfly genus Hetaerina is an important model system.

     