Title: Chromosome‐scale scaffolds for the Chinese hamster reference genome assembly to facilitate the study of the CHO epigenome
Abstract The Chinese hamster genome serves as a reference genome for the study of Chinese hamster ovary (CHO) cells, the preferred host system for biopharmaceutical production. Recent re‐sequencing of the Chinese hamster genome resulted in the RefSeq PICR meta‐assembly, a set of highly accurate scaffolds that filled over 95% of the gaps in previous assembly versions. However, these scaffolds did not reach chromosome‐scale due to the absence of long‐range scaffolding information during the meta‐assembly process. Here, long‐range scaffolding of the PICR Chinese hamster genome assembly was performed using high‐throughput chromosome conformation capture (Hi‐C). This process resulted in a new “PICRH” genome, where 97% of the genome is contained in 11 mega‐scaffolds corresponding to the Chinese hamster chromosomes (2n = 22) and the total number of scaffolds is reduced by three‐fold from 1,830 scaffolds in PICR to 647 in PICRH. Continuity was improved while preserving accuracy, leading to quality scores higher than recent builds of mouse chromosomes and comparable to human chromosomes. The PICRH genome assembly will be an indispensable tool for designing advanced genetic engineering strategies in CHO cells and enabling systematic examination of genomic and epigenomic instability through comparative analysis of CHO cell lines on a common set of chromosomal coordinates.
Award ID(s):
1736123
PAR ID:
10458033
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Biotechnology and Bioengineering
Volume:
117
Issue:
8
ISSN:
0006-3592
Size(s):
p. 2331-2339
Sponsoring Org:
National Science Foundation
More Like this
  1. Mank, Judith (Ed.)
    Abstract Urosaurus nigricaudus is a phrynosomatid lizard endemic to the Baja California Peninsula in Mexico. This work presents a chromosome-level genome assembly and annotation from a male individual. We used PacBio long reads and HiRise scaffolding to generate a high-quality genomic assembly of 1.87 Gb distributed in 327 scaffolds, with an N50 of 279 Mb and an L50 of 3. Approximately 98.4% of the genome is contained in 14 scaffolds, with 6 large scaffolds (334–127 Mb) representing macrochromosomes and 8 small scaffolds (63–22 Mb) representing microchromosomes. Using standard gene modeling and transcriptomic data, we predicted 17,902 protein-coding genes on the genome. The repeat content is characterized by a large proportion of long interspersed nuclear elements that are relatively old. Synteny analysis revealed some microchromosomes with high repeat content are more prone to rearrangements but that both macro- and microchromosomes are well conserved across reptiles. We identified scaffold 14 as the X chromosome. This microchromosome presents perfect dosage compensation where the single X of males has the same expression levels as two X chromosomes in females. Finally, we estimated the effective population size for U. nigricaudus was extremely low, which may reflect a reduction in polymorphism related to it becoming a peninsular endemic. 
  2. Abstract Genomic resources across squamate reptiles (lizards and snakes) have lagged behind other vertebrate systems and high-quality reference genomes remain scarce. Of the 23 chromosome-scale reference genomes across the order, only 12 of the ~60 squamate families are represented. Within geckos (infraorder Gekkota), a species-rich clade of lizards, chromosome-level genomes are exceptionally sparse representing only two of the seven extant families. Using the latest advances in genome sequencing and assembly methods, we generated one of the highest-quality squamate genomes to date for the leopard gecko, Eublepharis macularius (Eublepharidae). We compared this assembly to the previous, short-read only, E. macularius reference genome published in 2016 and examined potential factors within the assembly influencing contiguity of genome assemblies using PacBio HiFi data. Briefly, the read N50 of the PacBio HiFi reads generated for this study was equal to the contig N50 of the previous E. macularius reference genome at 20.4 kilobases. The HiFi reads were assembled into a total of 132 contigs, which was further scaffolded using HiC data into 75 total sequences representing all 19 chromosomes. We identified 9 of the 19 chromosomal scaffolds were assembled as a near-single contig, whereas the other 10 chromosomes were each scaffolded together from multiple contigs. We qualitatively identified that the percent repeat content within a chromosome broadly affects its assembly contiguity prior to scaffolding. This genome assembly signifies a new age for squamate genomics where high-quality reference genomes rivaling some of the best vertebrate genome assemblies can be generated for a fraction of previous cost estimates. This new E. macularius reference assembly is available on NCBI at JAOPLA010000000. 
  3. Abstract The ambr250 high-throughput bioreactor platform was adopted to provide a highly controlled environment for a project investigating genome instability in Chinese hamster ovary (CHO) cells, where genome instability leads to lower protein productivity. Development of the baseline (control) and stressed process conditions highlighted the need to control critical process parameters, including the proportional, integral, and derivative (PID) control loops. Process parameters that are often considered scale-independent include dissolved oxygen (DO) and pH; however, these parameters were observed to be sensitive to PID settings. For many bioreactors, control loops are cascaded such that the manipulated variables are adjusted concurrently. Conversely, for the ambr250 bioreactor system, the control levels are segmented and implemented sequentially. Consequently, each control level must be tuned independently, as the PID settings are independent by control level. For the CHO cell studies, it was observed that the initial PID settings did not result in a robust process, as evidenced by elevated lactate levels caused by the pH being above the setpoint for most of the experiment. After several PID tuning iterations, new PID settings were found that could respond appropriately to routine feed and antifoam additions. Furthermore, these new PID settings resulted in more robust cell growth and increased protein productivity. This work highlights the need to describe PID gains and manipulated variable ranges, as profoundly different outcomes can result from the same feeding protocol. Additionally, improved process models are needed to allow process simulations and tuning. Thus, these tuning experiments support the idea that PID settings should be fully described in bioreactor publications to allow for better reproducibility of results.
  4. Complete, accurate genome assemblies are necessary to design targets for genetic engineering strategies. Successful gene knockdowns and knockouts in Chinese hamster ovary (CHO) cells may prevent the expression of difficult‐to‐remove host cell proteins (HCPs). HCPs, if not removed, can cause problems in stability, safety, and efficacy of the biotherapeutic. A significantly improved Chinese hamster (CH) reference genome was used to identify new knockout targets with similar predicted functions and characteristics as the difficult‐to‐remove host cell lipases, LPL, PLBL2, and LPLA2. The CHO‐K1 gene and protein sequences of several of these lipases were corrected using the updated CH genome. Sequence alignments were then used to identify conserved regions that may serve as possible targets for multiple simultaneous gene knockouts. Finally, the comparison of the CHO‐K1 lipase protein sequences to their human orthologs provided insight into which lipases, if persistent in the drug product, could possibly cause immunogenic responses in patients. Topical heading: Biomolecular Engineering, Bioengineering, Biochemicals, Biofuels, and Food. © 2018 American Institute of Chemical Engineers. AIChE J, 64: 4247–4254, 2018
  5. Abstract The Chinese hamster ovary (CHO) cell lines that are used to produce commercial quantities of therapeutic proteins commonly exhibit a decrease in productivity over time in culture, a phenomenon termed production instability. Random integration of the transgenes encoding the protein of interest into locations in the CHO genome that are vulnerable to genetic and epigenetic instability often causes production instability through copy number loss and silencing of expression. Several recent publications have shown that these cell line development challenges can be overcome by using site‐specific integration (SSI) technology to insert the transgenes at genomic loci, often called “hotspots,” that are transcriptionally permissive and have enhanced stability relative to the rest of the genome. However, extensive characterization of the CHO epigenome is needed to identify hotspots that maintain their desirable epigenetic properties in an industrial bioprocess environment and maximize transcription from a single integrated transgene copy. To this end, the epigenomes and transcriptomes of two distantly related cell lines, an industrially relevant monoclonal antibody‐producing cell line and its parental CHO‐K1 host, were characterized using high throughput chromosome conformation capture and RNAseq to analyze changes in the epigenome that occur during cell line development and associated changes in system‐wide gene expression. In total, 10.9% of the CHO genome contained transcriptionally permissive three‐dimensional chromatin structures with enhanced genetic and epigenetic stability relative to the rest of the genome. These safe harbor regions also showed good agreement with published CHO epigenome data, demonstrating that this method was suitable for finding genomic regions with epigenetic markers of active and stable gene expression. These regions significantly reduce the genomic search space when looking for CHO hotspots with widespread applicability and can guide future studies with the goal of maximizing the potential of SSI technology in industrial production CHO cell lines.
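
Note on contiguity statistics: several of the abstracts above summarize assembly contiguity with scaffold counts, N50, and L50 (for example, an N50 of 279 Mb and an L50 of 3 in item 1). As a minimal sketch of the usual definition, assuming N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly, and L50 is the number of scaffolds in that set, the calculation might look as follows. The function name `n50_l50` and the toy lengths are hypothetical illustrations, not data or code from any of the listed records.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths."""
    # Sort scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    half_total = sum(ordered) / 2
    running = 0
    # Accumulate lengths until at least half of the assembly is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' is the L50.
            return length, count
    raise AssertionError("unreachable for a non-empty input")


# Toy scaffold lengths (arbitrary units), chosen only for illustration.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2): the two longest scaffolds cover 150 of 300
```

A higher N50 together with a lower L50 indicates that more of the assembly sits in fewer, longer scaffolds, which is the sense in which long-range scaffolding improves continuity.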