Title: Comparative genomics within and across Bilaterians illuminates the evolutionary history of ALK and LTK proto-oncogene origination and diversification
Comparative genomic analyses have enormous potential for identifying key genes central to human health phenotypes, including those that promote cancers. In particular, the successful development of novel therapeutics using model species requires phylogenetic analyses to determine molecular homology. Accordingly, we investigate the evolutionary histories of anaplastic lymphoma kinase (ALK), which can underlie tumorigenesis in neuroblastoma, non-small cell lung cancer, and anaplastic large-cell lymphoma; of its close relative leukocyte tyrosine kinase (LTK); and of their candidate ligands. Homology of ligands identified in model organisms to those functioning in humans remains unclear. Therefore, we searched for homologs of the human genes across metazoan genomes, finding that the candidate ligands Jeb and Hen-1 were restricted to non-vertebrate species. In contrast, the ligand AUG was only identified in vertebrates. We found two ALK-like and four AUG-like protein-coding genes in lamprey. Of these six genes, only one ALK-like and two AUG-like genes exhibited early embryonic expression that parallels model mammal systems. Two copies of AUG are present in nearly all jawed vertebrates. Our phylogenetic analysis strongly supports the presence of previously unrecognized functional convergences of ALK and LTK between actinopterygians and sarcopterygians, despite contemporaneous, highly conserved synteny of ALK and LTK. These findings provide critical guidance regarding the propriety of fish and mammal models with regard to model-organism-based investigation of these medically important genes. In sum, our results provide the phylogenetic context necessary for effective investigations of the functional roles and biology of these critically important receptors.
Award ID(s):
1934860; 1755242
NSF-PAR ID:
10204140
Editor(s):
Hershberg, Ruth
Journal Name:
Genome Biology and Evolution
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION: Thousands of genetic variants have been associated with human diseases and traits through genome-wide association studies (GWASs). Translating these discoveries into improved therapeutics requires discerning which variants among hundreds of candidates are causally related to disease risk. To date, only a handful of causal variants have been confirmed. Here, we leverage 100 million years of mammalian evolution to address this major challenge. RATIONALE: We compared genomes from hundreds of mammals and identified bases with unusually few variants (evolutionarily constrained). Constraint is a measure of functional importance that is agnostic to cell type or developmental stage. It can be applied to investigate any heritable disease or trait and is complementary to resources using cell type– and time point–specific functional assays like Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTEx). RESULTS: Using constraint calculated across placental mammals, 3.3% of bases in the human genome are significantly constrained, including 57.6% of coding bases. Most constrained bases (80.7%) are noncoding. Common variants (allele frequency ≥ 5%) and low-frequency variants (0.5% ≤ allele frequency < 5%) are depleted for constrained bases (1.85 versus 3.26% expected by chance, P < 2.2 × 10^−308). Pathogenic ClinVar variants are more constrained than benign variants (P < 2.2 × 10^−16). The most constrained common variants are more enriched for disease single-nucleotide polymorphism (SNP)–heritability in 63 independent GWASs. The enrichment of SNP-heritability in constrained regions is greater (7.8-fold) than previously reported in mammals and is even higher in primates (11.1-fold). It exceeds the enrichment of SNP-heritability in nonsynonymous coding variants (7.2-fold) and fine-mapped expression quantitative trait loci (eQTL)–SNPs (4.8-fold).
The enrichment peaks near constrained bases, with a log-linear decrease of SNP-heritability enrichment as a function of the distance to a constrained base. Zoonomia constraint scores improve functionally informed fine-mapping. Variants at sites constrained in mammals and primates have greater posterior inclusion probabilities and higher per-SNP contributions. In addition, using both constraint and functional annotations improves polygenic risk score accuracy across a range of traits. Finally, incorporating constraint information into the analysis of noncoding somatic variants in medulloblastomas identifies new candidate driver genes. CONCLUSION: Genome-wide measures of evolutionary constraint can help discern which variants are functionally important. This information may accelerate the translation of genomic discoveries into the biological, clinical, and therapeutic knowledge that is required to understand and treat human disease. Figure caption: Using evolutionary constraint in genomic studies of human diseases. (A) Constraint was calculated across 240 mammal species, including 43 primates (teal line). (B) Pathogenic ClinVar variants (N = 73,885) are more constrained across mammals than benign variants (N = 231,642; P < 2.2 × 10^−16). (C) More-constrained bases are more enriched for trait-associated variants (63 GWASs). (D) Enrichment of heritability is higher in constrained regions than in functional annotations (left), even in a joint model with 106 annotations (right). (E) Fine-mapping (PolyFun) using a model that includes constraint scores identifies an experimentally validated association at rs1421085. Error bars represent 95% confidence intervals. BMI, body mass index; LF, low frequency; PIP, posterior inclusion probability.
  2. Resolving the phylogenetic relationships among Paleocene mammals has been a longstanding goal in paleontology. Constructing an accurate and comprehensive phylogeny for Paleocene mammals is a worthwhile objective in itself, but it also provides a framework on which we can better understand the origin of placental mammals and the evolutionary processes underlying the diversification of mammals before, during, and after the end-Cretaceous mass extinction. More recently, a robust Paleocene mammal phylogeny has become a much-coveted tool for reconciling discrepancies between morphological and molecular evidence for the phylogeny and diversification of Placentalia. Here, we present a novel phylogenetic dataset to test hypotheses regarding Paleocene mammal phylogeny and the origin and diversification of Placentalia. To date, our matrix combines phenomic data for 36 extant mammal species and 107 fossil species scored for 2540 morphological characters alongside 26 genes sequenced for 47 species. We utilized a reductive morphological scoring strategy in order to minimize assumptions and test hypotheses on homology. Multiple sequence alignments were performed in MEGA-X for each gene. We then analyzed the data using Bayesian methods and explored the effects of different approaches. Relaxed-clock analyses using a molecular constraint and an FBD prior are congruent with the diversification of many extant orders prior to the K-Pg boundary. Relaxed-clock total-evidence analyses (morphology and molecules) using an FBD prior resulted in older ages of diversification than those estimated by our relaxed-clock molecular constraint model and previous molecular studies. Within Placentalia, our phylogenies provide support for the divergence of Atlantogenata (Afrotheria and Xenarthra) from Boreoeutheria (Euarchontoglires and Laurasiatheria).
Among the Paleocene taxa, ‘condylarths’ are distributed along the base of Laurasiatheria with members of ‘Arctocyonidae’ recovered as sister taxa to Artiodactyla; enigmatic groups such as Pantodonta and Taeniodonta are recovered as crown placentals whereas Leptictida is not. Our Paleocene mammal phylogeny is a critical step toward better understanding placental mammal evolution. Ultimately, this work will facilitate the investigation of fundamental questions previously encumbered by the lack of a well-resolved phylogeny. 
  3. Finding genes that are directly or indirectly related to lung cancer has drawn much attention, and many genes directly related to lung cancer have been reported. However, it has not been confirmed whether those published 'key' genes are truly critical to lung cancer formation; i.e., they may carry very limited useful information. As a result, finding essential genes remains a challenging problem in lung cancer research. Using a recently developed competing linear factor analysis method for detecting differentially expressed genes, we advance the detection of genes critical to lung cancer to a uniformly informative level. A common set of four genes and their functional effects are detected as differentially expressed between tumor and non-tumor samples with 100% sensitivity and 100% specificity in one study of lung adenocarcinoma (LUAD) and one study of squamous cell lung cancer (LUSC) (two North American cohorts with 20429 genes and 576 and 552 samples, respectively). Two additional analyses achieve 97.8% sensitivity and 100% specificity in one study of non-small cell lung carcinoma (NSCLC, a European cohort with 20356 genes and 156 samples), and 100% sensitivity and 95% specificity (1 out of 20 non-tumor samples) in one study of ALK-positive and EGFR/KRAS/ALK-negative lung adenocarcinomas (LUAD, a Japanese cohort with 20356 genes and 224 samples). The sets of four genes share some genes, but with different functional effects, between the two North American cohorts and the European cohort and between the North American cohorts and the Japanese cohort. These results show that the four-gene-based classifiers are accurate and robust across different types of lung cancer and cohorts of different races. The functional effects of the four genes reveal significantly different mechanisms between LUAD and LUSC. These sets of four genes and their functional effects are considered to be essential for lung cancer studies and practice.
These genes' functional effects naturally classify patients into different groups (more than seven subtypes). Subtype information is useful for personalized therapies. The new findings can motivate new lung cancer research in more focused and targeted directions to save lives, protect people, and reduce the enormous economic costs of lung cancer research and treatment.
  4. 2938. Using a Human Liver Tissue Equivalent (hLTE) Platform to Define the Functional Impact of Liver-Directed AAV Gene Therapy. 63rd ASH Annual Meeting and Exposition, December 11-14, 2021, Georgia World Congress Center, Atlanta, GA. Program: Oral and Poster Abstracts. Session: 801. Gene Therapies: Poster II. Hematology Disease Topics & Pathways: Bleeding and Clotting, Biological, Translational Research, Hemophilia, Genetic Disorders, Clinically Relevant, Diseases, Gene Therapy, Therapies. Sunday, December 12, 2021, 6:00 PM-8:00 PM. Ritu M Ramamurthy(1)*, Wen Ting Zheng(2)*, Sunil George, PhD(1)*, Meimei Wan(1)*, Yu Zhou, PhD(1)*, Baisong Lu, PhD(1)*, Colin E Bishop, PhD(1)*, Anthony Atala, M.D.(1)*, Christopher D Porada, PhD(1)* and M. Graca Almeida-Porada, MD(3). (1) Fetal Research and Therapy Program, Wake Forest Institute for Regenerative Medicine, Winston-Salem, NC; (2) Massachusetts Institute of Technology, Cambridge, MA; (3) Fetal Research and Therapy Program, Wake Forest Institute for Regenerative Medicine, Winston-Salem, NC. Clinical trials employing AAV vectors for hemophilia A have been hindered by unanticipated immunological and/or inflammatory responses in some of the patients. Also, these trials have often yielded lower levels of transgene expression than were expected based upon preclinical studies, highlighting the poor correlation between the transduction efficiency observed in traditional 2D cultures of primary cells in vitro, and that observed in those same cell types in vivo. It has also been recognized that there are marked species-specific differences in AAV-vector tropism, raising the critical question of the accuracy with which various animal models will likely predict tropism/vector transduction efficiency, and eventual treatment success in humans. Human liver tissue equivalents (hLTEs) are composed of the major cell types in the liver in physiologically relevant frequencies and possess the ability to recapitulate the biology and function of native human liver.
Here, we hypothesize that hLTEs can be used as a better model to predict the efficacy and safety of AAV gene therapy in humans. We fabricated hLTEs using 75% hepatocytes, 10% stellate cells, 10% Kupffer cells, and 5% liver sinusoid-derived endothelial cells in 96-well Elplasia plates with 79 microwells per well. hLTEs were transduced at an MOI of 10^5 vg/cell, on the day of fabrication, with the clinically relevant serotypes AAV5 (hLTE-5) or AAV3b (hLTE-3b), both encoding a GFP reporter. After 4 days of self-aggregation, a live/dead assay was performed to confirm viability. Non-transduced hLTEs served as negative controls (hLTE(-)), and hLTEs exposed to 20 mM acetaminophen were used as positive controls for liver inflammation/damage. The Incucyte® Live-Cell Imaging system was used to track the aggregation and GFP expression of hLTEs. Over the course of the next 5 days, media was collected to determine hepatic functionality, RNA was isolated to assess dysregulation of genes involved in inflammation and fibrosis, DNA was isolated to determine whether AAV vectors integrate into the genome of human hepatocytes and, if so, to define the frequency at which this occurs and the genomic loci of integration, and hLTEs were fixed and processed at appropriate times for histological analyses and transmission electron microscopy (TEM). TEM analysis revealed that all groups exhibited microvilli and bile-canaliculus-like structures, demonstrating the formation of a rudimentary biliary system and, more importantly, proving that hLTEs resemble native liver structure. Incucyte® imaging showed that AAV5 and AAV3b transduction impaired formation of hLTEs (57.57 ± 2.42 and 24.57 ± 4.01 spheroids/well, respectively) in comparison with hLTE(-) (74.86 ± 3.8 spheroids/well). Quantification of GFP expression demonstrated that AAV5 yielded the most efficient transduction of hLTEs (fold change in GFP expression compared to control: 2.73 ± 0.09 and 1.19 ± 0.03 for hLTE-5 and hLTE-3b, respectively).
Chromogenic assays showed decreased urea production in cell culture supernatants of AAV-transduced groups compared to the non-transduced hLTEs on days 6 and 10 of culture, demonstrating decreased hepatocyte functionality. However, ALT and AST levels were similar in all groups. On day 10, hLTEs were either used for RNA isolation or fixed in 4% PFA and processed for histology. Masson’s Trichrome and Alcian Blue/Sirius Red staining was performed to detect fibrosis, which was then quantified using ImageJ. These analyses showed no significant increase in fibrosis in either hLTE-5 or hLTE-3b compared to hLTE(-). Nevertheless, the RT² PCR Array for Human Fibrosis detected dysregulation of several genes involved in fibrosis/inflammation in both hLTE-5 and hLTE-3b (16/84 and 26/84, respectively). In conclusion, data collected thus far show successful recapitulation of native liver biology and demonstrate that AAV5 transduces hLTEs more efficiently than AAV3b. However, impaired self-aggregation and decreased hepatocyte functionality were observed in both AAV-transduced groups. Studies to address the incidence and location(s) of AAV integration are ongoing. We have thus shown that the hLTE system can provide critical new knowledge regarding the efficacy and safety of AAV gene therapy in the human liver. Disclosures: No relevant conflicts of interest to declare.
  5. Abstract: The fms-related tyrosine kinase 3 (Flt3) and its ligand (Flt3lg) are important regulators of hematopoiesis and dendritic cell (DC) homeostasis with unsettled coevolution. Gene synteny and deduced amino acid sequence analyses identified conserved flt3 gene orthologs across all jawed vertebrates. In contrast, flt3lg orthologs were not retrieved in ray-finned fish, and the gene locus exhibited more variability among species. Interestingly, duplicated flt3/flt3lg genes were maintained in the allotetraploid Xenopus laevis. Comparison of modeled structures of X. laevis Flt3 and Flt3lg homoeologs with the related diploid Xenopus tropicalis and with humans indicated a higher conformational divergence between the homoeologous pairs than their respective counterparts. The distinctive developmental and tissue expression patterns of Flt3 and Flt3lg homoeologs in tadpoles and adult frogs suggest a subfunctionalization of these homoeologs. To characterize Flt3 cell surface expression, X. laevis–tagged rFlt3lg.S and rFlt3lg.L were produced. Both rFlt3lg.S and rFlt3lg.L bind Flt3.S and Flt3.L in vitro and can trigger Erk1/2 signaling, which is consistent with partially overlapping functions between homoeologs. In spleen, Flt3.S/L cell surface expression was detected on a fraction of B cells and a population of MHC class II^high/CD8^+ leukocytes phenotypically similar to the recently described dual follicular/conventional DC-like XL cells. Our results suggest that 1) Flt3lg.S and Flt3lg.L are both involved in XL cell homeostasis and that 2) XL cells have hematopoietic origin. Furthermore, we detected surface expression of the macrophage/monocyte marker Csf1r.S on XL cells as in mammalian and chicken DCs, which points to a common evolutionary origin in vertebrate DCs.
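Item 3 in the list above reports classifier performance as sensitivity and specificity, including 95% specificity from 1 out of 20 non-tumor samples. The arithmetic behind that figure can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and it assumes the one non-tumor sample was called as tumor (a false positive), since the abstract does not give the full confusion matrix.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of tumor samples correctly called tumor: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-tumor samples correctly called non-tumor: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract's parenthetical "1 out of 20 non-tumor samples".
# Assumption: that one sample is a false positive.
non_tumor_total = 20
false_pos = 1
true_neg = non_tumor_total - false_pos

print(f"specificity = {specificity(true_neg, false_pos):.0%}")
```

With only 20 non-tumor samples, each misclassified sample moves the specificity by 5 percentage points, which is worth keeping in mind when comparing it with the figures from the larger cohorts.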