- Award ID(s):
- 1922658
- NSF-PAR ID:
- 10437658
- Date Published:
- Journal Name:
- Science
- Volume:
- 378
- Issue:
- 6626
- ISSN:
- 0036-8075
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Conserved transcription factors termed “terminal selectors” regulate neuronal sub-type specification and differentiation through combinatorial transcriptional regulation of terminal differentiation genes. The unique combinations of terminal differentiation gene products in turn contribute to the functional identities of each neuron. One well-characterized terminal selector is COE (Collier/Olf/Ebf), which has been shown to activate cholinergic gene batteries in C. elegans motor neurons. However, its functions in other metazoans, particularly chordates, is less clear. Here we show that the sole COE ortholog in the non-vertebrate chordate Ciona robusta , Ebf, controls the expression of the cholinergic locus VAChT/ChAT in a single dorsal interneuron of the larval Motor Ganglion, which is presumed to be homologous to the vertebrate spinal cord. We propose that, while the function of Ebf as a regulator of cholinergic neuron identity conserved across bilaterians, its exact role may have diverged in different cholinergic neuron subtypes (e.g., interneurons vs. motor neurons) in chordate-specific motor circuits.more » « less
-
Inferring gene regulatory networks (GRNs) from single-cell RNA-seq (scRNA-seq) data is an important computational question to find regulatory mechanisms involved in fundamental cellular processes. Although many computational methods have been designed to predict GRNs from scRNA-seq data, they usually have high false positive rates and none infer GRNs by directly using the paired datasets of case-versus-control experiments. Here we present a novel deep-learning-based method, named scTIGER, for GRN detection by using the co-differential relationships of gene expression profiles in paired scRNA-seq datasets. scTIGER employs cell-type-based pseudotiming, an attention-based convolutional neural network method and permutation-based significance testing for inferring GRNs among gene modules. As state-of-the-art applications, we first applied scTIGER to scRNA-seq datasets of prostate cancer cells, and successfully identified the dynamic regulatory networks of AR, ERG, PTEN and ATF3 for same-cell type between prostatic cancerous and normal conditions, and two-cell types within the prostatic cancerous environment. We then applied scTIGER to scRNA-seq data from neurons with and without fear memory and detected specific regulatory networks for BDNF, CREB1 and MAPK4. Additionally, scTIGER demonstrates robustness against high levels of dropout noise in scRNA-seq data.
-
Abstract Background Determining cell identity in volumetric images of tagged neuronal nuclei is an ongoing challenge in contemporary neuroscience. Frequently, cell identity is determined by aligning and matching tags to an “atlas” of labeled neuronal positions and other identifying characteristics. Previous analyses of such C. elegans datasets have been hampered by the limited accuracy of such atlases, especially for neurons present in the ventral nerve cord, and also by time-consuming manual elements of the alignment process. Results We present a novel automated alignment method for sparse and incomplete point clouds of the sort resulting from typical C. elegans fluorescence microscopy datasets. This method involves a tunable learning parameter and a kernel that enforces biologically realistic deformation. We also present a pipeline for creating alignment atlases from datasets of the recently developed NeuroPAL transgene. In combination, these advances allow us to label neurons in volumetric images with confidence much higher than previous methods. Conclusions We release, to the best of our knowledge, the most complete full-body C. elegans 3D positional neuron atlas, incorporating positional variability derived from at least 7 animals per neuron, for the purposes of cell-type identity prediction for myriad applications (e.g., imaging neuronal activity, gene expression, and cell-fate).more » « less
-
INTRODUCTION Diverse phenotypes, including large brains relative to body size, group living, and vocal learning ability, have evolved multiple times throughout mammalian history. These shared phenotypes may have arisen repeatedly by means of common mechanisms discernible through genome comparisons. RATIONALE Protein-coding sequence differences have failed to fully explain the evolution of multiple mammalian phenotypes. This suggests that these phenotypes have evolved at least in part through changes in gene expression, meaning that their differences across species may be caused by differences in genome sequence at enhancer regions that control gene expression in specific tissues and cell types. Yet the enhancers involved in phenotype evolution are largely unknown. Sequence conservation–based approaches for identifying such enhancers are limited because enhancer activity can be conserved even when the individual nucleotides within the sequence are poorly conserved. This is due to an overwhelming number of cases where nucleotides turn over at a high rate, but a similar combination of transcription factor binding sites and other sequence features can be maintained across millions of years of evolution, allowing the function of the enhancer to be conserved in a particular cell type or tissue. Experimentally measuring the function of orthologous enhancers across dozens of species is currently infeasible, but new machine learning methods make it possible to make reliable sequence-based predictions of enhancer function across species in specific tissues and cell types. RESULTS To overcome the limits of studying individual nucleotides, we developed the Tissue-Aware Conservation Inference Toolkit (TACIT). Rather than measuring the extent to which individual nucleotides are conserved across a region, TACIT uses machine learning to test whether the function of a given part of the genome is likely to be conserved. More specifically, convolutional neural networks learn the tissue- or cell type–specific regulatory code connecting genome sequence to enhancer activity using candidate enhancers identified from only a few species. This approach allows us to accurately associate differences between species in tissue or cell type–specific enhancer activity with genome sequence differences at enhancer orthologs. We then connect these predictions of enhancer function to phenotypes across hundreds of mammals in a way that accounts for species’ phylogenetic relatedness. We applied TACIT to identify candidate enhancers from motor cortex and parvalbumin neuron open chromatin data that are associated with brain size relative to body size, solitary living, and vocal learning across 222 mammals. Our results include the identification of multiple candidate enhancers associated with brain size relative to body size, several of which are located in linear or three-dimensional proximity to genes whose protein-coding mutations have been implicated in microcephaly or macrocephaly in humans. We also identified candidate enhancers associated with the evolution of solitary living near a gene implicated in separation anxiety and other enhancers associated with the evolution of vocal learning ability. We obtained distinct results for bulk motor cortex and parvalbumin neurons, demonstrating the value in applying TACIT to both bulk tissue and specific minority cell type populations. To facilitate future analyses of our results and applications of TACIT, we released predicted enhancer activity of >400,000 candidate enhancers in each of 222 mammals and their associations with the phenotypes we investigated. CONCLUSION TACIT leverages predicted enhancer activity conservation rather than nucleotide-level conservation to connect genetic sequence differences between species to phenotypes across large numbers of mammals. TACIT can be applied to any phenotype with enhancer activity data available from at least a few species in a relevant tissue or cell type and a whole-genome alignment available across dozens of species with substantial phenotypic variation. Although we developed TACIT for transcriptional enhancers, it could also be applied to genomic regions involved in other components of gene regulation, such as promoters and splicing enhancers and silencers. As the number of sequenced genomes grows, machine learning approaches such as TACIT have the potential to help make sense of how conservation of, or changes in, subtle genome patterns can help explain phenotype evolution. Tissue-Aware Conservation Inference Toolkit (TACIT) associates genetic differences between species with phenotypes. TACIT works by generating open chromatin data from a few species in a tissue related to a phenotype, using the sequences underlying open and closed chromatin regions to train a machine learning model for predicting tissue-specific open chromatin and associating open chromatin predictions across dozens of mammals with the phenotype. [Species silhouettes are from PhyloPic]more » « less
-
null (Ed.)Plants maintain populations of pluripotent stem cells in shoot apical meristems (SAMs), which continuously produce new aboveground organs. We used single-cell RNA sequencing (scRNA-seq) to achieve an unbiased characterization of the transcriptional landscape of the maize shoot stem-cell niche and its differentiating cellular descendants. Stem cells housed in the SAM tip are engaged in genome integrity maintenance and exhibit a low rate of cell division, consistent with their contributions to germline and somatic cell fates. Surprisingly, we find no evidence for a canonical stem-cell organizing center subtending these cells. In addition, trajectory inference was used to trace the gene expression changes that accompany cell differentiation, revealing that ectopic expression of KNOTTED1 ( KN1 ) accelerates cell differentiation and promotes development of the sheathing maize leaf base. These single-cell transcriptomic analyses of the shoot apex yield insight into the processes of stem-cell function and cell-fate acquisition in the maize seedling and provide a valuable scaffold on which to better dissect the genetic control of plant shoot morphogenesis at the cellular level.more » « less