Abstract Background: Recent studies uncovered pervasive transcription and translation of thousands of noncanonical open reading frames (nORFs) outside of annotated genes. The contribution of nORFs to cellular phenotypes is difficult to infer using conventional approaches because nORFs tend to be short, of recent de novo origins, and lowly expressed. Here we develop a dedicated coexpression analysis framework that accounts for low expression to investigate the transcriptional regulation, evolution, and potential cellular roles of nORFs in Saccharomyces cerevisiae. Results: Our results reveal that nORFs tend to be preferentially coexpressed with genes involved in cellular transport or homeostasis but rarely with genes involved in RNA processing. Mechanistically, we discover that young de novo nORFs located downstream of conserved genes tend to leverage their neighbors’ promoters through transcription readthrough, resulting in high coexpression and high expression levels. Transcriptional piggybacking also influences the coexpression profiles of young de novo nORFs located upstream of genes, but to a lesser extent and without detectable impact on expression levels. Transcriptional piggybacking influences, but does not determine, the transcription profiles of de novo nORFs emerging near genes. About 40% of nORFs are not strongly coexpressed with any gene but are transcriptionally regulated nonetheless and tend to form entirely new transcription modules. We offer a web browser interface (https://carvunislab.csb.pitt.edu/shiny/coexpression/) to efficiently query, visualize, and download our coexpression inferences. Conclusions: Our results suggest that nORF transcription is highly regulated. Our coexpression dataset serves as an unprecedented resource for unraveling how nORFs integrate into cellular networks, contribute to cellular phenotypes, and evolve.
NeST: nested hierarchical structure identification in spatial transcriptomic data
Abstract Spatial gene expression in tissue is characterized by regions in which particular genes are enriched or depleted. Frequently, these regions contain nested subregions with distinct expression patterns. Segmentation methods in spatial transcriptomic (ST) data extract disjoint regions maximizing similarity over the greatest number of genes, typically on a particular spatial scale, and thus cannot find region-within-region structure. We present NeST, which extracts spatial structure through coexpression hotspots: regions exhibiting localized spatial coexpression of some set of genes. Coexpression hotspots identify structure on any spatial scale, over any possible subset of genes, and are highly explainable. NeST also performs spatial analysis of cell-cell interactions via ligand-receptor, identifying active areas de novo without restriction of cell type or other groupings, in both two and three dimensions. Through application on ST datasets of varying type and resolution, we demonstrate the ability of NeST to reveal a new level of biological structure.
- PAR ID: 10469546
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: Nature Communications
- Volume: 14
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Spatial transcriptomics (ST) technologies enable high throughput gene expression characterization within thin tissue sections. However, comparing spatial observations across sections, samples, and technologies remains challenging. To address this challenge, we develop STalign to align ST datasets in a manner that accounts for partially matched tissue sections and other local non-linear distortions using diffeomorphic metric mapping. We apply STalign to align ST datasets within and across technologies as well as to align ST datasets to a 3D common coordinate framework. We show that STalign achieves high gene expression and cell-type correspondence across matched spatial locations that is significantly improved over landmark-based affine alignments. Applying STalign to align ST datasets of the mouse brain to the 3D common coordinate framework from the Allen Brain Atlas, we highlight how STalign can be used to lift over brain region annotations and enable the interrogation of compositional heterogeneity across anatomical structures. STalign is available as an open-source Python toolkit at https://github.com/JEFworks-Lab/STalign and as Supplementary Software with additional documentation and tutorials available at https://jef.works/STalign.
-
Abstract Spatially resolved gene expression profiling provides insight into tissue organization and cell–cell crosstalk; however, sequencing-based spatial transcriptomics (ST) lacks single-cell resolution. Current ST analysis methods require single-cell RNA sequencing data as a reference for rigorous interpretation of cell states, mostly do not use associated histology images and are not capable of inferring shared neighborhoods across multiple tissues. Here we present Starfysh, a computational toolbox using a deep generative model that incorporates archetypal analysis and any known cell type markers to characterize known or new tissue-specific cell states without a single-cell reference. Starfysh improves the characterization of spatial dynamics in complex tissues using histology images and enables the comparison of niches as spatial hubs across tissues. Integrative analysis of primary estrogen receptor (ER)-positive breast cancer, triple-negative breast cancer (TNBC) and metaplastic breast cancer (MBC) tissues led to the identification of spatial hubs with patient- and disease-specific cell type compositions and revealed metabolic reprogramming shaping immunosuppressive hubs in aggressive MBC.
-
Seeds, which provide a major source of calories for humans, are a unique stage of a flowering plant’s lifecycle. During seed germination the embryo reactivates rapidly and goes through major developmental transitions to become a seedling. This requires extensive and complex spatiotemporal coordination of cell and tissue activity. Existing gene expression profiling methods, such as laser capture microdissection followed by RNA-seq and single-cell RNA-seq, suffer from either low throughput or the loss of spatial information about the cells analysed. Spatial transcriptomics methods couple high throughput analysis of gene expression simultaneously with the ability to record the spatial location of each individual region analysed. We developed a spatial transcriptomics workflow for germinating barley grain to better understand the spatiotemporal control of gene expression within individual seed cell types. More than 14,000 genes were differentially regulated across 0, 1, 3, 6 and 24 hours after imbibition. This approach enabled us to observe that many functional categories displayed specific spatial expression patterns that could be resolved at a sub-tissue level. Individual aquaporin gene family members, important for water and ion transport, had specific spatial expression patterns over time, as well as genes related to cell wall modification, membrane transport and transcription factors. Using spatial autocorrelation algorithms, we were able to identify auxin transport genes that had increasingly focused expression within subdomains of the embryo over germination time, suggestive of a role in establishment of the embryo axis. Together, our data provides an unprecedented spatially resolved cellular map for barley grain germination and specific genes to target for functional genomics to define cellular restricted processes in tissues during germination. The data can be viewed at https://spatial.latrobe.edu.au/.
-
Abstract Spatial transcriptomics (ST) technologies measure gene expression at thousands of locations within a two-dimensional tissue slice, enabling the study of spatial gene expression patterns. Spatial variation in gene expression is characterized by spatial gradients, or the collection of vector fields describing the direction and magnitude in which the expression of each gene increases. However, the few existing methods that learn spatial gradients from ST data either make restrictive and unrealistic assumptions on the structure of the spatial gradients or do not accurately model discrete transcript locations/counts. We introduce SLOPER (for Score-based Learning Of Poisson-modeled Expression Rates), a generative model for learning spatial gradients (vector fields) from ST data. SLOPER models the spatial distribution of mRNA transcripts with an inhomogeneous Poisson point process (IPPP) and uses score matching to learn spatial gradients for each gene. SLOPER utilizes the learned spatial gradients in a novel diffusion-based sampling approach to enhance the spatial coherence and specificity of the observed gene expression measurements. We demonstrate that the spatial gradients and enhanced gene expression representations learned by SLOPER lead to more accurate identification of tissue organization, spatially variable gene modules, and continuous axes of spatial variation (isodepth) compared to existing methods. Software availability: SLOPER is available at https://github.com/chitra-lab/SLOPER.
