Creators/Authors contains: "Clifton, Kalen"

  1. Abstract

    Background

    Recent advances in imaging-based spatially resolved transcriptomics (im-SRT) technologies now enable high-throughput profiling of targeted genes and their locations in fixed tissues. Normalization of gene expression data is often needed to account for technical factors that may confound underlying biological signals.

    Results

    Here, we investigate the potential impact of different gene count normalization methods with different targeted gene panels in the analysis and interpretation of im-SRT data. Using different simulated gene panels that overrepresent genes expressed in specific tissue regions or cell types, we demonstrate how normalization methods based on detected gene counts per cell differentially impact normalized gene expression magnitudes in a region- or cell type-specific manner. We show that these normalization-induced effects may reduce the reliability of downstream analyses including differential gene expression, gene fold change, and spatially variable gene analysis, introducing false positive and false negative results when compared to results obtained from gene panels that are more representative of the gene expression of the tissue’s component cell types. These effects are not observed with normalization approaches that do not use detected gene counts for gene expression magnitude adjustment, such as with cell volume or cell area normalization.

    Conclusions

    We recommend using non-gene count-based normalization approaches when feasible and evaluating gene panel representativeness before using gene count-based normalization methods if necessary. Overall, we caution that the choice of normalization method and gene panel may impact the biological interpretation of the im-SRT data.
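A minimal sketch of the effect described above, using made-up toy numbers rather than data or code from the study. The function names `normalize_by_total_counts` and `normalize_by_area` are hypothetical, and the example assumes every cell has at least one detected count. Two cells of equal area express gene 0 at the same raw level, but the panel also contains two genes specific to cell B:

```python
import numpy as np

def normalize_by_total_counts(counts):
    """Divide each cell's counts by that cell's total detected counts.

    Assumes every cell (row) has a non-zero total.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    return counts / totals

def normalize_by_area(counts, areas):
    """Divide each cell's counts by that cell's area."""
    counts = np.asarray(counts, dtype=float)
    areas = np.asarray(areas, dtype=float).reshape(-1, 1)
    return counts / areas

# Toy panel (hypothetical): rows are cells, columns are genes.
# Gene 0 has the same raw count in both cells; genes 1 and 2 are
# detected only in cell B, so the panel overrepresents cell B.
counts = np.array([
    [10, 0, 0],    # cell A
    [10, 20, 20],  # cell B
])
areas = np.array([50.0, 50.0])  # equal cell areas, arbitrary units

by_total = normalize_by_total_counts(counts)
by_area = normalize_by_area(counts, areas)

# Gene 0 after each normalization, for cell A and cell B.
print("total-count normalized gene 0:", by_total[:, 0])
print("area normalized gene 0:       ", by_area[:, 0])
```

In this toy case, total-count normalization makes gene 0 look fivefold higher in cell A than in cell B purely because of panel composition, while area normalization leaves the two cells equal.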

     
  2. Abstract

    Motivation

    Spatial omics data demand computational analysis, but many analysis tools have computational resource requirements that increase with the number of cells analyzed. This presents scalability challenges as researchers use spatial omics technologies to profile millions of cells.

    Results

    To enhance the scalability of spatial omics data analysis, we developed a rasterization preprocessing framework called SEraster that aggregates cellular information into spatial pixels. We apply SEraster to both real and simulated spatial omics data prior to spatial variable gene expression analysis to demonstrate that such preprocessing can reduce computational resource requirements while maintaining high performance, including as compared to other down-sampling approaches. We further integrate SEraster with existing analysis tools to characterize cell-type spatial co-enrichment across length scales. Finally, we apply SEraster to enable analysis of a mouse pup spatial omics dataset with over a million cells to identify tissue-level and cell-type-specific spatially variable genes as well as spatially co-enriched cell types that recapitulate expected organ structures.

    Availability and implementation

    SEraster is implemented as an R package on GitHub (https://github.com/JEFworks-Lab/SEraster) with additional tutorials at https://JEF.works/SEraster.
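The general idea of aggregating cells into spatial pixels can be sketched as follows. This is an illustrative Python sketch, not SEraster's R API: the function `rasterize`, its parameters, and the toy coordinates are all assumptions, and it averages per-cell values within square pixels of a chosen side length.

```python
import numpy as np

def rasterize(xy, values, resolution):
    """Aggregate per-cell values into square pixels of side `resolution`.

    xy:     (n_cells, 2) array of cell centroid coordinates
    values: (n_cells, n_genes) array of per-cell values
    Returns (pixel_indices, pixel_means, cells_per_pixel) for occupied pixels.
    """
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    # Integer (column, row) pixel index of each cell, relative to the minimum corner.
    idx = np.floor((xy - xy.min(axis=0)) / resolution).astype(int)
    # Occupied pixels, plus the pixel membership of each cell.
    pixels, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_cells = np.bincount(inverse, minlength=len(pixels))
    # Sum values per pixel, then divide by the number of cells in that pixel.
    sums = np.zeros((len(pixels), values.shape[1]))
    np.add.at(sums, inverse, values)
    return pixels, sums / n_cells[:, None], n_cells

# Toy data (hypothetical): five cells, one gene, pixel side length of 10 units.
xy = np.array([[0, 0], [1, 1], [12, 0], [13, 1], [0, 15]])
values = np.array([[1.0], [3.0], [10.0], [20.0], [7.0]])
pixels, means, n_cells = rasterize(xy, values, resolution=10)
print(pixels, means, n_cells)
```

Downstream analysis then runs on the occupied pixels instead of individual cells, so its cost scales with the number of pixels rather than the number of cells.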

     
  3. Abstract

    Spatial transcriptomics (ST) technologies enable high throughput gene expression characterization within thin tissue sections. However, comparing spatial observations across sections, samples, and technologies remains challenging. To address this challenge, we develop STalign to align ST datasets in a manner that accounts for partially matched tissue sections and other local non-linear distortions using diffeomorphic metric mapping. We apply STalign to align ST datasets within and across technologies as well as to align ST datasets to a 3D common coordinate framework. We show that STalign achieves high gene expression and cell-type correspondence across matched spatial locations that is significantly improved over landmark-based affine alignments. Applying STalign to align ST datasets of the mouse brain to the 3D common coordinate framework from the Allen Brain Atlas, we highlight how STalign can be used to lift over brain region annotations and enable the interrogation of compositional heterogeneity across anatomical structures. STalign is available as an open-source Python toolkit at https://github.com/JEFworks-Lab/STalign and as Supplementary Software with additional documentation and tutorials available at https://jef.works/STalign.
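For orientation, the landmark-based affine alignment that the abstract uses as its point of comparison can be sketched with ordinary least squares. This is not STalign's diffeomorphic mapping and does not use the STalign API; the helper names `fit_affine` and `apply_affine` and the landmark coordinates are hypothetical.

```python
import numpy as np

def fit_affine(source_pts, target_pts):
    """Least-squares 2D affine map (A, t) such that target ≈ source @ A.T + t."""
    src = np.asarray(source_pts, dtype=float)
    tgt = np.asarray(target_pts, dtype=float)
    # Homogeneous coordinates: append a column of ones to absorb the translation.
    X = np.hstack([src, np.ones((len(src), 1))])
    # Solve X @ M ≈ tgt for the 3x2 matrix M.
    M, *_ = np.linalg.lstsq(X, tgt, rcond=None)
    A = M[:2].T  # 2x2 linear part
    t = M[2]     # translation vector
    return A, t

def apply_affine(points, A, t):
    """Apply the affine map to an (n, 2) array of points."""
    return np.asarray(points, dtype=float) @ A.T + t

# Toy landmarks (hypothetical): the target is the source rotated 90 degrees
# counterclockwise and shifted by (5, 2).
A_true = np.array([[0.0, -1.0], [1.0, 0.0]])
t_true = np.array([5.0, 2.0])
source = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
target = source @ A_true.T + t_true

A_hat, t_hat = fit_affine(source, target)
aligned = apply_affine(source, A_hat, t_hat)
print(A_hat, t_hat)
```

An affine map applies one global linear transformation to every point, so it cannot by construction capture the local non-linear distortions that the abstract says diffeomorphic mapping is meant to handle.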

     