

Title: SoyCSN: Soybean context‐specific network analysis and prediction based on tissue‐specific transcriptome data
Abstract

The Soybean Gene Atlas project provides a comprehensive map of gene expression patterns in major soybean tissues, including flower, root, leaf, nodule, seed, and shoot and stem. The RNA‐Seq data generated in the project serve as a valuable resource for discovering the tissue‐specific transcriptome behavior of soybean genes. We developed a computational pipeline for Soybean context‐specific network (SoyCSN) inference with a suite of prediction tools to analyze, annotate, retrieve, and visualize soybean context‐specific networks at both the transcriptome and interactome levels. The BicMix and Cross‐Conditions Cluster Detection algorithms were applied to detect modules based on co‐expression relationships across all the tissues. Soybean context‐specific interactomes were predicted by combining soybean tissue gene expression and protein–protein interaction data. Functional analyses of these predicted networks provide insights into soybean tissue specificities. For example, under symbiotic, nitrogen‐fixing conditions, the constructed soybean leaf network highlights the connection between the photosynthesis function and rhizobium–legume symbiosis. SoyCSN data and all its results are publicly available via an interactive web service within the Soybean Knowledge Base (SoyKB) at http://soykb.org/SoyCSN. SoyCSN provides soybean researchers and molecular breeders with useful web‐based access for systematically exploring context specificities in gene regulatory mechanisms and gene relationships.
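
The idea of combining tissue gene expression with protein–protein interaction data can be illustrated with a minimal sketch. It is not the SoyCSN pipeline itself: the filtering rule (keep an interaction only when both partners are expressed in the tissue), the threshold `min_expression`, the function name, and the gene identifiers are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def context_specific_interactome(
    ppi_edges: Iterable[Edge],
    expression: Dict[str, Dict[str, float]],
    tissue: str,
    min_expression: float = 1.0,  # hypothetical cutoff, not taken from the paper
) -> List[Edge]:
    """Keep only the PPI edges whose two genes are both expressed in `tissue`."""
    # Genes whose expression in the chosen tissue meets the cutoff.
    expressed: Set[str] = {
        gene
        for gene, by_tissue in expression.items()
        if by_tissue.get(tissue, 0.0) >= min_expression
    }
    # An interaction is retained only if both partners are expressed.
    return [(a, b) for a, b in ppi_edges if a in expressed and b in expressed]


# Toy data with made-up gene names and expression values.
ppi = [("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneC")]
expr = {
    "geneA": {"leaf": 12.0, "root": 0.2},
    "geneB": {"leaf": 5.5, "root": 8.1},
    "geneC": {"leaf": 0.0, "root": 3.4},
}

leaf_network = context_specific_interactome(ppi, expr, "leaf")
root_network = context_specific_interactome(ppi, expr, "root")
print(leaf_network)
print(root_network)
```

With this toy input, the leaf and root networks retain different subsets of the same global interaction list, which is the sense in which such a network is context specific.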

 
Award ID(s):
1734145
NSF-PAR ID:
10197239
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Plant Direct
Volume:
3
Issue:
9
ISSN:
2475-4455
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Rice is an important cereal crop, being a staple food for over half of the world's population, and sexual reproduction resulting in grain formation underpins global food security. However, despite considerable research efforts, many of the genes, especially long intergenic non‐coding RNA (lincRNA) genes, involved in sexual reproduction in rice remain uncharacterized. With an increasing number of public resources becoming available, information from different sources can be combined to perform gene functional annotation. We report the development of MCRiceRepGP, a machine learning framework which integrates heterogeneous evidence and employs multicriteria decision analysis and machine learning to predict coding and lincRNA genes involved in sexual reproduction in rice. The rice genome was reannotated using deep‐sequencing transcriptomic data from reproduction‐associated tissue/cell types, identifying previously unannotated putative protein‐coding genes and lincRNAs. MCRiceRepGP was used for genome‐wide discovery of sexual reproduction associated coding and lincRNA genes. The protein‐coding and lincRNA genes identified have distinct expression profiles, with a large proportion of lincRNAs reaching maximum expression levels in the sperm cells. Some of the genes are potentially linked to male‐ and female‐specific fertility and heat stress tolerance during the reproductive stage. MCRiceRepGP can be used in combination with other genome‐wide studies, such as genome‐wide association studies, giving greater confidence that the genes identified are associated with the biological process of interest. As more data, especially about mutant plant phenotypes, become available, the power of MCRiceRepGP will grow, providing researchers with a tool to identify candidate genes for future experiments. MCRiceRepGP is available as a web application (http://mcgplannotator.com/MCRiceRepGP/).

     
  2. Martelli, Pier Luigi (Ed.)
    Abstract

    Motivation: Clustering spatially resolved gene expression is an essential analysis for revealing gene activities in the underlying morphological context according to their functional roles. However, conventional clustering analysis considers neither the co-localization of gene expression in tissue, which is needed to detect spatial expression patterns, nor the functional relationships among genes, which are needed for biological interpretation in the spatial context. In this article, we present a convolutional neural network (CNN) regularized by the graph of a protein–protein interaction (PPI) network to cluster spatially resolved gene expression. This method improves the coherence of spatial patterns and provides biological interpretation of the gene clusters in the spatial context by exploiting spatial localization through convolution and gene functional relationships through graph-Laplacian regularization.

    Results: In this study, we tested clustering of the spatially variable genes or all expressed genes in the transcriptome in 22 Visium spatial transcriptomics datasets of different tissue sections publicly available from 10× Genomics and spatialLIBD. The results demonstrate that the PPI-regularized CNN consistently detects gene clusters that have coherent spatial patterns and are significantly enriched for gene functions, with state-of-the-art performance. Additional case studies on mouse kidney tissue and human breast cancer tissue suggest that the PPI-regularized CNN also detects spatially co-expressed genes that define the corresponding morphological context in the tissue, with valuable insights.

    Availability and implementation: Source code is available at https://github.com/kuanglab/CNN-PReg.

    Supplementary information: Supplementary data are available at Bioinformatics online.
  3. Abstract

    Understanding global communications among cells requires accurate representation of cell-cell signaling links and effective systems-level analyses of those links. We construct a database of interactions among ligands, receptors and their cofactors that accurately represent known heteromeric molecular complexes. We then develop CellChat, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets. Applying CellChat to mouse and human skin datasets shows its ability to extract complex signaling patterns. Our versatile and easy-to-use toolkit CellChat and a web-based Explorer (http://www.cellchat.org/) will help discover novel intercellular communications and build cell-cell communication atlases in diverse tissues.

     
  4. Abstract

    Recent findings suggest that tree mortality and post‐drought recovery of gas exchange can be predicted from loss of function within the water transport system. Understanding the susceptibility of plants to hydraulic damage requires knowledge about the vulnerability of different plant organs to stress‐induced hydraulic dysfunction. This is particularly important in the context of vulnerability segmentation between plant tissues which is believed to protect more energetically ‘costly’ tissues, such as woody stems, by sacrificing ‘cheaper’ leaves early under drought conditions.

    Differences in vulnerability segmentation between co‐occurring plant species could explain divergent behaviours during drought, yet there are few studies considering how this characteristic may vary within a plant community. Here we investigated community‐wide vulnerability segmentation by comparing leaf/shoot and stem vulnerability in all coexistent dominant canopy and understory woody species in a diverse dry sclerophyll woodland community, including multiple angiosperms and one gymnosperm.

    Previously published terminal leaf/shoot vulnerability to loss of water transport capacity was compared with stem xylem vulnerability to embolism measured on the same species at the same site. We calculated hydraulic safety margins for stems to determine variation in the risk of hydraulic failure during drought among species.

    The xylem of all species was found to be highly resistant to hydraulic dysfunction, with only two of the eight species exhibiting vulnerability significantly different from the overall mean. No evidence of vulnerability segmentation between shoots/leaves and stems was found in seven of the eight species.

    Phylogenetically diverse canopy and understory species in this evergreen sclerophyll woodland appear to have evolved similar strategies of drought resistance, including low xylem vulnerability to embolism and general lack of vulnerability segmentation. This convergence in hydraulic safety indicates a lack of hydraulic niche partitioning in this woodland community.

    A free plain language summary can be found within the Supporting Information of this article.

     
  5. Abstract

    Legume plants such as soybean produce two major types of root lateral organs: lateral roots and root nodules. A robust computational framework was developed to predict potential gene regulatory networks (GRNs) associated with root lateral organ development in soybean. A genome-scale expression data set was obtained from soybean root nodules and lateral roots and subjected to biclustering using QUBIC (QUalitative BIClustering algorithm). Biclusters and transcription factor (TF) genes with enriched expression in lateral root tissues were analyzed with different network inference algorithms to predict high-confidence regulatory modules that were repeatedly retrieved by different methods. The ranked combination of results from all the network inference algorithms into one ensemble solution identified 21 GRN modules of 182 co-regulated genes, potentially involved in root lateral organ development stages in soybean. The workflow correctly predicted previously known nodule- and lateral root-associated TFs, including the expected hierarchical relationships. The results revealed distinct high-confidence GRN modules associated with early nodule development involving AP2, GRF5 and C3H family TFs, and those associated with nodule maturation involving GRAS, LBD41 and ARR18 family TFs. Knowledge from this work, supported by future experimental validation, is expected to help determine key gene targets for biotechnological strategies to optimize nodule formation and enhance nitrogen fixation.