Title: Data-driven structural analysis of small cell lung cancer transcription factor network suggests potential subtype regulators and transition pathways
Abstract

Small cell lung cancer (SCLC) is an aggressive disease that is challenging to treat because of its mixture of transcriptional subtypes and the transitions between them. Transcription factor (TF) networks have been the focus of studies that identify SCLC subtype regulators via systems approaches. Yet their structures, which can provide clues about subtype drivers and transitions, have barely been investigated. Here, we analyze the structure of an SCLC TF network using graph theory concepts and identify its structurally important components responsible for complex signal processing, called hubs. We show that the hubs of the network are regulators of different SCLC subtypes by first analyzing the unbiased network structure and then integrating RNA-seq data as weights assigned to each interaction. The data-driven analysis emphasizes MYC as a hub, consistent with recent reports. Furthermore, we hypothesize that the pathways connecting functionally distinct hubs may control subtype transitions. We test this hypothesis via network simulations on a candidate pathway and observe a subtype transition. Overall, structural analyses of complex networks can identify their functionally important components and the pathways driving the network dynamics. Such analyses can be an initial step for generating hypotheses and can guide the discovery of target pathways whose perturbation may change the network dynamics phenotypically.
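
The contrast between the unweighted and the data-weighted analysis can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state how hubs are scored, so the sketch assumes a simple proxy (the summed absolute weight of all edges touching a node). The node names, edges, weights, and the helper functions `node_strength` and `top_hubs` are all hypothetical.

```python
from collections import defaultdict


def node_strength(edges):
    """Return, for each node, the summed absolute weight of its in- and out-edges.

    `edges` is an iterable of (source, target, weight) tuples.
    This is an assumed hub score, not the scoring used in the paper.
    """
    strength = defaultdict(float)
    for source, target, weight in edges:
        strength[source] += abs(weight)
        strength[target] += abs(weight)
    return dict(strength)


def top_hubs(edges, k=2):
    """Return the k nodes with the highest strength (ties broken by name)."""
    strength = node_strength(edges)
    return sorted(strength, key=lambda node: (-strength[node], node))[:k]


# Hypothetical toy network: every interaction counts equally (unbiased structure).
toy_edges = [
    ("TF_A", "TF_B", 1.0),
    ("TF_A", "TF_C", 1.0),
    ("TF_A", "TF_D", 1.0),
    ("TF_B", "TF_C", 1.0),
    ("TF_D", "TF_E", 1.0),
]

# Same topology with made-up weights standing in for expression-derived weights.
weighted_edges = [
    ("TF_A", "TF_B", 0.2),
    ("TF_A", "TF_C", 0.1),
    ("TF_A", "TF_D", 0.3),
    ("TF_B", "TF_C", 2.0),
    ("TF_D", "TF_E", 2.5),
]

unweighted_hubs = top_hubs(toy_edges)
weighted_hubs = top_hubs(weighted_edges)
```

In this toy case the best-connected node leads the unweighted ranking, while the weighted ranking promotes nodes on heavily weighted interactions. This mirrors, in a simplified way, how adding data as weights can emphasize a different hub.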

 
NSF-PAR ID:
10471986
Author(s) / Creator(s):
;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
npj Systems Biology and Applications
Volume:
9
Issue:
1
ISSN:
2056-7189
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: In the learning sciences, heterogeneity among students usually leads to different learning strategies or patterns and may require different types of instructional interventions. Therefore, it is important to investigate student subtyping, that is, grouping students into subtypes based on their learning patterns. Subtyping from complex student learning processes is often challenging because of information heterogeneity and temporal dynamics. Various inverse reinforcement learning (IRL) algorithms have been successfully employed in many domains for inducing policies from trajectories and have recently been applied to analyze students’ temporal logs to identify their domain knowledge patterns. IRL was originally designed to model the data by assuming that all trajectories share a single pattern or strategy. Because of the heterogeneity among students, their strategies can vary greatly, and the design of traditional IRL may lead to suboptimal performance. In this paper, we applied a novel expectation-maximization IRL (EM-IRL) to extract heterogeneous learning strategies from sequential data collected from three simulation environments and from real-world longitudinal student logs. Experiments on the simulation environments showed that EM-IRL can successfully identify different policies from heterogeneous sequences with different strategies. Furthermore, experimental results from our educational dataset showed that EM-IRL can be used to obtain different student subtypes: a “learning-oriented” subtype, who learned as much of the material as possible regardless of time, in that they spent significantly more time than the other two subtypes and learned significantly; an “efficient-oriented” subtype, who learned efficiently, in that they not only learned significantly but also spent less time than the first subtype; and a “no learning” subtype, who spent less time than the first subtype and failed to learn.
  2. Abstract

    Type VIIb secretion systems (T7SSb) in Gram‐positive bacteria facilitate physiology, interbacterial competition, and/or virulence via EssC ATPase‐driven secretion of small α‐helical proteins and toxins. Recently, we characterized T7SSb in group B Streptococcus (GBS), a leading cause of infection in newborns and immunocompromised adults. GBS T7SS comprises four subtypes based on variation in the C‐terminus of EssC and the repertoire of downstream effectors; however, the intraspecies diversity of GBS T7SS and its impact on GBS‐host interactions remain unknown. Bioinformatic analysis indicates that GBS T7SS loci encode subtype‐specific putative effectors, which have low interspecies and inter‐subtype homology but contain similar domains/motifs and therefore may serve similar functions. We further identify orphaned GBS WXG100 proteins. Functionally, we show that GBS T7SS subtype I and III strains secrete EsxA in vitro and that in subtype I strain CJB111, esxA1 appears to be differentially transcribed from the T7SS operon. Furthermore, we observe subtype‐specific effects of GBS T7SS on host colonization, as CJB111 subtype I but not CNCTC 10/84 subtype III T7SS promotes GBS vaginal colonization. Finally, we observe that T7SS subtypes I and II are the predominant subtypes in clinical GBS isolates. This study highlights the potential impact of T7SS heterogeneity on host‐GBS interactions.

     
  3. Abstract Aims

    Dissecting complex interactions among transcription factors (TFs), microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is central to understanding heart development and function. Although computational approaches and platforms have been described to infer relationships among regulatory factors and genes, current approaches do not adequately account for how highly diverse, interacting regulators that include noncoding RNAs (ncRNAs) control cardiac gene expression dynamics over time.

    Methods

    To overcome this limitation, we devised an integrated framework, cardiac gene regulatory modeling (CGRM), that combines the LogicTRN and regulatory component analysis bioinformatics modeling platforms to infer complex regulatory mechanisms. We then used CGRM to identify and compare the TF-ncRNA gene regulatory networks that govern early- and late-stage cardiomyocytes (CMs) generated by in vitro differentiation of human pluripotent stem cells (hPSC) and ventricular and atrial CMs isolated during in vivo human cardiac development.

    Results

    Comparisons of in vitro versus in vivo derived CMs revealed conserved regulatory networks among TFs and ncRNAs in early cells that significantly diverged in late-stage cells. We report that cardiac genes (“heart targets”) expressed in early-stage hPSC-CMs are primarily regulated by MESP1, miR-1, miR-23, and the lncRNAs NEAT1 and MALAT1, while GATA6, HAND2, miR-200c, NEAT1 and MALAT1 are critical for late hPSC-CMs. The inferred TF-miRNA-lncRNA networks regulating heart development and contraction were similar among early-stage CMs, among individual hPSC-CM datasets and between in vitro and in vivo samples. However, genes related to apoptosis, cell cycle and proliferation, and transmembrane transport showed a high degree of divergence between in vitro and in vivo derived late-stage CMs. Overall, late-stage, but not early-stage, CMs diverged greatly in the expression of “heart target” transcripts and their regulatory mechanisms.

    Conclusions

    In conclusion, we find that hPSC-CMs are regulated in a cell-autonomous manner during early development, and that this regulation diverges significantly over time when compared to in vivo derived CMs. These findings demonstrate the feasibility of using CGRM to reveal dynamic and complex transcriptional and posttranscriptional regulatory interactions that underlie cell-directed versus environment-dependent CM development. These results with in vitro versus in vivo derived CMs thus establish this approach for detailed analyses of heart disease and for the analysis of cell regulatory systems in other biomedical fields.

     
  4. Drought is one of the most serious abiotic stressors in the environment, restricting agricultural production by reducing plant growth, development, and productivity. Investigating such a complex and multifaceted stressor and its effects on plants requires a systems biology-based approach that entails the generation of co-expression networks, identification of high-priority transcription factors (TFs), dynamic mathematical modeling, and computational simulations. Here, we studied a high-resolution drought transcriptome of Arabidopsis. We identified distinct temporal transcriptional signatures and demonstrated the involvement of specific biological pathways. Generation of a large-scale co-expression network followed by network centrality analyses identified 117 TFs that possess critical properties of hubs, bottlenecks, and high clustering coefficient nodes. Dynamic transcriptional regulatory modeling of integrated TF targets and transcriptome datasets uncovered major transcriptional events during the course of drought stress. Mathematical transcriptional simulations allowed us to ascertain the activation status of major TFs, as well as the transcriptional intensity and amplitude of their target genes. Finally, we validated our predictions by providing experimental evidence of gene expression under drought stress for a set of four TFs and their major target genes using qRT-PCR. Taken together, we provided a systems-level perspective on the dynamic transcriptional regulation during drought stress in Arabidopsis and uncovered numerous novel TFs that could potentially be used in future genetic crop engineering programs.
  5. Abstract Skin disorders are one of the most common complications of type II diabetes (T2DM). Long-term effects of high blood glucose leave individuals with T2DM more susceptible to cutaneous diseases, but the underlying molecular mechanisms are unclear. Network-based methods consider the complex interactions between genes, which can complement the single-gene analyses of previous research. Here, we use network analysis and topological properties to systematically investigate dysregulated gene co-expression patterns in type II diabetic skin with skin samples from the Genotype-Tissue Expression database. Our final network consisted of 8812 genes from 73 subjects with T2DM and 147 non-T2DM subjects matched for age, sex, and race. Two gene modules significantly related to T2DM were functionally enriched in the lipid metabolism pathway, activated by PPARA and SREBF (SREBP). Transcription factors KLF10, KLF4, SP1, and microRNA-21 were predicted to be important regulators of gene expression in these modules. Intramodular analysis and betweenness centrality identified NCOA6 as the hub gene, while KHSRP and SIN3B are key coordinators that influence molecular activities differently between T2DM and non-T2DM populations. We built a TF-miRNA-mRNA regulatory network to reveal the novel mechanism (miR-21-PPARA-NCOA6) of dysregulated keratinocyte proliferation, differentiation, and migration in diabetic skin, which may provide new insights into the susceptibility of skin disorders in T2DM patients. Hub genes and key coordinators may serve as therapeutic targets to improve diabetic skincare.