Search for: All records

Creators/Authors contains: "Schmitz, Robert J"


  1. Free, publicly-accessible full text available July 1, 2023
  2. Abstract

    Machine learning approaches have been applied to identify transcription factor (TF)–DNA interactions important for gene regulation and expression. However, due to the enormous search space of the genome, it is challenging to build models capable of surveying entire reference genomes, especially in species where models were not trained. In this study, we surveyed a variety of methods for classification of epigenomics data in an attempt to improve the detection for 12 members of the auxin response factor (ARF)-binding DNAs from maize and soybean as assessed by DNA Affinity Purification and sequencing (DAP-seq). We used the classification for prediction by minimizing the genome search space by only surveying unmethylated regions (UMRs). For identification of DAP-seq-binding events within the UMRs, we achieved 78.72 % accuracy rate across 12 members of ARFs of maize on average by encoding DNA with count vectorization for k-mer with a logistic regression classifier with up-sampling and feature selection. Importantly, feature selection helps to uncover known and potentially novel ARF-binding motifs. This demonstrates an independent method for identification of TF-binding sites. Finally, we tested the model built with maize DAP-seq data and applied it directly to the soybean genome and found high false-negative rates, which accounted for more than 40 % across the ARF TFs tested. The findings in this study suggest the potential use of various methods to predict TF–DNA interactions within and between species with varying degrees of success.
  3. Abstract The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
  4. Abstract Epialleles are meiotically heritable variations in expression states that are independent from changes in DNA sequence. Although they are common in plant genomes, their molecular origins are unknown. Here we show, using mutant and experimental populations, that epialleles in Arabidopsis thaliana that result from ectopic hypermethylation are due to feedback regulation of pathways that primarily function to maintain DNA methylation at heterochromatin. Perturbations to maintenance of heterochromatin methylation lead to feedback regulation of DNA methylation in genes. Using single base resolution methylomes from epigenetic recombinant inbred lines (epiRIL), we show that epiallelic variation is abundant in euchromatin, yet associates with QTL primarily in heterochromatin regions. Mapping three-dimensional chromatin contacts shows that genes that are hotspots for ectopic hypermethylation have increases in contact frequencies with regions possessing H3K9me2. Altogether, these data show that feedback regulation of pathways that have evolved to maintain heterochromatin silencing leads to the origins of spontaneous hypermethylated epialleles.
  5. Abstract Background: Regulation of chromatin accessibility and transcription are tightly coordinated processes. Studies in yeast and higher eukaryotes have described accessible chromatin regions, but little work has been done in filamentous fungi. Results: Here we present a genome-scale characterization of accessible chromatin regions in Neurospora crassa, which revealed characteristic molecular features of accessible and inaccessible chromatin. We present experimental evidence of inaccessibility within heterochromatin regions in Neurospora, and we examine features of both accessible and inaccessible chromatin, including the presence of histone modifications, types of transcription, transcription factor binding, and relative nucleosome turnover rates. Chromatin accessibility is not strictly correlated with expression level. Accessible chromatin regions in the model filamentous fungus Neurospora are characterized by the presence of H3K27 acetylation and commonly associated with pervasive non-coding transcription. Conversely, methylation of H3 lysine-36 catalyzed by ASH1 is correlated with inaccessible chromatin within promoter regions. Conclusions: In N. crassa, H3K27 acetylation is the most predictive histone modification for open chromatin. Conversely, our data show that H3K36 methylation is a key marker of inaccessible chromatin in gene-rich regions of the genome. Our data are consistent with an expanded role for H3K36 methylation in intergenic regions of filamentous fungi compared to the model yeasts, S. cerevisiae and S. pombe, which lack homologs of the ASH1 methyltransferase.
  6. Hufford, M (Ed.)
    Abstract Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods. These issues necessitate complementary approaches to add additional confidence and rectify potential misannotations. Integration of epigenomic data into genome annotation is one such approach. In this study, we utilized sets of histone modification data, which are precisely distributed at either gene bodies or promoters, to evaluate the annotation of the Zea mays genome. We leveraged these data genome wide, allowing for identification of annotations discordant with empirical data. In total, 13,159 annotation discrepancies were found in Z. mays upon integrating data across three different tissues, which were corroborated using RNA-based approaches. Upon correction, genes were extended by an average of 2128 base pairs, and we identified 2529 novel genes. Application of this method to five additional plant genomes identified a series of misannotations, as well as identified novel genes, including 13,836 in Asparagus officinalis, 2724 in Setaria viridis, 2446 in Sorghum bicolor, 8631 in Glycine max, and 2585 in Phaseolus vulgaris. This study demonstrates that histone modification data can be leveraged to rapidly improve current genome annotations across diverse plant lineages.
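
Illustrative note on the ARF-binding abstract above: that abstract describes encoding DNA by k-mer count vectorization and balancing classes by up-sampling before fitting a logistic regression classifier. The sketch below shows only those two preprocessing steps, in plain Python. It is not the authors' code; the function names (`kmer_vocabulary`, `kmer_count_vector`, `upsample_minority`), the choice of k, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import random


def kmer_vocabulary(k):
    """Return all DNA k-mers over A/C/G/T in lexicographic order."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def kmer_count_vector(seq, k, vocab=None):
    """Count overlapping k-mers in seq and return counts in vocabulary order.

    k-mers containing characters outside A/C/G/T (for example N) are ignored,
    because they are not part of the vocabulary.
    """
    vocab = vocab if vocab is not None else kmer_vocabulary(k)
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[kmer] for kmer in vocab]


def upsample_minority(X, y, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_label.values())
    X_out, y_out = [], []
    for label, rows in by_label.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: one "bound" region (label 1) and two "unbound" regions (label 0).
sequences = ["ACGTAC", "TTTTTT", "GGGCCC"]
labels = [1, 0, 0]
X = [kmer_count_vector(s, k=2) for s in sequences]
X_balanced, y_balanced = upsample_minority(X, labels)
```

The balanced count vectors could then be passed to any classifier; the abstract names logistic regression with feature selection, which is left out here to keep the sketch dependency-free.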