
Title: rTASSEL: an R interface to TASSEL for association mapping of complex traits
Summary
Efficient tools and applications for analyzing genomic diversity are essential for any genetics research program. One such tool, TASSEL (Trait Analysis by aSSociation, Evolution and Linkage), provides many core methods for genomic analyses. Despite its efficiency, TASSEL offers limited means of using scripting languages for reproducible research and of interacting with other analytical tools. Here we present rTASSEL, an R package that provides a front-end to a variety of widely used TASSEL methods and analytical tools. The goal of this package is to create a unified scripting workflow that combines the analytical power of TASSEL with R’s popular data handling and parsing capabilities, without requiring the user to switch between the two environments. With this workflow, we achieve performance approximately 2 to 20 times faster than other widely used R packages for various functionalities.
Availability and implementation
rTASSEL is implemented in R using core TASSEL methods written in Java. The source code for rTASSEL can be found at The source code for TASSEL can be found at
Sponsoring Org:
National Science Foundation
More Like this
  1. Alkan, Can (Ed.)
    Abstract Motivation

    Genome browsers are an essential tool in genome analysis. Modern genome browsers enable complex and interactive visualization of a wide variety of genomic data modalities. While such browsers are very powerful, they can be challenging to configure and program for bioinformaticians lacking expertise in web development.


    We have developed an R package that provides an interface to the JBrowse 2 genome browser. The package can be used to configure and customize the browser entirely with R code. The browser can be deployed from the R console, or embedded in Shiny applications or R Markdown documents.

    Availability and implementation

    JBrowseR is available for download from CRAN, and the source code is openly available from the GitHub repository at

  2. Abstract

    Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps in studying regulated transcription and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5′ end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analyzing, and visualizing TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inferring core promoters, both of which are prerequisites for subsequent functional analyses of TSS data. Furthermore, TSSr enables users to export various types of TSS data that can be visualized in a genome browser to inspect promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at

  3. Many previous studies have shown that open-source technologies help democratize information and foster collaborations that address global physical and societal challenges. The outbreak of the novel coronavirus has imposed unprecedented challenges on human society. It affects every aspect of livelihood, including health, environment, transportation, and economy. Open-source technologies provide a new ray of hope for tackling the pandemic collaboratively. The role of open source is not limited to sharing source code. Rather, open-source projects can be adopted as a software development approach that encourages collaboration among researchers. Open collaboration creates a positive impact in society and helps combat the pandemic effectively. Open-source technology integrated with geospatial information allows decision-makers to make strategic and informed decisions. It also assists them in determining the type of intervention needed based on geospatial information. The novelty of this paper is that it standardizes the open-source workflow for spatiotemporal research. The highlights of the open-source workflow include sharing data, analytical tools, spatiotemporal applications, and results, and formalizing open-source software development. The workflow includes (i) developing open-source spatiotemporal applications, (ii) opening and sharing the spatiotemporal resources, and (iii) replicating the research in a plug-and-play fashion. Open data, open analytical tools and source code, and publicly accessible results form the foundation for this workflow. This paper also presents a case study of open-source spatiotemporal application development for air quality analysis in California, USA. In addition to developing the application, we shared the spatiotemporal data, source code, and research findings through the GitHub repository.
  4. Grueber, Catherine E (Ed.)

    Landscape genomics can harness environmental and genetic data to inform conservation decisions by providing essential insights into how landscapes shape biodiversity. The massive increase in genetic data afforded by the genomic era provides exceptional resolution for answering critical conservation genetics questions. The accessibility of genomic data for non‐model systems has also enabled a shift away from population‐based sampling to individual‐based sampling, which now provides accurate and robust estimates of genetic variation that can be used to examine the spatial structure of genomic diversity, population connectivity and the nature of environmental adaptation. Nevertheless, the adoption of individual‐based sampling in conservation genetics has been slowed due, in large part, to concerns over how to apply methods developed for population‐based sampling to individual‐based sampling schemes. Here, we discuss the benefits of individual‐based sampling for conservation and describe how landscape genomic methods, paired with individual‐based sampling, can answer fundamental conservation questions. We have curated key landscape genomic methods into a user‐friendly, open‐source workflow, which we provide as a new R package, A Landscape Genomics Analysis Toolkit in R (algatr). The algatr package includes novel added functionality for all of the included methods and extensive vignettes designed with the primary goal of making landscape genomic approaches more accessible and explicitly applicable to conservation biology.

  5. Abstract

    Gene co‐expression analysis is an effective method to detect groups (or modules) of co‐expressed genes that display similar expression patterns, which may function in the same biological processes. Here, we present “Simple Tidy GeneCoEx”, a gene co‐expression analysis workflow written in the R programming language. The workflow is highly customizable across multiple stages of the pipeline including gene selection, edge selection, clustering resolution, and data visualization. Powered by the tidyverse package ecosystem and network analysis functions provided by the igraph package, the workflow detects gene co‐expression modules whose members are highly interconnected. Step‐by‐step instructions with two use case examples as well as source code are available at
