Abstract The spectacular radiation of insects has produced a stunning diversity of phenotypes. During the past 250 years, research on insect systematics has generated hundreds of terms for naming and comparing them. In its current form, this terminological diversity is presented in natural language and lacks formalization, which prohibits computer-assisted comparison using semantic web technologies. Here we propose a Model for Describing Cuticular Anatomical Structures (MoDCAS) which incorporates structural properties and positional relationships for standardized, consistent, and reproducible descriptions of arthropod phenotypes. We applied the MoDCAS framework in creating the ontology for the Anatomy of the Insect Skeleto-Muscular system (AISM). The AISM is the first general insect ontology that aims to cover all taxa by providing generalized, fully logical, and queryable, definitions for each term. It was built using the Ontology Development Kit (ODK), which maximizes interoperability with Uberon (Uberon multi-species anatomy ontology) and other basic ontologies, enhancing the integration of insect anatomy into the broader biological sciences. A template system for adding new terms, extending, and linking the AISM to additional anatomical, phenotypic, genetic, and chemical ontologies is also introduced. The AISM is proposed as the backbone for taxon-specific insect ontologies and has potential applications spanning systematic biology and biodiversity informatics, allowing users to (1) use controlled vocabularies and create semi-automated computer-parsable insect morphological descriptions; (2) integrate insect morphology into broader fields of research, including ontology-informed phylogenetic methods, logical homology hypothesis testing, evo-devo studies, and genotype to phenotype mapping; and (3) automate the extraction of morphological data from the literature, enabling the generation of large-scale phenomic data, by facilitating the production and testing of informatic tools able to extract, link, annotate, and process morphological data. This descriptive model and its ontological applications will allow for clear and semantically interoperable integration of arthropod phenotypes in biodiversity studies. 
                        more » 
                        « less   
                    
                            
                            The Ontology of Biological Attributes (OBA)—computational traits for the life sciences
                        
                    
    
            Abstract Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focussed measurable trait data. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1759808
- PAR ID:
- 10420106
- Author(s) / Creator(s):
- ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; more »
- Date Published:
- Journal Name:
- Mammalian Genome
- ISSN:
- 0938-8990
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Mapping the genetic basis of complex traits is critical to uncovering the biological mechanisms that underlie disease and other phenotypes. Genome-wide association studies (GWAS) in humans and quantitative trait locus (QTL) mapping in model organisms can now explain much of the observed heritability in many traits, allowing us to predict phenotype from genotype. However, constraints on power due to statistical confounders in large GWAS and smaller sample sizes in QTL studies still limit our ability to resolve numerous small-effect variants, map them to causal genes, identify pleiotropic effects across multiple traits, and infer non-additive interactions between loci (epistasis). Here, we introduce barcoded bulk quantitative trait locus (BB-QTL) mapping, which allows us to construct, genotype, and phenotype 100,000 offspring of a budding yeast cross, two orders of magnitude larger than the previous state of the art. We use this panel to map the genetic basis of eighteen complex traits, finding that the genetic architecture of these traits involves hundreds of small-effect loci densely spaced throughout the genome, many with widespread pleiotropic effects across multiple traits. Epistasis plays a central role, with thousands of interactions that provide insight into genetic networks. By dramatically increasing sample size, BB-QTL mapping demonstrates the potential of natural variants in high-powered QTL studies to reveal the highly polygenic, pleiotropic, and epistatic architecture of complex traits.more » « less
- 
            null (Ed.)Quantification of the simultaneous contributions of loci to multiple traits, a phenomenon called pleiotropy, is facilitated by the increased availability of high-throughput genotypic and phenotypic data. To understand the prevalence and nature of pleiotropy, the ability of multivariate and univariate genome-wide association study (GWAS) models to distinguish between pleiotropic and non-pleiotropic loci in linkage disequilibrium (LD) first needs to be evaluated. Therefore, we used publicly available maize and soybean genotypic data to simulate multiple pairs of traits that were either (i) controlled by quantitative trait nucleotides (QTNs) on separate chromosomes, (ii) controlled by QTNs in various degrees of LD with each other, or (iii) controlled by a single pleiotropic QTN. We showed that multivariate GWAS could not distinguish between QTNs in LD and a single pleiotropic QTN. In contrast, a unique QTN detection rate pattern was observed for univariate GWAS whenever the simulated QTNs were in high LD or pleiotropic. Collectively, these results suggest that multivariate and univariate GWAS should both be used to infer whether or not causal mutations underlying peak GWAS associations are pleiotropic. Therefore, we recommend that future studies use a combination of multivariate and univariate GWAS models, as both models could be useful for identifying and narrowing down candidate loci with potential pleiotropic effects for downstream biological experiments.more » « less
- 
            Qu, Li-Jia (Ed.)Pleiotropy—when a single gene controls two or more seemingly unrelated traits—has been shown to impact genes with effects on flowering time, leaf architecture, and inflorescence morphology in maize. However, the genome-wide impact of biological pleiotropy across all maize phenotypes is largely unknown. Here, we investigate the extent to which biological pleiotropy impacts phenotypes within maize using GWAS summary statistics reanalyzed from previously published metabolite, field, and expression phenotypes across the Nested Association Mapping population and Goodman Association Panel. Through phenotypic saturation of 120,597 traits, we obtain over 480 million significant quantitative trait nucleotides. We estimate that only 1.56–32.3% of intervals show some degree of pleiotropy. We then assess the relationship between pleiotropy and various biological features such as gene expression, chromatin accessibility, sequence conservation, and enrichment for gene ontology terms. We find very little relationship between pleiotropy and these variables when compared to permuted pleiotropy. We hypothesize that biological pleiotropy of common alleles is not widespread in maize and is highly impacted by nuisance terms such as population structure and linkage disequilibrium. Natural selection on large standing natural variation in maize populations may target wide and large effect variants, leaving the prevalence of detectable pleiotropy relatively low.more » « less
- 
            In recent years, a bioinformatics method for interpreting genome-wide association study (GWAS) data using metabolic pathway analysis has been developed and successfully used to find significant pathways and mechanisms explaining phenotypic traits of interest in plants. However, the many scripts implementing this method were not straightforward to use, had to be customized for each project, required user supervision, and took more than 24 h to process data. PAST (Pathway Association Study Tool), a new implementation of this method, has been developed to address these concerns. PAST has been implemented as a package for the R language. Two user-interfaces are provided; PAST can be run by loading the package in R and calling its methods, or by using an R Shiny guided user interface. In testing, PAST completed analyses in approximately half an hour to one hour by processing data in parallel and produced the same results as the previously developed method. PAST has many user-specified options for maximum customization. Thus, to promote a powerful new pathway analysis methodology that interprets GWAS data to find biological mechanisms associated with traits of interest, we developed a more accessible, efficient, and user-friendly tool. These attributes make PAST accessible to researchers interested in associating metabolic pathways with GWAS datasets to better understand the genetic architecture and mechanisms affecting phenotypes.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    