

Title: FlyBase: updates to the Drosophila genes and genomes database
Abstract

FlyBase (flybase.org) is a model organism database and knowledge base about Drosophila melanogaster, commonly known as the fruit fly. Researchers from around the world rely on the genetic, genomic, and functional information available in FlyBase, as well as its tools to view and interrogate these data. In this article, we describe the latest developments and updates to FlyBase. These include the introduction of single-cell RNA sequencing data, improved content and display of functional information, updated orthology pipelines, new chemical reports, and enhancements to our outreach resources.

 
NSF-PAR ID:
10489027
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
GENETICS
ISSN:
1943-2631
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    FlyBase (www.flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic‐scale and high‐throughput technologies, means that FlyBase now houses a huge quantity of data. Researchers need to be able to query these data rapidly and intuitively, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This article describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.

    Basic Protocol 1: Using the “Search FlyBase” tab of QuickSearch

    Basic Protocol 2: Using the “Data Class” tab of QuickSearch

    Basic Protocol 3: Using the “References” tab of QuickSearch

    Basic Protocol 4: Using the “Gene Groups” tab of QuickSearch

    Basic Protocol 5: Using the “Pathways” tab of QuickSearch

    Basic Protocol 6: Using the “GO” tab of QuickSearch

    Basic Protocol 7: Using the “Protein Domains” tab of QuickSearch

    Basic Protocol 8: Using the “Expression” tab of QuickSearch

    Basic Protocol 9: Using the “GAL4 etc” tab of QuickSearch

    Basic Protocol 10: Using the “Phenotype” tab of QuickSearch

    Basic Protocol 11: Using the “Human Disease” tab of QuickSearch

    Basic Protocol 12: Using the “Homologs” tab of QuickSearch

    Support Protocol 1: Managing FlyBase hit lists

     
  2. Abstract

    FlyBase (flybase.org) is an essential online database for researchers using Drosophila melanogaster as a model organism, facilitating access to a diverse array of information that includes genetic, molecular, genomic and reagent resources. Here, we describe the introduction of several new features at FlyBase, including Pathway Reports, paralog information, disease models based on orthology, customizable tables within reports and overview displays (‘ribbons’) of expression and disease data. We also describe a variety of recent important updates, including incorporation of a developmental proteome, upgrades to the GAL4 search tab, additional Experimental Tool Reports, migration to JBrowse for genome browsing and improvements to batch queries/downloads and the Fast-Track Your Paper tool.
  3. Abstract

    FlyBase provides a centralized resource for the genetic and genomic data of Drosophila melanogaster. As FlyBase enters our fourth decade of service to the research community, we reflect on our unique aspects and look forward to our continued collaboration with the larger research and model organism communities. In this study, we emphasize the dedicated reports and tools we have constructed to meet the specialized needs of fly researchers but also to facilitate use by other research communities. We also highlight ways that we support the fly community, including an external resources page, help resources, and multiple avenues by which researchers can interact with FlyBase.

     
  4. Abstract

    Motivation

    Accurately representing biological networks in a low-dimensional space, also known as network embedding, is a critical step in network-based machine learning and is carried out widely using node2vec, an unsupervised method based on biased random walks. However, while many networks, including functional gene interaction networks, are dense, weighted graphs, node2vec is fundamentally limited in its ability to use edge weights during the biased random walk generation process, thus under-using all the information in the network.

    Results

    Here, we present node2vec+, a natural extension of node2vec that accounts for edge weights when calculating walk biases and reduces to node2vec in the cases of unweighted graphs or unbiased walks. Using two synthetic datasets, we empirically show that node2vec+ is more robust to additive noise than node2vec in weighted graphs. Then, using genome-scale functional gene networks to solve a wide range of gene function and disease prediction tasks, we demonstrate the superior performance of node2vec+ over node2vec in the case of weighted graphs. Notably, due to the limited amount of training data in the gene classification tasks, graph neural networks such as GCN and GraphSAGE are outperformed by both node2vec and node2vec+.

    Availability and implementation

    The data and code are available on GitHub at https://github.com/krishnanlab/node2vecplus_benchmarks. All additional data underlying this article are available on Zenodo at https://doi.org/10.5281/zenodo.7007164.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  5. Abstract

    The house sparrow (Passer domesticus) is a valuable avian model for studying evolutionary genetics, development, neurobiology, physiology, behavior, and ecology, both in laboratory and field-based settings. The current annotation of the P. domesticus genome available at the Ensembl Rapid Release site is primarily focused on gene set building and lacks functional information. In this study, we present the first comprehensive functional reannotation of the P. domesticus genome using intestinal Illumina RNA sequencing (RNA-Seq) libraries. Our revised annotation provides an expanded view of the genome, encompassing 38592 transcripts compared to the current 23574 transcripts in Ensembl. We also predicted 14717 protein-coding genes, achieving 96.4% completeness for Passeriformes lineage BUSCOs. A substantial improvement in this reannotation is the accurate delineation of untranslated region (UTR) sequences. We identified 82.7% and 93.8% of the transcripts containing 5′- and 3′-UTRs, respectively. These UTR annotations are crucial for understanding post-transcriptional regulatory processes. Our findings underscore the advantages of incorporating additional specific RNA-Seq data into genome annotation, particularly when leveraging fast and efficient data processing capabilities. This functional reannotation enhances our understanding of the P. domesticus genome, providing valuable resources for future investigations in various research fields.

     