Title: Towards a unified data infrastructure to support European and global microbiome research: a call to action
Summary

High‐quality microbiome research relies on the integrity, management and quality of supporting data. Currently, biobanks and culture collections use different formats and approaches to data management. This necessitates a standard data format to underpin research, particularly one in line with the FAIR data principles of findability, accessibility, interoperability and reusability. We address the importance of a unified, coordinated approach that ensures compatibility of data between biobanks and culture collections and also ensures linkage between bioinformatic databases and the wider research community.

 
Award ID(s):
1714276
NSF-PAR ID:
10490401
Publisher / Repository:
Society for Applied Microbiology
Date Published:
Journal Name:
Environmental Microbiology
Volume:
23
Issue:
1
ISSN:
1462-2912
Page Range / eLocation ID:
372 to 375
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Biobanks are important in biomedical and public health research, and future healthcare research relies on their strength and capacity. However, there are financial challenges related to the operation of commercial biobanks and concerns around the commercialization of biobanks. Non-commercial biobanks depend on grant funding to operate and could be valuable to researchers if they can enable access to quality specimens at lower costs. The objective of this study is to estimate the value of specific biobank attributes. We used a rating-based conjoint experiment approach to study how researchers valued handling fee, access, quality, characterization, breadth of consent, access to key endemics, and time taken to fulfil requests. We found that researchers placed the greatest relative importance on the quality of specimens (26%), followed by the characterization of specimens (21%). Researchers with prior experience purchasing biological samples also valued access to key endemic in-country sites (11.6%) and low handling fees (5.5%) in biobanks.

     
  2. Biobanks linked to electronic health records provide rich resources for health‐related research. With improvements in administrative and informatics infrastructure, the availability and utility of data from biobanks have dramatically increased. In this paper, we first aim to characterize the current landscape of available biobanks and to describe specific biobanks, including their place of origin, size, and data types. The development and accessibility of large‐scale biorepositories provide the opportunity to accelerate agnostic searches, expedite discoveries, and conduct hypothesis‐generating studies of disease‐treatment, disease‐exposure, and disease‐gene associations. Rather than designing and implementing a single study focused on a few targeted hypotheses, researchers can potentially use biobanks' existing resources to answer an expanded selection of exploratory questions as quickly as they can analyze them. However, there are many obvious and subtle challenges with the design and analysis of biobank‐based studies. Our second aim is to discuss statistical issues related to biobank research such as study design, sampling strategy, phenotype identification, and missing data. We focus our discussion on biobanks that are linked to electronic health records. Some of the analytic issues are illustrated using data from the Michigan Genomics Initiative and UK Biobank, two biobanks with two different recruitment mechanisms. We summarize the current body of literature for addressing these challenges and discuss some standing open problems. This work complements and extends recent reviews about biobank‐based research and serves as a resource catalog with analytical and practical guidance for statisticians, epidemiologists, and other medical researchers pursuing research using biobanks.

     
  3. Abstract

    Pure bacterial cultures remain essential for detailed experimental and mechanistic studies in microbiome research, and traditional methods to isolate individual bacteria from complex microbial ecosystems are labor-intensive, difficult-to-scale and lack phenotype–genotype integration. Here we describe an open-source high-throughput robotic strain isolation platform for the rapid generation of isolates on demand. We develop a machine learning approach that leverages colony morphology and genomic data to maximize the diversity of microbes isolated and enable targeted picking of specific genera. Application of this platform on fecal samples from 20 humans yields personalized gut microbiome biobanks totaling 26,997 isolates that represented >80% of all abundant taxa. Spatial analysis on >100,000 visually captured colonies reveals cogrowth patterns between Ruminococcaceae, Bacteroidaceae, Coriobacteriaceae and Bifidobacteriaceae families that suggest important microbial interactions. Comparative analysis of 1,197 high-quality genomes from these biobanks shows interesting intra- and interpersonal strain evolution, selection and horizontal gene transfer. This culturomics framework should empower new research efforts to systematize the collection and quantitative analysis of imaging-based phenotypes with high-resolution genomics data for many emerging microbiome studies.

     
  4. Abstract

    Social media data (SMD) offer researchers new opportunities to leverage those data for their work in broad areas such as public opinion, digital culture, labor trends, and public health. The success of efforts to save SMD for reuse by researchers will depend on aligning data management and archiving practices with evolving norms around the capture, use, sharing, and security of datasets. This paper presents an initial foray into understanding how established practices for managing and preserving data should adapt to demands from researchers who use and reuse SMD, and from people who are subjects in SMD. We examine the data management practices of researchers who use SMD through a survey, and we analyze published articles that used data from Twitter. We discuss how researchers describe their data management practices and how these practices may differ from the management of conventional data types. We explore conceptual, technical, and ethical challenges for data archives based on the similarities and differences between SMD and other types of research data, focusing on the social sciences. Finally, we suggest areas where archives may need to revise policies, practices, and services in order to create secure, persistent, and usable collections of SMD.

     
  5. Abstract

    Freezers with biospecimen deposits became biobanks and later were networked at the pan-European level in 2013 under the Biobanking and BioMolecular Resources Research Infrastructure—European Research Infrastructure Consortium (BBMRI-ERIC). Drawing on document analysis about the BBMRI-ERIC and multi-sited fieldwork with biobankers in Spain from a science and technology studies approach, we explore what biobanks are expected to do and become under the BBMRI-ERIC framework, and how infrastructural transitions promote particular transformations in biobanking practices. The primary purpose of biobanks in Europe is presented as being to become mediators in contemporary biomedical research (global sharing nodes) distribution, and distributed nodes of samples and their associated data. We argue that infrastructural transitions are complicated and heterogeneous, giving rise to unattended local concerns on adjusting their practices to fit into the BBMRI-ERIC framework, even for non-members, as the case of Spain illustrates, where “old practices” of collection and storage are questioned. In this article, we aim to encourage qualitative studies to explore the lags between pan-European policies and prospects, different contextual interpretations, and biobanking reconfigurations as an opportunity to explore what that lag is made of (e.g. tensions with “old practices,” unresolved conflicts with the national agendas, reservations on a possible centralization of the biobanking practices by regional biobanks, lack of funding, etc.). Such research could enrich not only policy guidance, but also the understanding of technoscientific infrastructures’ scalability.

     