Title: Preserving US microbe collections sparks future discoveries
Summary

Collections of micro-organisms are a crucial element of life science research infrastructure but are vulnerable to loss and damage caused by natural or man-made disasters, the untimely death or retirement of personnel, or the loss of research funding. Preservation of biological collections has risen in priority due to a new appreciation for discoveries linked to preserved specimens, emerging hurdles to international collecting and decreased funding for new collecting. While many historic collections have been lost, several have been preserved, some with dramatic rescue stories. Rescued microbes have been used for discoveries in areas of health, biotechnology and basic life science. Suggestions for long-term planning for microbial stocks are listed, as well as inducements for long-term preservation.

 
Award ID(s):
1755220 1756217
NSF-PAR ID:
10389434
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of Applied Microbiology
Volume:
129
Issue:
2
ISSN:
1364-5072
Page Range / eLocation ID:
p. 162-174
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Coral reefs are declining worldwide primarily because of bleaching and subsequent mortality resulting from thermal stress. Currently, extensive efforts to engage in more holistic research and restoration endeavors have considerably expanded the techniques applied to examine coral samples. Despite such advances, coral bleaching and restoration studies are often conducted within a specific disciplinary focus, where specimens are collected, preserved, and archived in ways that are not always conducive to further downstream analyses by specialists in other disciplines. This approach may prevent the full utilization of unexpended specimens, leading to siloed research, duplicative efforts, unnecessary loss of additional corals to research endeavors, and overall increased costs. A recent US National Science Foundation-sponsored workshop set out to consolidate our collective knowledge across the disciplines of Omics, Physiology, and Microscopy and Imaging regarding the methods used for coral sample collection, preservation, and archiving. Here, we highlight knowledge gaps and propose some simple steps for collecting, preserving, and archiving coral-bleaching specimens that can increase the impact of individual coral bleaching and restoration studies, as well as foster additional analyses and future discoveries through collaboration. Rapid freezing of samples in liquid nitrogen or placing at −80 °C to −20 °C is optimal for most Omics and Physiology studies with a few exceptions; however, freezing samples removes the potential for many Microscopy and Imaging-based analyses due to the alteration of tissue integrity during freezing. For Microscopy and Imaging, samples are best stored in aldehydes. The use of sterile gloves and receptacles during collection supports the downstream analysis of host-associated bacterial and viral communities which are particularly germane to disease and restoration efforts. 
Across all disciplines, the use of aseptic techniques during collection, preservation, and archiving maximizes the research potential of coral specimens and allows for the greatest number of possible downstream analyses. 
  2. Abstract

    Long‐term datasets are needed to evaluate temporal patterns in wildlife disease burdens, but historical data on parasite abundance are extremely rare. For more than a century, natural history collections have been accumulating fluid‐preserved specimens, which should contain the parasites infecting the host at the time of its preservation. However, before this unique data source can be exploited, we must identify the artifacts that are introduced by the preservation process. Here, we experimentally address whether the preservation process alters the degree to which metazoan parasites are detectable in fluid‐preserved fish specimens when using visual parasite detection techniques. We randomly assigned fish of three species (Gadus chalcogrammus, Thaleichthys pacificus, and Parophrys vetulus) to two treatments. In the first treatment, fish were preserved according to the standard procedures used in ichthyological collections. Immediately after the fluid‐preservation process was complete, we performed parasitological dissection on those specimens. The second treatment was a control, in which fish were dissected without being subjected to the fluid‐preservation process. We compared parasite abundance between the two treatments. Across 298 fish individuals and 59 host–parasite pairs, we found few differences between treatments, with 24 of 27 host–parasite pairs equally abundant between the two treatments. Of these, one pair was significantly more abundant in the preservation treatment than in the control group, and two pairs were significantly less abundant in the preservation treatment than in the control group. Our data suggest that the fluid‐preservation process does not have a substantial effect on the detectability of metazoan parasites. 
This study addresses only the effects of the fixation and preservation process; long‐term experiments are needed to address whether parasite detectability remains unchanged in the months, years, and decades of storage following preservation. If so, ecologists will be able to reconstruct novel, long‐term datasets on parasite diversity and abundance over the past century or more using fluid‐preserved specimens from natural history collections.

     
  3. Abstract

    Museum fluid collections preserve important biological specimens for study. Tissues are often fixed in 10% buffered formalin to halt metabolic activities and transferred to a solution of ethanol for long‐term storage. This process, however, forces water from the tissues and has been shown to alter the morphology of preserved specimens in ways that may influence the biological interpretation of results. The degree to which fluid preservation alters morphology is linked to multiple biological factors, such as tissue size and composition, and should therefore be examined prior to functional analysis. This study is undertaken as part of a more inclusive examination of mammalian volar morphology. A sample of five adult male and five adult female rats (Rattus norvegicus) was utilized to evaluate longitudinal changes in the dimensions of the volar pads across fixation in 10% buffered formalin and preservation in 70% ethanol for 1 year. No significant changes to the measured dimensions of the rat volar pads were present across stages of fixation and preservation, and no significant interactions of specimen size or sex were noted. These findings indicate that small mammalian volar pads that have been fixed in 10% buffered formalin and stored in 70% ethanol are appropriate for morphological study using the measurements described here without corrective algorithms. This finding is rare among preservation studies but highlights the variability of tissue behavior during chemical preservation and the necessity of preliminary investigations of preservation artifacts. Concurrence here between the preserved and unpreserved samples is likely related to the anhydrous nature of the volar pads and the supporting skeletal structure, and their confined position between major joints of the hands and feet.

     
  4. Obeid, Iyad ; Picone, Joseph ; Selesnick, Ivan (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing a large open source database of high-resolution digital pathology images known as the Temple University Digital Pathology Corpus (TUDP) [1]. Our long-term goal is to release one million images. We expect to release the first 100,000 image corpus by December 2020. The data is being acquired at the Department of Pathology at Temple University Hospital (TUH) using a Leica Biosystems Aperio AT2 scanner [2] and consists entirely of clinical pathology images. More information about the data and the project can be found in Shawki et al. [3]. We currently have a National Science Foundation (NSF) planning grant [4] to explore how best the community can leverage this resource. One goal of this poster presentation is to stimulate community-wide discussions about this project and determine how this valuable resource can best meet the needs of the public. The computing infrastructure required to support this database is extensive [5] and includes two HIPAA-secure computer networks, dual petabyte file servers, and Aperio’s eSlide Manager (eSM) software [6]. We currently have digitized over 50,000 slides from 2,846 patients and 2,942 clinical cases. There is an average of 12.4 slides per patient and 10.5 slides per case with one report per case. 
The data is organized by tissue type as shown below:

    Filenames:
        tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_0a001_00123456_lvl0001_s000.svs
        tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_00123456.docx

    Explanation:
        tudp: root directory of the corpus
        v1.0.0: version number of the release
        svs: the image data type
        gastro: the type of tissue
        000001: six-digit sequence number used to control directory complexity
        00123456: 8-digit patient MRN
        2015_03_05: the date the specimen was captured
        0s15_12345: the clinical case name
        0s15_12345_0a001_00123456_lvl0001_s000.svs: the actual image filename consisting of a repeat of the case name, a site code (e.g., 0a001), the type and depth of the cut (e.g., lvl0001) and a token number (e.g., s000)
        0s15_12345_00123456.docx: the filename for the corresponding case report

We currently recognize fifteen tissue types in the first installment of the corpus. The raw image data is stored in Aperio’s “.svs” format, which is a multi-layered compressed JPEG format [3,7]. Pathology reports containing a summary of how a pathologist interpreted the slide are also provided in a flat text file format. A more complete summary of the demographics of this pilot corpus will be presented at the conference. Another goal of this poster presentation is to share our experiences with the larger community since many of these details have not been adequately documented in scientific publications. There are quite a few obstacles in collecting this data that have slowed down the process and need to be discussed publicly. Our backlog of slides dates back to 1997, meaning there are a lot that need to be sifted through and discarded for peeling or cracking. Additionally, during scanning a slide can get stuck, stalling a scan session for hours, resulting in a significant loss of productivity. 
Over the past two years, we have accumulated significant experience with how to scan a diverse inventory of slides using the Aperio AT2 high-volume scanner. We have been working closely with the vendor to resolve many problems associated with the use of this scanner for research purposes. This scanning project began in January of 2018 when the scanner was first installed. The scanning process was slow at first since there was a learning curve with how the scanner worked and how to obtain samples from the hospital. From its start date until May of 2019, ~20,000 slides were scanned. In the past 6 months from May to November we have tripled that number and now hold ~60,000 slides in our database. This dramatic increase in productivity was due to additional undergraduate staff members and an emphasis on efficient workflow. The Aperio AT2 scans 400 slides a day, requiring at least eight hours of scan time. The efficiency of these scans can vary greatly. When our team first started, approximately 5% of slides failed the scanning process due to focal point errors. We have been able to reduce that to 1% through a variety of means: (1) best practices regarding daily and monthly recalibrations, (2) tweaking the software such as the tissue finder parameter settings, and (3) experience with how to clean and prep slides so they scan properly. Nevertheless, this is not a completely automated process, making it very difficult to reach our production targets. With a staff of three undergraduate workers spending a total of 30 hours per week, we find it difficult to scan more than 2,000 slides per week using a single scanner (400 slides per night x 5 nights per week). The main limitation in achieving this level of production is the lack of a completely automated scanning process; it takes a couple of hours to sort, clean and load slides. We have streamlined all other aspects of the workflow required to database the scanned slides so that there are no additional bottlenecks. 
To bridge the gap between hospital operations and research, we are using Aperio’s eSM software. Our goal is to provide pathologists access to high quality digital images of their patients’ slides. eSM is a secure website that holds the images with their metadata labels, patient report, and path to where the image is located on our file server. Although eSM includes significant infrastructure to import slides into the database using barcodes, TUH does not currently support barcode use. Therefore, we manage the data using a mixture of Python scripts and manual import functions available in eSM. The database and associated tools are based on proprietary formats developed by Aperio, making this another important point of community-wide discussion on how best to disseminate such information. Our near-term goal for the TUDP Corpus is to release 100,000 slides by December 2020. We hope to continue data collection over the next decade until we reach one million slides. We are creating two pilot corpora using the first 50,000 slides we have collected. The first corpus consists of 500 slides with a marker stain and another 500 without it. This set was designed to let people debug their basic deep learning processing flow on these high-resolution images. We discuss our preliminary experiments on this corpus and the challenges in processing these high-resolution images using deep learning in [3]. We are able to achieve a mean sensitivity of 99.0% for slides with pen marks, and 98.9% for slides without marks, using a multistage deep learning algorithm. While this dataset was very useful in initial debugging, we are in the midst of creating a new, more challenging pilot corpus using actual tissue samples annotated by experts. The task will be to detect ductal carcinoma in situ (DCIS) or invasive breast cancer tissue. There will be approximately 1,000 images per class in this corpus. 
Based on the number of features annotated, we can train on a two class problem of DCIS or benign, or increase the difficulty by increasing the classes to include DCIS, benign, stroma, pink tissue, non-neoplastic etc. Those interested in the corpus or in participating in community-wide discussions should join our listserv, nedc_tuh_dpath@googlegroups.com, to be kept informed of the latest developments in this project. You can learn more from our project website: https://www.isip.piconepress.com/projects/nsf_dpath. 
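The directory and filename convention described above can be illustrated with a short Python sketch. This is not the project's own tooling: the function name `parse_tudp_image_path`, the `TudpImagePath` record, and its field names are hypothetical, and the assumption that the image filename repeats the patient MRN between the site code and the cut level is taken only from the single example path shown in the abstract.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class TudpImagePath:
    """Hypothetical record for the fields encoded in one corpus image path."""
    version: str    # release version, e.g. "v1.0.0"
    data_type: str  # image data type, e.g. "svs"
    tissue: str     # tissue type, e.g. "gastro"
    sequence: str   # six-digit sequence number
    mrn: str        # 8-digit patient MRN
    captured: date  # date the specimen was captured
    case: str       # clinical case name, e.g. "0s15_12345"
    site: str       # site code, e.g. "0a001"
    level: str      # type and depth of the cut, e.g. "lvl0001"
    token: str      # token number, e.g. "s000"


def parse_tudp_image_path(path: str) -> TudpImagePath:
    """Split an image path into its fields (layout assumed from the example)."""
    parts = PurePosixPath(path).parts
    if len(parts) != 9 or parts[0] != "tudp":
        raise ValueError(f"unexpected path layout: {path!r}")
    _, version, data_type, tissue, sequence, mrn, day, case, filename = parts

    # The filename starts with a repeat of the case name.
    stem = PurePosixPath(filename).stem
    prefix = case + "_"
    if not stem.startswith(prefix):
        raise ValueError("filename does not repeat the case name")

    # Assumed remainder: <site>_<mrn>_<level>_<token>
    fields = stem[len(prefix):].split("_")
    if len(fields) != 4:
        raise ValueError("filename does not have the assumed four fields")
    site, mrn_in_name, level, token = fields
    if mrn_in_name != mrn:
        raise ValueError("MRN in filename does not match directory")

    year, month, day_of_month = (int(x) for x in day.split("_"))
    return TudpImagePath(version, data_type, tissue, sequence, mrn,
                         date(year, month, day_of_month), case,
                         site, level, token)


example = parse_tudp_image_path(
    "tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/"
    "0s15_12345_0a001_00123456_lvl0001_s000.svs"
)
print(example.tissue, example.captured.isoformat(), example.site)
```

Validating the repeated case name and MRN, rather than simply splitting on underscores, would let a script of this kind flag misfiled images early; whether the actual corpus tools perform such checks is not stated in the abstract.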
  5. Abstract

    It has become common for researchers to make their data publicly available to meet the data management and accessibility requirements of funding agencies and scientific publishers. However, many researchers face the challenge of determining what data to preserve and share and where to preserve and share those data. This can be especially challenging for those who run dynamical models, which can produce complex, voluminous data outputs, and have not considered what outputs may need to be preserved and shared as part of the project design. This manuscript presents findings from the NSF EarthCube Research Coordination Network project titled “What About Model Data? Best Practices for Preservation and Replicability” (https://modeldatarcn.github.io/). These findings suggest that if the primary goal of sharing data is to communicate knowledge, most simulation-based research projects only need to preserve and share selected model outputs along with the full simulation experiment workflow. One major result of this project has been the development of a rubric, designed to provide guidance for making decisions on what simulation output needs to be preserved and shared in trusted community repositories to achieve the goal of knowledge communication. This rubric, along with use cases for selected projects, provides scientists with guidance on data accessibility requirements in the planning process of research, allowing for more thoughtful development of data management plans and funding requests. Additionally, publishers can refer to this rubric for what is expected in terms of data accessibility for publication.

     