Title: Terrestrial Parasite Tracker indexed biotic interactions and review summary
Abstract
<p>PLEASE CONTACT AUTHORS IF YOU CONTRIBUTE AND WOULD LIKE TO BE LISTED AS A CO-AUTHOR. (this message will be removed some time weeks/months after the first publication)</p> <p>Terrestrial Parasite
Tracker indexed biotic interactions and review summary.</p> <p>The Terrestrial Parasite Tracker (TPT) project began in 2019 and is funded by the National Science Foundation to mobilize data from vector and ectoparasite collections to data aggregators (e.g., iDigBio, GBIF) to help build a comprehensive picture of arthropod host-association evolution, distributions, and the ecological interactions of disease vectors, which will assist scientists, educators, land managers, and policy makers. Arthropod parasites often are important to human and wildlife health and safety as vectors of pathogens, and it is critical to digitize these specimens so that they, and their biotic interaction data, will be available to help understand and predict the spread of human and wildlife disease.</p> <p>This data publication contains versioned TPT-associated datasets and related data products that were tracked, reviewed and indexed by Global Biotic Interactions (GloBI) and associated tools. GloBI provides open access to species interaction data (e.g., predator-prey, pollinator-plant, pathogen-host, parasite-host) by combining existing open datasets using open source software.</p> <p>If you have questions or comments about this publication, please open an issue at https://github.com/ParasiteTracker/tpt-reporting or contact the authors by email.</p> <p>Funding:<br /> The creation of this archive was made possible by the National Science Foundation award "Collaborative Research: Digitization TCN: Digitizing collections to trace parasite-host associations and predict the spread of vector-borne disease," Award numbers DBI:1901932 and DBI:1901926</p> <p>References:<br /> Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. 
https://doi.org/10.1016/j.ecoinf.2014.08.005.</p> <p>GloBI Data Review Report</p> <p>Datasets under review:<br /> - University of Michigan Museum of Zoology Insect Division. Full Database Export 2020-11-20 provided by Erika Tucker and Barry Oconner. accessed via https://github.com/EMTuckerLabUMMZ/ummzi/archive/6731357a377e9c2748fc931faa2ff3dc0ce3ea7a.zip on 2022-06-24T14:02:48.801Z<br /> - Academy of Natural Sciences Entomology Collection for the Parasite Tracker Project accessed via https://github.com/globalbioticinteractions/ansp-para/archive/5e6592ad09ec89ba7958266ad71ec9d5d21d1a44.zip on 2022-06-24T14:04:22.091Z<br /> - Bernice Pauahi Bishop Museum, J. Linsley Gressitt Center for Research in Entomology accessed via https://github.com/globalbioticinteractions/bpbm-ent/archive/c085398dddd36f8a1169b9cf57de2a572229341b.zip on 2022-06-24T14:04:37.692Z<br /> - Texas A&M University, Biodiversity Teaching and Research Collections accessed via https://github.com/globalbioticinteractions/brtc-para/archive/f0a718145b05ed484c4d88947ff712d5f6395446.zip on 2022-06-24T14:06:40.154Z<br /> - Brigham Young University Arthropod Museum accessed via https://github.com/globalbioticinteractions/byu-byuc/archive/4a609ac6a9a03425e2720b6cdebca6438488f029.zip on 2022-06-24T14:06:51.420Z<br /> - California Academy of Sciences Entomology accessed via https://github.com/globalbioticinteractions/cas-ent/archive/562aea232ec74ab615f771239451e57b057dc7c0.zip on 2022-06-24T14:07:16.371Z<br /> - Clemson University Arthropod Collection accessed via https://github.com/globalbioticinteractions/cu-cuac/archive/6cdcbbaa4f7cec8e1eac705be3a999bc5259e00f.zip on 2022-06-24T14:07:40.925Z<br /> - Denver Museum of Nature and Science (DMNS) Parasite specimens (DMNS:Para) accessed via https://github.com/globalbioticinteractions/dmns-para/archive/a037beb816226eb8196533489ee5f98a6dfda452.zip on 2022-06-24T14:08:00.730Z<br /> - Field Museum of Natural History IPT accessed via 
https://github.com/globalbioticinteractions/fmnh/archive/6bfc1b7e46140e93f5561c4e837826204adb3c2f.zip on 2022-06-24T14:18:51.995Z<br /> - Illinois Natural History Survey Insect Collection accessed via https://github.com/globalbioticinteractions/inhs-insects/archive/38692496f590577074c7cecf8ea37f85d0594ae1.zip on 2022-06-24T14:19:37.563Z<br /> - UMSP / University of Minnesota / University of Minnesota Insect Collection accessed via https://github.com/globalbioticinteractions/min-umsp/archive/3f1b9d32f947dcb80b9aaab50523e097f0e8776e.zip on 2022-06-24T14:20:27.232Z<br /> - Milwaukee Public Museum Biological Collections Data Portal accessed via https://github.com/globalbioticinteractions/mpm/archive/9f44e99c49ec5aba3f8592cfced07c38d3223dcd.zip on 2022-06-24T14:20:46.185Z<br /> - Museum for Southern Biology (MSB) Parasite Collection accessed via https://github.com/globalbioticinteractions/msb-para/archive/178a0b7aa0a8e14b3fe953e770703fe331eadacc.zip on 2022-06-24T15:16:07.223Z<br /> - The Albert J. 
Cook Arthropod Research Collection accessed via https://github.com/globalbioticinteractions/msu-msuc/archive/38960906380443bd8108c9e44aeff4590d8d0b50.zip on 2022-06-24T16:09:40.702Z<br /> - Ohio State University Acarology Laboratory accessed via https://github.com/globalbioticinteractions/osal-ar/archive/876269d66a6a94175dbb6b9a604897f8032b93dd.zip on 2022-06-24T16:10:00.281Z<br /> - Frost Entomological Museum, Pennsylvania State University accessed via https://github.com/globalbioticinteractions/psuc-ento/archive/30b1f96619a6e9f10da18b42fb93ff22cc4f72e2.zip on 2022-06-24T16:10:07.741Z<br /> - Purdue Entomological Research Collection accessed via https://github.com/globalbioticinteractions/pu-perc/archive/e0909a7ca0a8df5effccb288ba64b28141e388ba.zip on 2022-06-24T16:10:26.654Z<br /> - Texas A&M University Insect Collection accessed via https://github.com/globalbioticinteractions/tamuic-ent/archive/f261a8c192021408da67c39626a4aac56e3bac41.zip on 2022-06-24T16:10:58.496Z<br /> - University of California Santa Barbara Invertebrate Zoology Collection accessed via https://github.com/globalbioticinteractions/ucsb-izc/archive/825678ad02df93f6d4469f9d8b7cc30151b9aa45.zip on 2022-06-24T16:12:29.854Z<br /> - University of Hawaii Insect Museum accessed via https://github.com/globalbioticinteractions/uhim/archive/53fa790309e48f25685e41ded78ce6a51bafde76.zip on 2022-06-24T16:12:41.408Z<br /> - University of New Hampshire Collection of Insects and other Arthropods UNHC-UNHC accessed via https://github.com/globalbioticinteractions/unhc/archive/f72575a72edda8a4e6126de79b4681b25593d434.zip on 2022-06-24T16:12:59.500Z<br /> - Scott L. Gardner and Gabor R. Racz (2021). University of Nebraska State Museum - Parasitology. Harold W. Manter Laboratory of Parasitology. University of Nebraska State Museum. 
accessed via https://github.com/globalbioticinteractions/unl-nsm/archive/6bcd8aec22e4309b7f4e8be1afe8191d391e73c6.zip on 2022-06-24T16:13:06.914Z<br /> - Data were obtained from specimens belonging to the United States National Museum of Natural History (USNM), Smithsonian Institution, Washington DC and digitized by the Walter Reed Biosystematics Unit (WRBU). accessed via https://github.com/globalbioticinteractions/usnmentflea/archive/ce5cb1ed2bbc13ee10062b6f75a158fd465ce9bb.zip on 2022-06-24T16:13:38.013Z<br /> - US National Museum of Natural History Ixodes Records accessed via https://github.com/globalbioticinteractions/usnm-ixodes/archive/c5fcd5f34ce412002783544afb628a33db7f47a6.zip on 2022-06-24T16:13:45.666Z<br /> - Price Institute of Parasite Research, School of Biological Sciences, University of Utah accessed via https://github.com/globalbioticinteractions/utah-piper/archive/43da8db550b5776c1e3d17803831c696fe9b8285.zip on 2022-06-24T16:13:54.724Z<br /> - University of Wisconsin Stevens Point, Stephen J. Taft Parasitological Collection accessed via https://github.com/globalbioticinteractions/uwsp-para/archive/f9d0d52cd671731c7f002325e84187979bca4a5b.zip on 2022-06-24T16:14:04.745Z<br /> - Giraldo-Calderón, G. I., Emrich, S. J., MacCallum, R. M., Maslen, G., Dialynas, E., Topalis, P., … Lawson, D. (2015). VectorBase: an updated bioinformatics resource for invertebrate vectors and other organisms related with human diseases. Nucleic acids research, 43(Database issue), D707–D713. doi:10.1093/nar/gku1117. 
accessed via https://github.com/globalbioticinteractions/vectorbase/archive/00d6285cd4e9f4edd18cb2778624ab31b34b23b8.zip on 2022-06-24T16:14:11.965Z<br /> - WIRC / University of Wisconsin Madison WIS-IH / Wisconsin Insect Research Collection accessed via https://github.com/globalbioticinteractions/wis-ih-wirc/archive/34162b86c0ade4b493471543231ae017cc84816e.zip on 2022-06-24T16:14:29.743Z<br /> - Yale University Peabody Museum Collections Data Portal accessed via https://github.com/globalbioticinteractions/yale-peabody/archive/43be869f17749d71d26fc820c8bd931d6149fe8e.zip on 2022-06-24T16:23:29.289Z</p> <p>Generated on:<br /> 2022-06-24</p> <p>by:<br /> GloBI's Elton 0.12.4 <br /> (see https://github.com/globalbioticinteractions/elton).</p> <p>Note that all files ending with .tsv are files formatted <br /> as UTF8 encoded tab-separated values files.</p> <p>https://www.iana.org/assignments/media-types/text/tab-separated-values</p> <p><br /> Included in this review archive are:</p> <p>README:<br /> This file.</p> <p>review_summary.tsv:<br /> Summary across all reviewed collections of total number of distinct review comments.</p> <p>review_summary_by_collection.tsv:<br /> Summary by reviewed collection of total number of distinct review comments.</p> <p>indexed_interactions_by_collection.tsv: <br /> Summary of number of indexed interaction records by institutionCode and collectionCode.</p> <p>review_comments.tsv.gz:<br /> All review comments by collection.</p> <p>indexed_interactions_full.tsv.gz:<br /> All indexed interactions for all reviewed collections.</p> <p>indexed_interactions_simple.tsv.gz:<br /> All indexed interactions for all reviewed collections selecting only sourceInstitutionCode, sourceCollectionCode, sourceCatalogNumber, sourceTaxonName, interactionTypeName and targetTaxonName.</p> <p>datasets_under_review.tsv:<br /> Details on the datasets under review.</p> <p>elton.jar: <br /> Program used to update datasets and generate the review reports and 
associated indexed interactions.</p> <p>datasets.zip:<br /> Source datasets used by elton.jar in the process of executing the generate_report.sh script.</p> <p>generate_report.sh:<br /> Program used to generate the report.</p> <p>generate_report.log:<br /> Log file generated as part of running the generate_report.sh script.<br /> </p>
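The tab-separated files described in this README can be read with standard tooling. The following is a minimal sketch, assuming that indexed_interactions_simple.tsv.gz carries a header row with the six column names listed above; the header row is an assumption, and the sample rows are hypothetical values made up for illustration, not records from the archive. It counts records per interactionTypeName.

```python
import csv
import gzip
import io
from collections import Counter

# Column names as listed above for indexed_interactions_simple.tsv.gz.
COLUMNS = [
    "sourceInstitutionCode",
    "sourceCollectionCode",
    "sourceCatalogNumber",
    "sourceTaxonName",
    "interactionTypeName",
    "targetTaxonName",
]


def count_interaction_types(stream):
    """Count rows per interactionTypeName in a TSV text stream with a header row."""
    # QUOTE_NONE: treat quote characters as ordinary text and split on tabs only.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(row["interactionTypeName"] for row in reader)


def count_from_gzip(path):
    """Same count, reading a gzip-compressed, UTF-8 encoded TSV file."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return count_interaction_types(handle)


# Hypothetical rows for illustration only; they are not taken from the archive.
SAMPLE_ROWS = [
    ["XX", "Para", "0001", "Ixodes scapularis", "hasHost", "Peromyscus leucopus"],
    ["XX", "Para", "0002", "Ixodes scapularis", "hasHost", "Odocoileus virginianus"],
    ["YY", "Ent", "0003", "Pulex irritans", "parasiteOf", "Homo sapiens"],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in [COLUMNS] + SAMPLE_ROWS) + "\n"

sample_counts = count_interaction_types(io.StringIO(SAMPLE_TSV))
print(sample_counts.most_common())
```

With a downloaded copy of the archive, `count_from_gzip("indexed_interactions_simple.tsv.gz")` would apply the same count to the real file, provided the header assumption holds.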
Cheadle Center for Biodiversity and Ecological Restoration, University of
Abstract
<p>A biodiversity dataset graph: UCSB-IZC</p> <p>The intended use of this archive is to facilitate (meta-)analysis of the UC Santa Barbara Invertebrate Zoology Collection (UCSB-IZC). UCSB-IZC is a natural history collection of invertebrate zoology at the Cheadle Center for Biodiversity and Ecological Restoration, University of California Santa Barbara.</p> <p>This dataset provides versioned snapshots of the UCSB-IZC network as tracked by Preston [2,3] between 2021-10-08 and 2021-11-04 using [preston track "https://api.gbif.org/v1/occurrence/search/?datasetKey=d6097f75-f99e-4c2a-b8a5-b0fc213ecbd0"].</p> <p>This archive contains 14349 images related to 32533 occurrence/specimen records. See the included sample-image.jpg and its associated metadata, sample-image.json [4].</p> <p>The images were counted using:</p> <p>$ preston cat hash://sha256/80c0f5fc598be1446d23c95141e87880c9e53773cb2e0b5b54cb57a8ea00b20c\<br /> | grep -o -P ".*depict"\<br /> | sort\<br /> | uniq\<br /> | wc -l</p> <p>And the occurrences were counted using:</p> <p>$ preston cat hash://sha256/80c0f5fc598be1446d23c95141e87880c9e53773cb2e0b5b54cb57a8ea00b20c\<br /> | grep -o -P "occurrence/([0-9])+"\<br /> | sort\<br /> | uniq\<br /> | wc -l</p> <p>The archive consists of 256 individual parts (e.g., preston-00.tar.gz, preston-01.tar.gz, ...) to allow for parallel file downloads. The archive contains three types of files: index files, provenance files and data files. Only two index and provenance files are included; these have also been added individually to this dataset publication. Index files provide a way to link provenance files in time to establish
a versioning mechanism.</p> <p>To retrieve and verify the downloaded UCSB-IZC biodiversity dataset graph, first download preston-*.tar.gz. Then, extract the archives into a "data" folder. Alternatively, you can use the Preston [2,3] command-line tool to "clone" this dataset using:</p> <p>$ java -jar preston.jar clone --remote https://archive.org/download/preston-ucsb-izc/data.zip/,https://zenodo.org/record/5557670/files,https://zenodo.org/record/5557670/files/5660088</p> <p>After that, verify the index of the archive by reproducing the following provenance log history:</p> <p>$ java -jar preston.jar history<br /> <urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/d5eb492d3e0304afadcc85f968de1e23042479ad670a5819cee00f2c2c277f36> .<br /> <hash://sha256/80c0f5fc598be1446d23c95141e87880c9e53773cb2e0b5b54cb57a8ea00b20c> <http://purl.org/pav/previousVersion> <hash://sha256/d5eb492d3e0304afadcc85f968de1e23042479ad670a5819cee00f2c2c277f36> .</p> <p>To check the integrity of the extracted archive, confirm that the command "preston verify" produces lines like those shown below, each including "CONTENT_PRESENT_VALID_HASH". 
Depending on hardware capacity, this may take a while.</p> <p>$ java -jar preston.jar verify<br /> hash://sha256/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c file:/home/jhpoelen/ucsb-izc/data/ce/1d/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c OK CONTENT_PRESENT_VALID_HASH 66438 hash://sha256/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c<br /> hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 file:/home/jhpoelen/ucsb-izc/data/f6/8d/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 OK CONTENT_PRESENT_VALID_HASH 4093 hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844<br /> hash://sha256/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef file:/home/jhpoelen/ucsb-izc/data/3e/70/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef OK CONTENT_PRESENT_VALID_HASH 5746 hash://sha256/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef<br /> hash://sha256/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b file:/home/jhpoelen/ucsb-izc/data/99/58/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b OK CONTENT_PRESENT_VALID_HASH 6147 hash://sha256/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b</p> <p>Note that a copy of the java program "preston", preston.jar, is included in this publication. The program runs on java 8+ virtual machine using "java -jar preston.jar", or in short "preston".</p> <p>Files in this data publication:</p> <p>--- start of file descriptions ---</p> <p>-- description of archive and its contents (this file) --<br /> README</p> <p>-- executable java jar containing preston [2,3] v0.3.1. 
--<br /> preston.jar</p> <p>-- preston archive containing UCSB-IZC (meta-)data/image files, associated provenance logs and a provenance index --<br /> preston-[00-ff].tar.gz</p> <p>-- individual provenance index files --<br /> 2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a</p> <p>-- example image and meta-data --<br /> sample-image.jpg (with hash://sha256/916ba5dc6ad37a3c16634e1a0e3d2a09969f2527bb207220e3dbdbcf4d6b810c)<br /> sample-image.json (with hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844)</p> <p>--- end of file descriptions ---</p> <p><br /> References</p> <p>[1] Cheadle Center for Biodiversity and Ecological Restoration (2021). University of California Santa Barbara Invertebrate Zoology Collection. Occurrence dataset https://doi.org/10.15468/w6hvhv accessed via GBIF.org on 2021-11-04 as indexed by the Global Biodiversity Informatics Facility (GBIF) with provenance hash://sha256/d5eb492d3e0304afadcc85f968de1e23042479ad670a5819cee00f2c2c277f36 hash://sha256/80c0f5fc598be1446d23c95141e87880c9e53773cb2e0b5b54cb57a8ea00b20c.<br /> [2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .<br /> [3] MJ Elliott, JH Poelen, JAB Fortes (2020). Toward Reliable Biodiversity Dataset References. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2020.101132<br /> [4] Cheadle Center for Biodiversity and Ecological Restoration (2021). University of California Santa Barbara Invertebrate Zoology Collection. Occurrence dataset https://doi.org/10.15468/w6hvhv accessed via GBIF.org on 2021-10-08. https://www.gbif.org/occurrence/3323647301 . hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 hash://sha256/916ba5dc6ad37a3c16634e1a0e3d2a09969f2527bb207220e3dbdbcf4d6b810c</p>
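The two counting pipelines quoted in this README (grep -o, sort, uniq, wc -l) count the distinct substrings that match a pattern. As a rough Python equivalent, under the assumption that the provenance log is available as lines of text, the sketch below reuses the same regular expressions. The sample lines are hypothetical and only mimic the shape of a provenance statement; they are not taken from the archive.

```python
import re

# Same expressions as in the grep -P commands quoted above.
IMAGE_PATTERN = re.compile(r".*depict")
OCCURRENCE_PATTERN = re.compile(r"occurrence/([0-9])+")


def count_distinct_matches(lines, pattern):
    """Count distinct matched substrings, like `grep -o | sort | uniq | wc -l`."""
    distinct = set()
    for line in lines:
        # grep -o prints every non-overlapping match found on a line.
        for match in pattern.finditer(line):
            distinct.add(match.group(0))
    return len(distinct)


# Hypothetical lines for illustration; real input would come from `preston cat`.
SAMPLE_LINES = [
    '<https://example.org/occurrence/1001> <https://example.org/seenAt> "a" .',
    '<https://example.org/occurrence/1001> <https://example.org/seenAt> "b" .',
    '<https://example.org/occurrence/1002> <https://example.org/seenAt> "c" .',
    '<https://example.org/occurrence/search/> <https://example.org/seenAt> "d" .',
]

distinct_occurrences = count_distinct_matches(SAMPLE_LINES, OCCURRENCE_PATTERN)
print(distinct_occurrences)
```

The last sample line does not contribute to the count because the pattern requires at least one digit directly after "occurrence/". Using a set keeps one entry per distinct match, which is the step that sort and uniq perform in the shell version.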
Other
This work is funded in part by grants NSF OAC 1839201 and NSF DBI 2102006 from the National Science Foundation.
Cheadle Center for Biodiversity and Ecological Restoration, University of
Abstract
<p>A biodiversity dataset graph: UCSB-IZC</p> <p>The intended use of this archive is to facilitate (meta-)analysis of the UC Santa Barbara Invertebrate Zoology Collection (UCSB-IZC). UCSB-IZC is a natural history collection of invertebrate zoology at the Cheadle Center for Biodiversity and Ecological Restoration, University of California Santa Barbara.</p> <p>This dataset provides versioned snapshots of the UCSB-IZC network as tracked by Preston [2,3] on 2021-10-08 using [preston track "https://api.gbif.org/v1/occurrence/search/?datasetKey=d6097f75-f99e-4c2a-b8a5-b0fc213ecbd0"].</p> <p>This archive contains 14137 images related to 33730 occurrence/specimen records. See the included sample-image.jpg and its associated metadata, sample-image.json [4].</p> <p>The archive consists of 256 individual parts (e.g., preston-00.tar.gz, preston-01.tar.gz, ...) to allow for parallel file downloads. The archive contains three types of files: index files, provenance files and data files. Only two index and provenance files are included; these have also been added individually to this dataset publication. Index files provide a way to link provenance files in time to establish a versioning mechanism.</p> <p>To retrieve and verify the downloaded UCSB-IZC biodiversity dataset graph, first download preston-*.tar.gz. Then, extract the archives into a "data" folder. Alternatively, you can use the Preston [2,3] command-line tool to "clone" this dataset using:</p> <p>$ java -jar preston.jar clone --remote https://archive.org/download/preston-ucsb-izc/data.zip/,https://zenodo.org/record/5557670/files</p> <p>After that, verify the index of the archive
by reproducing the following provenance log history:</p> <p>$ java -jar preston.jar history<br /> <urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/d5eb492d3e0304afadcc85f968de1e23042479ad670a5819cee00f2c2c277f36> .</p> <p>To check the integrity of the extracted archive, confirm that the command "preston verify" produces lines like those shown below, each including "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.</p> <p>$ java -jar preston.jar verify<br /> hash://sha256/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c file:/home/jhpoelen/ucsb-izc/data/ce/1d/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c OK CONTENT_PRESENT_VALID_HASH 66438 hash://sha256/ce1dc2468dfb1706a6f972f11b5489dc635bdcf9c9fd62a942af14898c488b2c<br /> hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 file:/home/jhpoelen/ucsb-izc/data/f6/8d/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 OK CONTENT_PRESENT_VALID_HASH 4093 hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844<br /> hash://sha256/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef file:/home/jhpoelen/ucsb-izc/data/3e/70/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef OK CONTENT_PRESENT_VALID_HASH 5746 hash://sha256/3e70b7adc1a342e5551b598d732c20b96a0102bb1e7f42cfc2ae8a2c4227edef<br /> hash://sha256/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b file:/home/jhpoelen/ucsb-izc/data/99/58/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b OK CONTENT_PRESENT_VALID_HASH 6147 hash://sha256/995806159ae2fdffdc35eef2a7eccf362cb663522c308aa6aa52e2faca8bb25b</p> <p>Note that a copy of the java program "preston", preston.jar, is included in this publication. 
The program runs on java 8+ virtual machine using "java -jar preston.jar", or in short "preston".</p> <p>Files in this data publication:</p> <p>--- start of file descriptions ---</p> <p>-- description of archive and its contents (this file) --<br /> README</p> <p>-- executable java jar containing preston [2,3] v0.3.1. --<br /> preston.jar</p> <p>-- preston archive containing UCSB-IZC (meta-)data/image files, associated provenance logs and a provenance index --<br /> preston-[00-ff].tar.gz</p> <p>-- individual provenance index files --<br /> 2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a</p> <p>-- example image and meta-data --<br /> sample-image.jpg (with hash://sha256/916ba5dc6ad37a3c16634e1a0e3d2a09969f2527bb207220e3dbdbcf4d6b810c)<br /> sample-image.json (with hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844)</p> <p>--- end of file descriptions ---</p> <p><br /> References</p> <p>[1] Cheadle Center for Biodiversity and Ecological Restoration (2021). University of California Santa Barbara Invertebrate Zoology Collection. Occurrence dataset https://doi.org/10.15468/w6hvhv accessed via GBIF.org on 2021-10-08 as indexed by the Global Biodiversity Informatics Facility (GBIF) with provenance hash://sha256/d5eb492d3e0304afadcc85f968de1e23042479ad670a5819cee00f2c2c277f36.<br /> [2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .<br /> [3] MJ Elliott, JH Poelen, JAB Fortes (2020). Toward Reliable Biodiversity Dataset References. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2020.101132<br /> [4] Cheadle Center for Biodiversity and Ecological Restoration (2021). University of California Santa Barbara Invertebrate Zoology Collection. Occurrence dataset https://doi.org/10.15468/w6hvhv accessed via GBIF.org on 2021-10-08. https://www.gbif.org/occurrence/3323647301 . 
hash://sha256/f68d489a9275cb9d1249767244b594c09ab23fd00b82374cb5877cabaa4d0844 hash://sha256/916ba5dc6ad37a3c16634e1a0e3d2a09969f2527bb207220e3dbdbcf4d6b810c</p>
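The "preston verify" output shown in this README pairs each hash://sha256/ identifier with a file stored under data/, in two nested folders named after the first four hexadecimal characters of the hash. The sketch below illustrates that kind of content check in Python. It is not Preston's implementation: the folder layout is inferred from the file paths in the output above, and the function names and demo content are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

PREFIX = "hash://sha256/"


def content_path(data_dir, hash_uri):
    """Map hash://sha256/<hex> to <data_dir>/<hex[0:2]>/<hex[2:4]>/<hex>.

    Layout inferred from the verify output quoted above (an assumption).
    """
    if not hash_uri.startswith(PREFIX):
        raise ValueError(f"unsupported content identifier: {hash_uri}")
    hex_digest = hash_uri[len(PREFIX):]
    return Path(data_dir) / hex_digest[:2] / hex_digest[2:4] / hex_digest


def verify_content(data_dir, hash_uri):
    """Return True if the stored file exists and its SHA-256 matches the identifier."""
    path = content_path(data_dir, hash_uri)
    if not path.is_file():
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large image files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return PREFIX + digest.hexdigest() == hash_uri


# Self-contained demo using made-up content in a temporary "data" folder.
with tempfile.TemporaryDirectory() as data_dir:
    payload = b"hypothetical content, not from the archive"
    hash_uri = PREFIX + hashlib.sha256(payload).hexdigest()
    target = content_path(data_dir, hash_uri)
    target.parent.mkdir(parents=True)
    target.write_bytes(payload)
    intact = verify_content(data_dir, hash_uri)
    target.write_bytes(payload + b"!")  # simulate corruption
    tampered = verify_content(data_dir, hash_uri)

print(intact, tampered)
```

Because the file name is derived from the content itself, any change to the bytes makes the recomputed hash disagree with the identifier, which is what makes the archive verifiable after download.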
Other
This work is funded in part by grants NSF OAC 1839201 and NSF DBI 2102006 from the National Science Foundation.
Seltmann, Katja; Poelen, Jorrit; Sullivan, Kathryn; Zaspel, Jennifer (Biodiversity Information Science and Standards)
A wealth of information about how parasites interact with their hosts already exists in collections, scientific publications, specialized databases, and grey literature. The US National Science Foundation-funded Terrestrial Parasite Tracker Thematic Collection Network (TPT) project began in 2019 to help build a comprehensive picture of arthropod ectoparasites including the evolution of these parasite-host biotic associations, distributions, and the ecological interactions of disease vectors. TPT is a network of biodiversity collections whose data can assist scientists, educators, land managers, and policymakers to better understand the complex relationship between hosts and parasites including emergent properties that may explain the causes and frequency of human and wildlife pathogens. TPT member collections make their association information easier to access via Global Biotic Interactions (GloBI, Poelen et al. 2014), which is periodically archived through Zenodo to track progress in the TPT project. TPT leverages GloBI's ability to index biotic associations from specimen occurrence records that come from existing management systems (e.g., Arctos, Symbiota, EMu, Excel, MS Access) to avoid having to completely rework existing, or build new, cyber-infrastructures before collections can share data. TPT-affiliated collection managers use collection-specific translation tables to connect their verbatim (or original) terms used to describe associations (e.g., "ex", "found on", "host") to their interpreted, machine-readable terms in the OBO Relations Ontology (RO). These interpreted terms enable searches across previously siloed association record sets, while the original verbatim values remain accessible to help retain provenance and allow for interpretation improvements. TPT is an ambitious project, with the goal to database label data from over 1.2 million specimens of arthropod parasites of vertebrates coming from 22 collections across North America. 
In the first year of the project, the TPT collections created over 73,700 new records and 41,984 images. In addition, 17 TPT data providers and three other collaborators shared datasets that are now indexed by GloBI, visible on the TPT GloBI project page. These datasets came from collection specimen occurrence records and literature sources. Two TPT data archives that capture and preserve the changes in the data coming from TPT to GloBI were published through Zenodo (Poelen et al. 2020a, Poelen et al. 2020b). The archives document the changes in how data are shared by collections including the biotic association data format and quantity of data captured. The Poelen et al. 2020b report included all TPT collections and biotic interactions from Arctos collections in VertNet and the Symbiota Collection of Arthropods Network (SCAN). The total number of interactions included in this report was 376,671 records (500,000 interactions is the overall goal for TPT). In addition, close coordination with TPT collection data managers including many one-on-one conversations, a workshop, and a webinar (Sullivan et al. 2020) was conducted to help guide the data capture of biotic associations. GloBI is an effective tool to help integrate biotic association data coming from occurrence records into an openly accessible, global, linked view of existing species interaction records. The results gleaned from the TPT workshop and Zenodo data archives demonstrate that minimizing changes to existing workflows allows for custom interpretation of collection-specific interaction terms. In addition, including collection data managers in the development of the interaction term vocabularies is an important part of the process that may improve data sharing and the overall downstream data quality.
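The translation-table idea described in this abstract can be sketched in a few lines of Python. This is an illustration only: the mappings below are hypothetical placeholders, since each collection defines its own table, and the function and field names are not part of GloBI. The interpreted values are written as plain labels rather than RO identifiers. The point it shows is that the verbatim term is kept next to the interpreted one, so provenance survives and the interpretation can be revised later.

```python
# Hypothetical translation table: verbatim label term -> interpreted relation label.
# Real tables are collection-specific; these three mappings are assumptions.
TRANSLATION_TABLE = {
    "ex": "has host",
    "host": "has host",
    "found on": "interacts with",
}


def interpret(verbatim_term, table=TRANSLATION_TABLE):
    """Return the verbatim term together with its interpretation (None if unmapped)."""
    # Normalizing case and surrounding whitespace is a choice made for this sketch.
    key = verbatim_term.strip().lower()
    return {"verbatim": verbatim_term, "interpreted": table.get(key)}


examples = [interpret(term) for term in ["ex", "Found on ", "collected with"]]
for example in examples:
    print(example)
```

An unmapped term such as "collected with" yields None instead of a guess, so it can be flagged for review by the collection manager rather than silently misinterpreted.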
Salim, José Augusto; Seltmann, Katja; Poelen, Jorrit; Saraiva, Antonio (Biodiversity Information Science and Standards)
The Global Biodiversity Information Facility (GBIF 2022a) has indexed more than 2 billion occurrence records from 70,147 datasets. These datasets often include "hidden" biotic interaction data because biodiversity communities use the Darwin Core standard (DwC, Wieczorek et al. 2012) in different ways to document biotic interactions. In this study, we extracted biotic interactions from GBIF data using an approach similar to that employed in the Global Biotic Interactions (GloBI; Poelen et al. 2014) and summarized the results. Here we aim to present an estimation of the interaction data available in GBIF, showing that biotic interaction claims can be automatically found and extracted from GBIF. Our results suggest that much can be gained by an increased focus on development of tools that help to index and curate biotic interaction data in existing datasets. Combined with data standardization and best practices for sharing biotic interactions, such as the initiative on plant-pollinators interaction (Salim 2022), this approach can rapidly contribute to and meet open data principles (Wilkinson 2016). We used Preston (Elliott et al. 2020), open-source software that versions biodiversity datasets, to copy all GBIF-indexed datasets. The biodiversity data graph version (Poelen 2020) of the GBIF-indexed datasets used during this study contains 58,504 datasets in Darwin Core Archive (DwC-A) format, totaling 574,715,196 records. After retrieval and verification, the datasets were processed using Elton. Elton extracts biotic interaction data and supports 20+ existing file formats, including various types of data elements in DwC records. Elton also helps align interaction claims (e.g., host of, parasite of, associated with) to the Relations Ontology (RO, Mungall 2022), making it easier to discover datasets across a heterogeneous collection of datasets. 
Using a specific mapping between interaction claims found in the DwC records and the terms in RO, Elton found 30,167,984 potential records (with non-empty values for the scanned DwC terms) and 15,248,478 records with recognized interaction types. Taxonomic name validation was performed using Nomer, which maps input names to names found in a variety of taxonomic catalogs. We only considered an interaction record valid where the interaction type could be mapped to a term in RO and where Nomer found a valid name for source and target taxa. Based on the workflow described in Fig. 1, we found 7,947,822 interaction records (52% of the potential interactions). Most of them were generic interactions (interacts_with, 87.5%), but the remaining 12.5% (993,477 records) included host-parasite and plant-animal interactions. The majority of the interaction records found involved plants (78%), animals (14%) and fungi (6%). In conclusion, there are many biotic interactions embedded in existing datasets registered in large biodiversity data indexers and aggregators like iDigBio, GBIF, and BioCASE. We exposed these biotic interaction claims using the combined functionality of biodiversity data tools Elton (for interaction data extraction), Preston (for reliable dataset tracking) and Nomer (for taxonomic name alignment). Nonetheless, the development of new vocabularies, standards and best practice guides would facilitate aggregation of interaction data, including the diversification of the GBIF data model (GBIF 2022b) for sharing biodiversity data beyond occurrence data. That is the aim of the TDWG Interest Group on Biological Interactions Data (TDWG 2022).
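The validity rule and the percentages reported in this abstract can be checked with a short calculation. The sketch below is only an illustration: the counts are the ones quoted above, while the record field names and sample records are hypothetical and are not Elton or Nomer output. Working through the ratios suggests that the reported 52% corresponds to valid records as a share of the 15,248,478 records with recognized interaction types, rather than of the 30,167,984 potential records.

```python
# Counts quoted in the abstract above.
POTENTIAL_RECORDS = 30_167_984        # non-empty values for the scanned DwC terms
RECOGNIZED_TYPE_RECORDS = 15_248_478  # interaction type recognized
VALID_RECORDS = 7_947_822             # RO term mapped and both taxon names resolved
SPECIFIC_RECORDS = 993_477            # valid records that are not generic interactions


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


def is_valid(record):
    """Validity rule as described: RO term mapped, source and target names resolved.

    The field names are hypothetical; they are assumptions made for this sketch.
    """
    return bool(
        record.get("ro_term") is not None
        and record.get("source_name_resolved", False)
        and record.get("target_name_resolved", False)
    )


# Hypothetical records for illustration only.
SAMPLE_RECORDS = [
    {"ro_term": "has host", "source_name_resolved": True, "target_name_resolved": True},
    {"ro_term": None, "source_name_resolved": True, "target_name_resolved": True},
    {"ro_term": "has host", "source_name_resolved": True, "target_name_resolved": False},
]
valid_sample = [record for record in SAMPLE_RECORDS if is_valid(record)]

print(f"valid / recognized types: {percent(VALID_RECORDS, RECOGNIZED_TYPE_RECORDS):.1f}%")
print(f"valid / potential:        {percent(VALID_RECORDS, POTENTIAL_RECORDS):.1f}%")
print(f"specific / valid:         {percent(SPECIFIC_RECORDS, VALID_RECORDS):.1f}%")
print(f"valid sample records:     {len(valid_sample)} of {len(SAMPLE_RECORDS)}")
```

The third ratio reproduces the 12.5% of non-generic interactions given in the text, which is a useful consistency check on the quoted counts.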
Poelen, Jorrit H., Seltmann, Katja C., Campbell, Mariel, Orlofske, Sarah A., Light, Jessica E., Tucker, Erika M., Demboski, John R, McElrath, Tommy, Grinter, Christopher C, Diaz-Bastin, Rachel, Bush, Sarah E, Delapena, Robin, Cook, Joseph, Gall, Lawrence F., Whiting, Michael F, Clark, Shawn M, Cameron, Stephen L, Replogle, Charla R, Rund, Samuel S.C., Young, Daniel, Brabant, Craig, Sullivan, Kathryn, Turcatel, Maureen, Shuman Baquiran, Rebekah, Albion, Zoe, Austin, Kyhl, Rubinoff, Dan, Cognato, Anthony I., Caywood, Alyssa, Colby, Julia, Allen, Julie, Zaspel, Jennifer M., and Bailey, Colin. Terrestrial Parasite Tracker indexed biotic interactions and review summary. Web. doi:10.5281/zenodo.6761707.
Poelen, Jorrit H., Seltmann, Katja C., Campbell, Mariel, Orlofske, Sarah A., Light, Jessica E., Tucker, Erika M., Demboski, John R, McElrath, Tommy, Grinter, Christopher C, Diaz-Bastin, Rachel, Bush, Sarah E, Delapena, Robin, Cook, Joseph, Gall, Lawrence F., Whiting, Michael F, Clark, Shawn M, Cameron, Stephen L, Replogle, Charla R, Rund, Samuel S.C., Young, Daniel, Brabant, Craig, Sullivan, Kathryn, Turcatel, Maureen, Shuman Baquiran, Rebekah, Albion, Zoe, Austin, Kyhl, Rubinoff, Dan, Cognato, Anthony I., Caywood, Alyssa, Colby, Julia, Allen, Julie, Zaspel, Jennifer M., & Bailey, Colin. Terrestrial Parasite Tracker indexed biotic interactions and review summary. https://doi.org/10.5281/zenodo.6761707
Poelen, Jorrit H., Seltmann, Katja C., Campbell, Mariel, Orlofske, Sarah A., Light, Jessica E., Tucker, Erika M., Demboski, John R, McElrath, Tommy, Grinter, Christopher C, Diaz-Bastin, Rachel, Bush, Sarah E, Delapena, Robin, Cook, Joseph, Gall, Lawrence F., Whiting, Michael F, Clark, Shawn M, Cameron, Stephen L, Replogle, Charla R, Rund, Samuel S.C., Young, Daniel, Brabant, Craig, Sullivan, Kathryn, Turcatel, Maureen, Shuman Baquiran, Rebekah, Albion, Zoe, Austin, Kyhl, Rubinoff, Dan, Cognato, Anthony I., Caywood, Alyssa, Colby, Julia, Allen, Julie, Zaspel, Jennifer M., and Bailey, Colin.
"Terrestrial Parasite Tracker indexed biotic interactions and review summary". Country unknown/Code not available: Zenodo. https://doi.org/10.5281/zenodo.6761707. https://par.nsf.gov/biblio/10353060.
@article{osti_10353060,
place = {Country unknown/Code not available},
title = {Terrestrial Parasite Tracker indexed biotic interactions and review summary},
url = {https://par.nsf.gov/biblio/10353060},
DOI = {10.5281/zenodo.6761707},
abstractNote = {PLEASE CONTACT AUTHORS IF YOU CONTRIBUTE AND WOULD LIKE TO BE LISTED AS A CO-AUTHOR. (this message will be removed some time weeks/months after the first publication)

Terrestrial Parasite Tracker indexed biotic interactions and review summary.

The Terrestrial Parasite Tracker (TPT) project began in 2019 and is funded by the National Science Foundation to mobilize data from vector and ectoparasite collections to data aggregators (e.g., iDigBio, GBIF) to help build a comprehensive picture of arthropod host-association evolution, distributions, and the ecological interactions of disease vectors which will assist scientists, educators, land managers, and policy makers. Arthropod parasites often are important to human and wildlife health and safety as vectors of pathogens, and it is critical to digitize these specimens so that they, and their biotic interaction data, will be available to help understand and predict the spread of human and wildlife disease.

This data publication contains versioned TPT associated datasets and related data products that were tracked, reviewed and indexed by Global Biotic Interactions (GloBI) and associated tools. GloBI provides open access to finding species interaction data (e.g., predator-prey, pollinator-plant, pathogen-host, parasite-host) by combining existing open datasets using open source software.

If you have questions or comments about this publication, please open an issue at https://github.com/ParasiteTracker/tpt-reporting or contact the authors by email.

Funding:
The creation of this archive was made possible by the National Science Foundation award "Collaborative Research: Digitization TCN: Digitizing collections to trace parasite-host associations and predict the spread of vector-borne disease," Award numbers DBI:1901932 and DBI:1901926

References:
Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2014.08.005.

GloBI Data Review Report

Datasets under review:
 - University of Michigan Museum of Zoology Insect Division. Full Database Export 2020-11-20 provided by Erika Tucker and Barry Oconner. accessed via https://github.com/EMTuckerLabUMMZ/ummzi/archive/6731357a377e9c2748fc931faa2ff3dc0ce3ea7a.zip on 2022-06-24T14:02:48.801Z
 - Academy of Natural Sciences Entomology Collection for the Parasite Tracker Project accessed via https://github.com/globalbioticinteractions/ansp-para/archive/5e6592ad09ec89ba7958266ad71ec9d5d21d1a44.zip on 2022-06-24T14:04:22.091Z
 - Bernice Pauahi Bishop Museum, J. Linsley Gressitt Center for Research in Entomology accessed via https://github.com/globalbioticinteractions/bpbm-ent/archive/c085398dddd36f8a1169b9cf57de2a572229341b.zip on 2022-06-24T14:04:37.692Z
 - Texas A&M University, Biodiversity Teaching and Research Collections accessed via https://github.com/globalbioticinteractions/brtc-para/archive/f0a718145b05ed484c4d88947ff712d5f6395446.zip on 2022-06-24T14:06:40.154Z
 - Brigham Young University Arthropod Museum accessed via https://github.com/globalbioticinteractions/byu-byuc/archive/4a609ac6a9a03425e2720b6cdebca6438488f029.zip on 2022-06-24T14:06:51.420Z
 - California Academy of Sciences Entomology accessed via https://github.com/globalbioticinteractions/cas-ent/archive/562aea232ec74ab615f771239451e57b057dc7c0.zip on 2022-06-24T14:07:16.371Z
 - Clemson University Arthropod Collection accessed via https://github.com/globalbioticinteractions/cu-cuac/archive/6cdcbbaa4f7cec8e1eac705be3a999bc5259e00f.zip on 2022-06-24T14:07:40.925Z
 - Denver Museum of Nature and Science (DMNS) Parasite specimens (DMNS:Para) accessed via https://github.com/globalbioticinteractions/dmns-para/archive/a037beb816226eb8196533489ee5f98a6dfda452.zip on 2022-06-24T14:08:00.730Z
 - Field Museum of Natural History IPT accessed via https://github.com/globalbioticinteractions/fmnh/archive/6bfc1b7e46140e93f5561c4e837826204adb3c2f.zip on 2022-06-24T14:18:51.995Z
 - Illinois Natural History Survey Insect Collection accessed via https://github.com/globalbioticinteractions/inhs-insects/archive/38692496f590577074c7cecf8ea37f85d0594ae1.zip on 2022-06-24T14:19:37.563Z
 - UMSP / University of Minnesota / University of Minnesota Insect Collection accessed via https://github.com/globalbioticinteractions/min-umsp/archive/3f1b9d32f947dcb80b9aaab50523e097f0e8776e.zip on 2022-06-24T14:20:27.232Z
 - Milwaukee Public Museum Biological Collections Data Portal accessed via https://github.com/globalbioticinteractions/mpm/archive/9f44e99c49ec5aba3f8592cfced07c38d3223dcd.zip on 2022-06-24T14:20:46.185Z
 - Museum for Southern Biology (MSB) Parasite Collection accessed via https://github.com/globalbioticinteractions/msb-para/archive/178a0b7aa0a8e14b3fe953e770703fe331eadacc.zip on 2022-06-24T15:16:07.223Z
 - The Albert J. Cook Arthropod Research Collection accessed via https://github.com/globalbioticinteractions/msu-msuc/archive/38960906380443bd8108c9e44aeff4590d8d0b50.zip on 2022-06-24T16:09:40.702Z
 - Ohio State University Acarology Laboratory accessed via https://github.com/globalbioticinteractions/osal-ar/archive/876269d66a6a94175dbb6b9a604897f8032b93dd.zip on 2022-06-24T16:10:00.281Z
 - Frost Entomological Museum, Pennsylvania State University accessed via https://github.com/globalbioticinteractions/psuc-ento/archive/30b1f96619a6e9f10da18b42fb93ff22cc4f72e2.zip on 2022-06-24T16:10:07.741Z
 - Purdue Entomological Research Collection accessed via https://github.com/globalbioticinteractions/pu-perc/archive/e0909a7ca0a8df5effccb288ba64b28141e388ba.zip on 2022-06-24T16:10:26.654Z
 - Texas A&M University Insect Collection accessed via https://github.com/globalbioticinteractions/tamuic-ent/archive/f261a8c192021408da67c39626a4aac56e3bac41.zip on 2022-06-24T16:10:58.496Z
 - University of California Santa Barbara Invertebrate Zoology Collection accessed via https://github.com/globalbioticinteractions/ucsb-izc/archive/825678ad02df93f6d4469f9d8b7cc30151b9aa45.zip on 2022-06-24T16:12:29.854Z
 - University of Hawaii Insect Museum accessed via https://github.com/globalbioticinteractions/uhim/archive/53fa790309e48f25685e41ded78ce6a51bafde76.zip on 2022-06-24T16:12:41.408Z
 - University of New Hampshire Collection of Insects and other Arthropods UNHC-UNHC accessed via https://github.com/globalbioticinteractions/unhc/archive/f72575a72edda8a4e6126de79b4681b25593d434.zip on 2022-06-24T16:12:59.500Z
 - Scott L. Gardner and Gabor R. Racz (2021). University of Nebraska State Museum - Parasitology. Harold W. Manter Laboratory of Parasitology. University of Nebraska State Museum. accessed via https://github.com/globalbioticinteractions/unl-nsm/archive/6bcd8aec22e4309b7f4e8be1afe8191d391e73c6.zip on 2022-06-24T16:13:06.914Z
 - Data were obtained from specimens belonging to the United States National Museum of Natural History (USNM), Smithsonian Institution, Washington DC and digitized by the Walter Reed Biosystematics Unit (WRBU). accessed via https://github.com/globalbioticinteractions/usnmentflea/archive/ce5cb1ed2bbc13ee10062b6f75a158fd465ce9bb.zip on 2022-06-24T16:13:38.013Z
 - US National Museum of Natural History Ixodes Records accessed via https://github.com/globalbioticinteractions/usnm-ixodes/archive/c5fcd5f34ce412002783544afb628a33db7f47a6.zip on 2022-06-24T16:13:45.666Z
 - Price Institute of Parasite Research, School of Biological Sciences, University of Utah accessed via https://github.com/globalbioticinteractions/utah-piper/archive/43da8db550b5776c1e3d17803831c696fe9b8285.zip on 2022-06-24T16:13:54.724Z
 - University of Wisconsin Stevens Point, Stephen J. Taft Parasitological Collection accessed via https://github.com/globalbioticinteractions/uwsp-para/archive/f9d0d52cd671731c7f002325e84187979bca4a5b.zip on 2022-06-24T16:14:04.745Z
 - Giraldo-Calderón, G. I., Emrich, S. J., MacCallum, R. M., Maslen, G., Dialynas, E., Topalis, P., … Lawson, D. (2015). VectorBase: an updated bioinformatics resource for invertebrate vectors and other organisms related with human diseases. Nucleic acids research, 43(Database issue), D707–D713. doi:10.1093/nar/gku1117. accessed via https://github.com/globalbioticinteractions/vectorbase/archive/00d6285cd4e9f4edd18cb2778624ab31b34b23b8.zip on 2022-06-24T16:14:11.965Z
 - WIRC / University of Wisconsin Madison WIS-IH / Wisconsin Insect Research Collection accessed via https://github.com/globalbioticinteractions/wis-ih-wirc/archive/34162b86c0ade4b493471543231ae017cc84816e.zip on 2022-06-24T16:14:29.743Z
 - Yale University Peabody Museum Collections Data Portal accessed via https://github.com/globalbioticinteractions/yale-peabody/archive/43be869f17749d71d26fc820c8bd931d6149fe8e.zip on 2022-06-24T16:23:29.289Z

Generated on:
2022-06-24

by:
GloBI's Elton 0.12.4
(see https://github.com/globalbioticinteractions/elton).

Note that all files ending with .tsv are files formatted as UTF8 encoded tab-separated values files.

https://www.iana.org/assignments/media-types/text/tab-separated-values

Included in this review archive are:

README:
 This file.

review_summary.tsv:
 Summary across all reviewed collections of total number of distinct review comments.

review_summary_by_collection.tsv:
 Summary by reviewed collection of total number of distinct review comments.

indexed_interactions_by_collection.tsv:
 Summary of number of indexed interaction records by institutionCode and collectionCode.

review_comments.tsv.gz:
 All review comments by collection.

indexed_interactions_full.tsv.gz:
 All indexed interactions for all reviewed collections.

indexed_interactions_simple.tsv.gz:
 All indexed interactions for all reviewed collections selecting only sourceInstitutionCode, sourceCollectionCode, sourceCatalogNumber, sourceTaxonName, interactionTypeName and targetTaxonName.

datasets_under_review.tsv:
 Details on the datasets under review.

elton.jar:
 Program used to update datasets and generate the review reports and associated indexed interactions.

datasets.zip:
 Source datasets used by elton.jar in process of executing the generate_report.sh script.

generate_report.sh:
 Program used to generate the report

generate_report.log:
 Log file generated as part of running the generate_report.sh script},
journal = {},
publisher = {Zenodo},
author = {Poelen, Jorrit H. and Seltmann, Katja C. and Campbell, Mariel and Orlofske, Sarah A. and Light, Jessica E. and Tucker, Erika M. and Demboski, John R and McElrath, Tommy and Grinter, Christopher C and Diaz-Bastin, Rachel and Bush, Sarah E and Delapena, Robin and Cook, Joseph and Gall, Lawrence F. and Whiting, Michael F and Clark, Shawn M and Cameron, Stephen L and Replogle, Charla R and Rund, Samuel S.C. and Young, Daniel and Brabant, Craig and Sullivan, Kathryn and Turcatel, Maureen and Shuman Baquiran, Rebekah and Albion, Zoe and Austin, Kyhl and Rubinoff, Dan and Cognato, Anthony I. and Caywood, Alyssa and Colby, Julia and Allen, Julie and Zaspel, Jennifer M. and Bailey, Colin},
}
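
The abstract note above describes indexed_interactions_simple.tsv.gz as a gzip-compressed, UTF-8 tab-separated file with the columns sourceInstitutionCode, sourceCollectionCode, sourceCatalogNumber, sourceTaxonName, interactionTypeName and targetTaxonName. A minimal sketch of tallying interaction types per institution from such a file follows, using only the Python standard library. It assumes the file starts with a header row carrying exactly those column names, which the description does not confirm. The function names `tally_interactions` and `tally_gzip` are hypothetical, and the sample rows are invented for illustration.

```python
import csv
import gzip
import io
from collections import Counter


def tally_interactions(handle):
    """Count rows per (sourceInstitutionCode, interactionTypeName).

    `handle` is any iterable of text lines whose first line is a
    tab-separated header (an assumption about the file layout).
    """
    counts = Counter()
    # QUOTE_NONE treats quote characters as plain data, as is usual for TSV.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        counts[(row["sourceInstitutionCode"], row["interactionTypeName"])] += 1
    return counts


def tally_gzip(path):
    """Open a gzip-compressed TSV file as UTF-8 text and tally its rows."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return tally_interactions(handle)


if __name__ == "__main__":
    # Invented sample data; institution codes and catalog numbers are not real.
    sample = io.StringIO(
        "sourceInstitutionCode\tsourceCollectionCode\tsourceCatalogNumber\t"
        "sourceTaxonName\tinteractionTypeName\ttargetTaxonName\n"
        "XYZ\tPara\t1\tIxodes scapularis\thasHost\tPeromyscus leucopus\n"
        "XYZ\tPara\t2\tIxodes scapularis\thasHost\tOdocoileus virginianus\n"
        "ABC\tEnt\t7\tPulex irritans\tparasiteOf\tCanis lupus\n"
    )
    for key, n in sorted(tally_interactions(sample).items()):
        print(key, n)
```

Separating the line-based tally from the gzip handling keeps the counting logic testable on in-memory text; for the real archive file, `tally_gzip("indexed_interactions_simple.tsv.gz")` would be the entry point under the stated header assumption.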