A biodiversity dataset graph: GBIF, iDigBio, BioCASe
The intended use of this archive is to facilitate meta-analysis of the Global Biodiversity Information Facility (GBIF), Integrated Digitized Biocollections (iDigBio) and the Biological Collection Access Service (BioCASe). GBIF, iDigBio and BioCASe help provide access to biological data collections.
This dataset provides versioned provenance logs of snapshots of the GBIF, iDigBio, BioCASe network as tracked by Preston [2] between 2018-09-03 and 2019-10-02 using "preston update -u https://gbif.org,https://idigbio.org,http://biocase.org".
This publication contains two types of files: index files and provenance logs. Associated data files are hosted elsewhere for pragmatic reasons. Index files provide a way to link provenance files in time to establish a versioning mechanism. Provenance logs describe how, when, what and where the GBIF, iDigBio, BioCASe content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .
To retrieve and verify the downloaded GBIF, iDigBio, BioCASe biodiversity dataset graph, use the preston[2] command-line tool to "clone" this dataset using:
$ java -jar preston.jar ls --remote https://zenodo.org/record/3484205/files > /dev/null
Optionally, you can retrieve all associated data files (~500GB) using:
$ java -jar preston.jar clone --remote https://zenodo.org/record/3484205/files,https://deeplinker.bio
Please note that https://deeplinker.bio is a Preston remote that provided access to GBIF, iDigBio, BioCASe data files at the time of writing (13 Oct 2019). This remote can be replaced with any other Preston remote(s) if needed. Retrieving the data files may take a while, depending on network speed and hardware constraints.
After that, verify the index of the archive by reproducing the following provenance log history:
$ java -jar preston.jar history
<0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/c253a5311a20c2fc082bf9bac87a1ec5eb6e4e51ff936e7be20c29c8e77dee55> .
<hash://sha256/b83cf099449dae3f633af618b19d05013953e7a1d7d97bc5ac01afd7bd9abe5d> <http://purl.org/pav/previousVersion> <hash://sha256/c253a5311a20c2fc082bf9bac87a1ec5eb6e4e51ff936e7be20c29c8e77dee55> .
<hash://sha256/7efdea9263e57605d2d2d8b79ccd26a55743123d0c974140c72c8c1cfc679b93> <http://purl.org/pav/previousVersion> <hash://sha256/b83cf099449dae3f633af618b19d05013953e7a1d7d97bc5ac01afd7bd9abe5d> .
<hash://sha256/05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd> <http://purl.org/pav/previousVersion> <hash://sha256/7efdea9263e57605d2d2d8b79ccd26a55743123d0c974140c72c8c1cfc679b93> .
<hash://sha256/b5a30bbd8d51e9faf08d4ddebbc5bda9bab1b12545172f1524ac5ebdb0038bd4> <http://purl.org/pav/previousVersion> <hash://sha256/05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd> .
<hash://sha256/1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e> <http://purl.org/pav/previousVersion> <hash://sha256/b5a30bbd8d51e9faf08d4ddebbc5bda9bab1b12545172f1524ac5ebdb0038bd4> .
<hash://sha256/24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43> <http://purl.org/pav/previousVersion> <hash://sha256/1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e> .
<hash://sha256/ba02b235fd445904eae45b50bc637a195f25e9ca1637bcf26b2dc7f8698aa1fe> <http://purl.org/pav/previousVersion> <hash://sha256/24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43> .
<hash://sha256/102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9> <http://purl.org/pav/previousVersion> <hash://sha256/ba02b235fd445904eae45b50bc637a195f25e9ca1637bcf26b2dc7f8698aa1fe> .
<hash://sha256/fd27b0552c8a6800a8b3b1b822a2063a3215c1d9887badad09a62746b80846bc> <http://purl.org/pav/previousVersion> <hash://sha256/102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9> .
<hash://sha256/20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17> <http://purl.org/pav/previousVersion> <hash://sha256/fd27b0552c8a6800a8b3b1b822a2063a3215c1d9887badad09a62746b80846bc> .
<hash://sha256/7801a034fe3c7920e032d2338a690b700ca41a90a92d878fc3a67111cad16d29> <http://purl.org/pav/previousVersion> <hash://sha256/20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17> .
<hash://sha256/c1b50502b1ca87046eeb7fe4863d0cf9319b6645ff2142db69f21b4cc23332b6> <http://purl.org/pav/previousVersion> <hash://sha256/7801a034fe3c7920e032d2338a690b700ca41a90a92d878fc3a67111cad16d29> .
<hash://sha256/dc293e26154b89273791b9674d81110029f987c686b386184d0b66a5b95f9cda> <http://purl.org/pav/previousVersion> <hash://sha256/c1b50502b1ca87046eeb7fe4863d0cf9319b6645ff2142db69f21b4cc23332b6> .
<hash://sha256/f3ed6aa1bd15ee43d05e138b935040aaa745f6ca8c7e8f2dfbb0a3ae0df66f36> <http://purl.org/pav/previousVersion> <hash://sha256/dc293e26154b89273791b9674d81110029f987c686b386184d0b66a5b95f9cda> .
<hash://sha256/650a28fff3e03dadba70dc05a34c580c04203380187953fa4a2fb778353fee79> <http://purl.org/pav/previousVersion> <hash://sha256/f3ed6aa1bd15ee43d05e138b935040aaa745f6ca8c7e8f2dfbb0a3ae0df66f36> .
<hash://sha256/e4e5736e8bfec6c686eedde4c6dfa62845930d04e12dfa6f8a7d70abc3d087df> <http://purl.org/pav/previousVersion> <hash://sha256/650a28fff3e03dadba70dc05a34c580c04203380187953fa4a2fb778353fee79> .
<hash://sha256/e69d186ff3be11830c2da67d1bfeb896ec6398fc9d555fa26eaae1baa54450fb> <http://purl.org/pav/previousVersion> <hash://sha256/e4e5736e8bfec6c686eedde4c6dfa62845930d04e12dfa6f8a7d70abc3d087df> .
<hash://sha256/3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee> <http://purl.org/pav/previousVersion> <hash://sha256/e69d186ff3be11830c2da67d1bfeb896ec6398fc9d555fa26eaae1baa54450fb> .
<hash://sha256/5c469224fa0b6159bf33a59ddaa0246634e81bddd1728e7bf3540745055eccfa> <http://purl.org/pav/previousVersion> <hash://sha256/3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee> .
<hash://sha256/eb2c716ec85158a0785216de1b09965173fc368d12f213c1bf747bbc2e49c6a6> <http://purl.org/pav/previousVersion> <hash://sha256/5c469224fa0b6159bf33a59ddaa0246634e81bddd1728e7bf3540745055eccfa> .
<hash://sha256/3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7> <http://purl.org/pav/previousVersion> <hash://sha256/eb2c716ec85158a0785216de1b09965173fc368d12f213c1bf747bbc2e49c6a6> .
<hash://sha256/480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6> <http://purl.org/pav/previousVersion> <hash://sha256/3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7> .
<hash://sha256/58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70> <http://purl.org/pav/previousVersion> <hash://sha256/480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6> .
<hash://sha256/a0a18b0e32f933112084b846863438038f66f63eeeb22fa9d8d734e8a25bb208> <http://purl.org/pav/previousVersion> <hash://sha256/58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70> .
<hash://sha256/a7a5e7c6a4b21bdf67f48d6bea85f438b8133f674027b04625dfadec3ff985f6> <http://purl.org/pav/previousVersion> <hash://sha256/a0a18b0e32f933112084b846863438038f66f63eeeb22fa9d8d734e8a25bb208> .
<hash://sha256/0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781> <http://purl.org/pav/previousVersion> <hash://sha256/a7a5e7c6a4b21bdf67f48d6bea85f438b8133f674027b04625dfadec3ff985f6> .
<hash://sha256/8c0752dc6425b9c716837c9713ce284158b4cff70a1e66be2beb0677018831f4> <http://purl.org/pav/previousVersion> <hash://sha256/0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781> .
<hash://sha256/d99fa37caa268f8061980001146ed2a566e814d0740bb1974b76847512be95d3> <http://purl.org/pav/previousVersion> <hash://sha256/8c0752dc6425b9c716837c9713ce284158b4cff70a1e66be2beb0677018831f4> .
<hash://sha256/af0bb2c89571a30815d4488e72dede84a2ffc102bb87961f06884509fd5d1dae> <http://purl.org/pav/previousVersion> <hash://sha256/d99fa37caa268f8061980001146ed2a566e814d0740bb1974b76847512be95d3> .
<hash://sha256/261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12> <http://purl.org/pav/previousVersion> <hash://sha256/af0bb2c89571a30815d4488e72dede84a2ffc102bb87961f06884509fd5d1dae> .
<hash://sha256/5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7> <http://purl.org/pav/previousVersion> <hash://sha256/261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12> .
<hash://sha256/af8f9ed321d9c403617f54a96e3217adc918970fbbfe8b8715359669f4890b63> <http://purl.org/pav/previousVersion> <hash://sha256/5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7> .
<hash://sha256/9a41d2583f0b8169ffdd44fb2d3a5e057eba4a10e5d9193d0c6e9dcf07c3119e> <http://purl.org/pav/previousVersion> <hash://sha256/af8f9ed321d9c403617f54a96e3217adc918970fbbfe8b8715359669f4890b63> .
<hash://sha256/b9864a749112cad2fe19e62bf5d8bad580a7036d363d16d81d5c16be325fa0fd> <http://purl.org/pav/previousVersion> <hash://sha256/9a41d2583f0b8169ffdd44fb2d3a5e057eba4a10e5d9193d0c6e9dcf07c3119e> .
<hash://sha256/09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2> <http://purl.org/pav/previousVersion> <hash://sha256/b9864a749112cad2fe19e62bf5d8bad580a7036d363d16d81d5c16be325fa0fd> .
<hash://sha256/668d5d6e9c9e7ddb410073ff75eb7f2935c60cc62944ba1fd96ca60feec4a103> <http://purl.org/pav/previousVersion> <hash://sha256/09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2> .
<hash://sha256/6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7> <http://purl.org/pav/previousVersion> <hash://sha256/668d5d6e9c9e7ddb410073ff75eb7f2935c60cc62944ba1fd96ca60feec4a103> .
<hash://sha256/d79fb9207329a2813b60713cf0968fda10721d576dcb7a36038faf18027eebc1> <http://purl.org/pav/previousVersion> <hash://sha256/6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7> .
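The history output above forms a chain: the first statement names the first provenance log, and each later statement points back at the log before it. The following is a minimal Python sketch of how that linkage could be checked on saved output. It is not part of Preston; the function names (parse_history, chain_is_linked, latest_version) are hypothetical, and it assumes every line has the "<subject> <predicate> <object> ." shape shown above.

```python
import re

# Matches one statement of the form: <subject> <predicate> <object> .
STATEMENT = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s+\.$")

HAS_VERSION = "http://purl.org/pav/hasVersion"
PREVIOUS_VERSION = "http://purl.org/pav/previousVersion"


def parse_history(lines):
    """Return (subject, predicate, object) tuples for all non-empty lines."""
    triples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = STATEMENT.match(line)
        if match is None:
            raise ValueError(f"unexpected history line: {line!r}")
        triples.append(match.groups())
    return triples


def chain_is_linked(triples):
    """Check that every statement refers back to the version before it."""
    if not triples or triples[0][1] != HAS_VERSION:
        return False
    expected_previous = triples[0][2]  # hash of the first provenance log
    for subject, predicate, obj in triples[1:]:
        if predicate != PREVIOUS_VERSION or obj != expected_previous:
            return False
        expected_previous = subject  # the next statement must point here
    return True


def latest_version(triples):
    """Return the hash URI of the most recent provenance log in the chain."""
    return triples[-1][0] if len(triples) > 1 else triples[0][2]
```

Applied to the lines listed above, chain_is_linked is expected to return True, and latest_version is expected to return the subject of the final statement.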
If you retrieved data files, you can check the integrity of the extracted archive by confirming that each line produced by the command "preston verify" includes "CONTENT_PRESENT_VALID_HASH", as shown below. Depending on hardware capacity, this may take a while.
$ java -jar preston.jar verify
hash://sha256/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362 file:/home/preston/preston-archive/data/3e/ff/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362 OK CONTENT_PRESENT_VALID_HASH 89931
hash://sha256/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b file:/home/preston/preston-archive/data/18/48/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b OK CONTENT_PRESENT_VALID_HASH 210344
hash://sha256/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02 file:/home/preston/preston-archive/data/18/46/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02 OK CONTENT_PRESENT_VALID_HASH 210344
hash://sha256/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38 file:/home/preston/preston-archive/data/55/4f/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38 OK CONTENT_PRESENT_VALID_HASH 202701
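Scanning a long verify listing by eye is error-prone, so the check described above can be sketched as a small Python filter over saved output. This is an illustration only, not part of Preston. It assumes each line has five whitespace-separated fields, as in the sample lines above, and that the location contains no spaces. The field names (content_id, location, status, state, size) are hypothetical labels; in particular, reading the trailing number as a size is a guess. The data/xx/yy/ path layout is inferred from the sample lines, not from Preston documentation.

```python
from typing import NamedTuple

VALID_STATE = "CONTENT_PRESENT_VALID_HASH"


class VerifyEntry(NamedTuple):
    content_id: str  # e.g. hash://sha256/3eff...
    location: str    # e.g. file:/home/preston/preston-archive/data/3e/ff/3eff...
    status: str      # e.g. OK
    state: str       # e.g. CONTENT_PRESENT_VALID_HASH
    size: int        # trailing number; assumed to be a size


def parse_verify_line(line):
    """Split one line of verify output into its five fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"unexpected verify line: {line!r}")
    content_id, location, status, state, size = fields
    return VerifyEntry(content_id, location, status, state, int(size))


def failed_entries(lines):
    """Return the entries whose state is not CONTENT_PRESENT_VALID_HASH."""
    entries = (parse_verify_line(line) for line in lines if line.strip())
    return [entry for entry in entries if entry.state != VALID_STATE]


def expected_relative_path(content_id):
    """Build the data/xx/yy/<hash> path seen in the sample output (inferred)."""
    hex_digest = content_id.rsplit("/", 1)[-1]
    return f"data/{hex_digest[:2]}/{hex_digest[2:4]}/{hex_digest}"
```

An empty list from failed_entries means every listed line carried the expected state. Comparing expected_relative_path with the end of each location is an optional extra consistency check.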
Note that a copy of the Java program "preston", preston.jar, is included in this publication. The program runs on a Java 8+ virtual machine using "java -jar preston.jar", or in short "preston".
Files in this data publication:
--- start of file descriptions ---
-- description of archive and its contents (this file) --
README
-- executable java jar containing preston[2] v0.1.8. --
preston.jar
-- individual provenance index files --
049b0eb995b484c1e64184f582f51b3c608dcade70c4aefc2d53f903bae45098
073315c32d7fd19868449bef1b11b15a86981dee53a31f7f5c882f7e3be413c3
1172c6927e58113db668409d36b6a2cd84cf1a93e85b50d65d0bd008a5d8aaa4
1707cb11cd9f696f1a86fd06742c1e14fad856747be88791f79f6fc7c979d5a6
272ff1f12a573c667634d934d06b8bab0dd9cc6558795287ea99fab87620d005
2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a
37b8b636e939072d0df7246bf077ead4279f9dd33929be322e631104b0641308
3901b6af522d535fb164823704686e72f73b7798a2a64eaeb817134552c69e2c
395ed0c95a624f8853116442690965acf69151acd6b33cc4fc710f567828f784
460c14ed0129c1469c9149ed1030cdc133f110fb32048748323982cb88dd7eda
477b6c4e9ecf5c8cd1b5502e0245c8622fa4b358f6710f97db39b473ed3d8235
52b7274f5d795e4987964bb1a327dd6d6e4f65870e6a7aac172481d0ba3013d4
54786bde04751bc31bf38c9e89c010cfee7de91760e1f5f31218ff11acff8a70
6135b237a49b37b857801836494f2c36bcb1526bdacf001a9d11727fff6bf1f1
69b4d5ca9643c14501a48a2b1eb24971a6da68da5033c304f7f00b94e16a11d9
70066ea7c6a9dd6c2193cdc90b3b1ff7664af235ab245f6c03d1dd497b376570
7084702f8025c99a6608a3355ccad5ff5e644ad544121f5d524961f7fe29ceb6
7ebb008412baaac3afcc8af68b796bf4ca98f367cfd61a815eee82cdffeab196
886edb8d22973bb04fe3b42d12106029a00b9deab3fb77d8787123327b77ae3b
8a6d7e2ab026ff56380235fd9696f5e538e5e426b9374f2ddf3a705e186a7788
95f88f27ed3448534206406738dfb5c5030fe3d6883c6dda261649357600883f
9d12cae409e8ea0a546f7945cc629d622400000c3338e4710d9c6084fca9274d
9fa9ea50db419c75251026708183add8973d9e68a79062f7808b110bef21006e
a24abbe089556f51fe9c2a51febdcaf893b419556312bcc63515713fc4a52922
a3b0477fe46f09b0f51c0f651691665c149bc341f5c19996675d849252e86453
a486474333f05884580dd10c54c95999063c7d1bc22e2cbe3bead604aca0a183
a524b9af3f172793998e1f9c5c0e9c949cc935624a17ed3364d32bc0391c9382
aa0e508aeb96f240b551fe92ff4224325ddcdf66f97eef95ac78aec62e53a169
ab34300942ec02cca7adf2744f6fbc1ab7587060bea09ef92b65b66f89d1ddcd
b05d4a17d9a02180669d7eb017102dd1a739fb4615759cba94baf944b2aee29c
b37c79f95c22fc4d657cc89dedd7a870923285da690ad4f5121962492484a142
be6d8cd5f1405a5e3e8aa492fb8dab41f6521608834d746e6cbc58d2f550f918
c06f4413a97a5540fbdd40bdbfb194435c154533df7fe388dfdd378084e19c3d
c585b8addfb7f7991ad74c0bae158aecefc6be5b11c28b020135e0f13040e187
c66587e9730a6f68e961240038892df656ea99a1a25f4ff8ce556c07b09a4878
cea1aab236de5de8da8954797d846c225bf2ad4f8fe3cd413e60ab029f9e1b3e
da05cc27a47e755ebe912fafae434df5bd31a5d92658fe1943acc0a2023fab32
fcb2ee4d630a9a1440417b0c46da5bc1578a388d6aedd12189a23283b60dde7d
ff32a7cbc99eaf6b67695fd94284a9b1b47a76497ef4d10ffc4dae199cc0d7c3
-- individual provenance logs --
05a877bdb8617144fe166a13bf51828d4ad1bc11631c360b9e648a9f7df2bbcd
09574d9c1330c2b1bec9b7bf3a55ab9273bedbfed78affd70a058a1a25e052d2
0e6b49850d96b4b58ea3759ecea45d273a48f074c4edaaec5e008791d7718781
102cbfb1e800ef795ba1e1c51a34bff9b463b34c9443435069ddc76970c1e9c9
1d3817d9cb9fc7de7a3b7a4181daba8de1e52b348280154e8a163c7dd7ee1a7e
20d36a6f879ba1dd797d4288a4f2e32719d3c674156194c2765a3ec6b43f5e17
24b3f981c88c747f44ad3372095767cd15dcf81bd6cd2e54328a90a21409df43
261177a96185166f1c301beacf7350abff03d1b5710be6bfd8c4aff9caffef12
3dd674b7ad16391629948981a9cb6f6f86937d016861c3e59cd6e6bf3589f3b7
3e7f19a8a78b51437240f49c499e6e7f89b8d58d4e3ceb9480d4356721645cee
480868b59e95f3ce2324a7308dba65795e857d34cfbdcea7440a6f2620c6fbf6
58daa9a51e5dc0911163aa1b98d68c801106734cd29eab9980814057351aeb70
5a39b7bbe9d1bc46ed2eb7bd76c490b5c85a09369a7cf7dc18fa04532679e9a7
5c469224fa0b6159bf33a59ddaa0246634e81bddd1728e7bf3540745055eccfa
6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7
650a28fff3e03dadba70dc05a34c580c04203380187953fa4a2fb778353fee79
668d5d6e9c9e7ddb410073ff75eb7f2935c60cc62944ba1fd96ca60feec4a103
7801a034fe3c7920e032d2338a690b700ca41a90a92d878fc3a67111cad16d29
7efdea9263e57605d2d2d8b79ccd26a55743123d0c974140c72c8c1cfc679b93
8c0752dc6425b9c716837c9713ce284158b4cff70a1e66be2beb0677018831f4
9a41d2583f0b8169ffdd44fb2d3a5e057eba4a10e5d9193d0c6e9dcf07c3119e
a0a18b0e32f933112084b846863438038f66f63eeeb22fa9d8d734e8a25bb208
a7a5e7c6a4b21bdf67f48d6bea85f438b8133f674027b04625dfadec3ff985f6
af0bb2c89571a30815d4488e72dede84a2ffc102bb87961f06884509fd5d1dae
af8f9ed321d9c403617f54a96e3217adc918970fbbfe8b8715359669f4890b63
b5a30bbd8d51e9faf08d4ddebbc5bda9bab1b12545172f1524ac5ebdb0038bd4
b83cf099449dae3f633af618b19d05013953e7a1d7d97bc5ac01afd7bd9abe5d
b9864a749112cad2fe19e62bf5d8bad580a7036d363d16d81d5c16be325fa0fd
ba02b235fd445904eae45b50bc637a195f25e9ca1637bcf26b2dc7f8698aa1fe
c1b50502b1ca87046eeb7fe4863d0cf9319b6645ff2142db69f21b4cc23332b6
c253a5311a20c2fc082bf9bac87a1ec5eb6e4e51ff936e7be20c29c8e77dee55
d79fb9207329a2813b60713cf0968fda10721d576dcb7a36038faf18027eebc1
d99fa37caa268f8061980001146ed2a566e814d0740bb1974b76847512be95d3
dc293e26154b89273791b9674d81110029f987c686b386184d0b66a5b95f9cda
e4e5736e8bfec6c686eedde4c6dfa62845930d04e12dfa6f8a7d70abc3d087df
e69d186ff3be11830c2da67d1bfeb896ec6398fc9d555fa26eaae1baa54450fb
eb2c716ec85158a0785216de1b09965173fc368d12f213c1bf747bbc2e49c6a6
f3ed6aa1bd15ee43d05e138b935040aaa745f6ca8c7e8f2dfbb0a3ae0df66f36
fd27b0552c8a6800a8b3b1b822a2063a3215c1d9887badad09a62746b80846bc
--- end of file descriptions ---
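The provenance log names listed above match the sha256 hashes in the provenance log history, which suggests that files in this publication are named after the sha256 digest of their content. Under that assumption (inferred from this README, not confirmed by it), a downloaded file could be spot-checked without Preston using the Python sketch below. The function names sha256_hex and name_matches_content are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_hex(path, chunk_size=1 << 20):
    """Compute the sha256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def name_matches_content(path):
    """Return True if the file name equals the sha256 digest of its bytes.

    Assumes the file is stored under its bare 64-character hex digest,
    as in the file listing above.
    """
    path = Path(path)
    return path.name == sha256_hex(path)
```

A False result for a listed file would indicate either a corrupted download or that the naming assumption does not hold for that file. The "preston verify" command remains the intended check.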
References
[1] Global Biodiversity Information Facility, Integrated Digitized Biocollections, Biological Collection Access Service (GBIF, iDigBio, BioCASe, https://gbif.org,https://idigbio.org,http://biocase.org) accessed from 2018-09-03 to 2019-10-02 with provenance hash://sha256/6387c9ebed9507a0fbba2d161e83c2da73e0d6fa6dd51fb19ac4a4ca75b839c7.
[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .
This work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.