Title: First release of the Pelagic Size Structure database: global datasets of marine size spectra obtained from plankton imaging devices
Abstract. In marine ecosystems, most physiological, ecological, or physical processes are size dependent. These include metabolic rates, the uptake of carbon and other nutrients, swimming and sinking velocities, and trophic interactions, which eventually determine the stocks of commercial species, as well as biogeochemical cycles and carbon sequestration. As such, broad-scale observations of plankton size distribution are important indicators of the general functioning and state of pelagic ecosystems under anthropogenic pressures. Here, we present the first global datasets of the Pelagic Size Structure database (PSSdb), generated from plankton imaging devices. This release includes the bulk particle normalized biovolume size spectrum (NBSS) and the bulk particle size distribution (PSD), along with their related parameters (slope, intercept, and R²) measured within the epipelagic layer (0–200 m) by three imaging sensors: the Imaging FlowCytobot (IFCB), the Underwater Vision Profiler (UVP), and benchtop scanners. Collectively, these instruments effectively image organisms and detrital material in the 7–10 000 µm size range. A total of 92 472 IFCB samples, 3068 UVP profiles, and 2411 scans passed our quality control and were standardized to produce consistent instrument-specific size spectra averaged to 1° × 1° latitude and longitude and by year and month. Our instrument-specific datasets span most major ocean basins, except for the IFCB datasets we have ingested, which were exclusively collected in northern latitudes, and cover decadal time periods (2013–2022 for IFCB, 2008–2021 for UVP, and 1996–2022 for scanners), allowing for a further assessment of the pelagic size spectrum in space and time. The datasets that constitute PSSdb's first release are available at https://doi.org/10.5281/zenodo.11050013 (Dugenne et al., 2024b). In addition, future updates to these data products can be accessed at https://doi.org/10.5281/zenodo.7998799.
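To make the NBSS parameters named in the abstract more concrete, the sketch below shows one common way a normalized biovolume size spectrum and its log-log slope, intercept, and R² can be computed from individual particle biovolumes. It is an illustration only, not the PSSdb processing pipeline: the function names (nbss, fit_loglog), the octave (doubling) size classes, the geometric-mean bin midpoints, and the units are all assumptions made for this example.

```python
import math
from collections import defaultdict


def nbss(biovolumes_um3, sampled_volume_l):
    """Return [(bin_midpoint, nbss_value), ...] for occupied octave size classes.

    Assumption for this sketch: size classes double in biovolume
    (bin k spans 2**k to 2**(k+1) cubic micrometres).
    """
    totals = defaultdict(float)
    for v in biovolumes_um3:
        totals[math.floor(math.log2(v))] += v  # sum biovolume per size class

    spectrum = []
    for k in sorted(totals):
        lower, upper = 2.0 ** k, 2.0 ** (k + 1)
        width = upper - lower              # biovolume width of the class
        midpoint = math.sqrt(lower * upper)  # geometric midpoint of the class
        # Normalize by sampled volume and by the class width.
        spectrum.append((midpoint, totals[k] / sampled_volume_l / width))
    return spectrum


def fit_loglog(spectrum):
    """Ordinary least squares of log10(NBSS) on log10(biovolume).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(m) for m, _ in spectrum]
    ys = [math.log10(n) for _, n in spectrum]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


# Synthetic particles (not real data): every size class holds the same
# total biovolume, which by construction gives a slope close to -1.
particles = []
for k in range(6):
    particles += [1.5 * 2.0 ** k] * (2 ** (5 - k))

slope, intercept, r2 = fit_loglog(nbss(particles, sampled_volume_l=1.0))
print(slope, intercept, r2)
```

In this toy input each size class carries equal total biovolume, so dividing by a class width that doubles at every step yields a spectrum that falls off as the inverse of biovolume. Real imaging data would additionally need the quality control, size-range truncation, and spatial and temporal averaging described in the abstract, none of which are sketched here.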
Award ID(s):
2322676 1655686
PAR ID:
10592015
Publisher / Repository:
Copernicus Publications
Journal Name:
Earth System Science Data
Volume:
16
Issue:
6
ISSN:
1866-3516
Page Range / eLocation ID:
2971 to 2999
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Background: Computational cell type deconvolution enables the estimation of cell type abundance from bulk tissues and is important for understanding the tissue microenvironment, especially in tumor tissues. With the rapid development of deconvolution methods, many benchmarking studies have been published aiming for a comprehensive evaluation of these methods. Benchmarking studies rely on cell-type resolved single-cell RNA-seq data to create simulated pseudobulk datasets by adding individual cell types in controlled proportions. Results: In our work, we show that the standard application of this approach, which uses randomly selected single cells, regardless of the intrinsic difference between them, generates synthetic bulk expression values that lack appropriate biological variance. We demonstrate why and how the current bulk simulation pipeline with random cells is unrealistic and propose a heterogeneous simulation strategy as a solution. The heterogeneously simulated bulk samples match up with the variance observed in real bulk datasets and therefore provide concrete benefits for benchmarking in several ways. We demonstrate that conceptual classes of deconvolution methods differ dramatically in their robustness to heterogeneity, with reference-free methods performing particularly poorly. For regression-based methods, the heterogeneous simulation provides an explicit framework to disentangle the contributions of reference construction and regression methods to performance. Finally, we perform an extensive benchmark of diverse methods across eight different datasets and find BayesPrism and a hybrid MuSiC/CIBERSORTx approach to be the top performers. Conclusions: Our heterogeneous bulk simulation method and the entire benchmarking framework are implemented in a user-friendly package (https://github.com/humengying0907/deconvBenchmarking and https://doi.org/10.5281/zenodo.8206516), enabling further developments in deconvolution methods.
  2. The NAIRR Pilot Inaugural Annual Meeting was held on February 19-21, 2025 at the Hyatt Regency Crystal City in Arlington, VA, USA. The meeting highlighted resource offerings, AI, science, education and innovation outcomes, and the NAIRR pilot’s progress in democratizing access to AI resources, and its vision for the future of AI research in the United States. Final Report: https://doi.org/10.5281/zenodo.15263283 Proceedings: https://zenodo.org/communities/nairr2025 Program: https://doi.org/10.5281/zenodo.15106915 
  3. https://doi.org/10.5281/zenodo.8097562 
  4. https://doi.org/10.5281/zenodo.10960062 
  5. https://doi.org/10.5281/zenodo.10160680 