Title: Instrument Bias Correction With Machine Learning Algorithms: Application to Field-Portable Mass Spectrometry
In situ sensors for environmental chemistry promise more thorough observations, which are necessary for high-confidence predictions in earth systems science. However, these observations can be a challenge to interpret because the sensors are strongly influenced by temperature, humidity, pressure, or other secondary environmental conditions that are not of direct interest. We present a comparison of two statistical learning methods, a generalized additive model and a long short-term memory neural network model, for bias correction of in situ sensor data. We discuss their performance and tradeoffs when the two bias correction methods are applied to data from submersible and shipboard mass spectrometers. Both instruments measure the most abundant gases dissolved in water and can be used to reconstruct biochemical metabolisms, including those that regulate atmospheric carbon dioxide. Both models demonstrate a high degree of skill at correcting for instrument bias using correlated environmental measurements; the difference in their respective performance is less than 1% in terms of root mean squared error. Overall, the long short-term memory bias correction produced an error of 5% for O2 and 8.5% for CO2 when compared against independent membrane DO and laser spectrometer instruments. This represents a predictive accuracy of 92–95% for both gases. The most important factor in a skillful bias correction appears to be the measurement of the secondary environmental conditions that are likely to correlate with the instrument bias. These statistical learning methods are extremely flexible and permit the inclusion of a nearly unlimited number of correlates in finding the best bias correction solution.
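The general idea of correcting instrument bias from correlated environmental measurements can be illustrated with a minimal sketch. It is not the generalized additive model or the long short-term memory network compared in the paper: an ordinary least-squares fit stands in for both, and the data, variable names, and coefficients below are synthetic and invented purely for illustration.

```python
import numpy as np

def fit_bias_model(covariates, raw, reference):
    """Least-squares fit of the instrument bias (raw - reference) on covariates."""
    X = np.column_stack([np.ones(len(raw)), covariates])  # intercept + covariates
    coef, *_ = np.linalg.lstsq(X, raw - reference, rcond=None)
    return coef

def correct(covariates, raw, coef):
    """Subtract the bias predicted from the covariates from the raw signal."""
    X = np.column_stack([np.ones(len(raw)), covariates])
    return raw - X @ coef

def rmse_percent(estimate, reference):
    """Root mean squared error as a percentage of the mean reference value."""
    return 100.0 * np.sqrt(np.mean((estimate - reference) ** 2)) / np.mean(reference)

# Synthetic data (assumption): a sensor whose bias depends on temperature and pressure.
rng = np.random.default_rng(0)
n = 400
temperature = rng.uniform(5.0, 25.0, n)
pressure = rng.uniform(1.0, 3.0, n)
covariates = np.column_stack([temperature, pressure])
reference = rng.uniform(200.0, 300.0, n)  # independent reference measurement
raw = reference + 2.0 * (temperature - 15.0) - 8.0 * pressure + rng.normal(0.0, 3.0, n)

# Fit on the first 300 samples, evaluate on the held-out remainder.
train, test = slice(0, 300), slice(300, n)
coef = fit_bias_model(covariates[train], raw[train], reference[train])
corrected = correct(covariates[test], raw[test], coef)

rmse_raw = rmse_percent(raw[test], reference[test])
rmse_corrected = rmse_percent(corrected, reference[test])
print(f"RMSE before: {rmse_raw:.1f}%  after: {rmse_corrected:.1f}%")
```

A linear fit can only capture a bias that is linear in the covariates. A generalized additive model or a recurrent network would be used where the dependence is nonlinear or depends on the sensor's recent history.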
Award ID(s):
1744562
PAR ID:
10286624
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Frontiers in Earth Science
Volume:
8
ISSN:
2296-6463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background: Third-generation single molecule sequencing technologies can sequence long reads, which is advancing the frontiers of genomics research. However, their high error rates prohibit accurate and efficient downstream analysis. This difficulty has motivated the development of many long read error correction tools, which tackle this problem through sampling redundancy and/or leveraging accurate short reads of the same biological samples. Existing studies to assess these tools use simulated data sets, and are not sufficiently comprehensive in the range of software covered or diversity of evaluation measures used. Results: In this paper, we present a categorization and review of long read error correction methods, and provide a comprehensive evaluation of the corresponding long read error correction tools. Leveraging recent real sequencing data, we establish benchmark data sets and set up evaluation criteria for a comparative assessment which includes quality of error correction as well as run-time and memory usage. We study how trimming and long read sequencing depth affect error correction in terms of length distribution and genome coverage post-correction, and the impact of error correction performance on an important application of long reads, genome assembly. We provide guidelines for practitioners for choosing among the available error correction tools and identify directions for future research. Conclusions: Despite the high error rate of long reads, the state-of-the-art correction tools can achieve high correction quality. When short reads are available, the best hybrid methods outperform non-hybrid methods in terms of correction quality and computing resource usage. When choosing tools, practitioners should be careful with the few correction tools that discard reads, and should check the effect of error correction tools on downstream analysis. Our evaluation code is available as open-source at https://github.com/haowenz/LRECE.
  2. Abstract In shallow coastal systems, sediments are exposed to dramatic and complex variability in environmental conditions that influences sediment processes on short timescales. Sediment oxygen demand (SOD), or consumption of oxygen by sediment‐dwelling organisms and chemical reactions within sediments, is one such process and an important metric of aquatic ecosystem functioning and health. The most common instruments used to measure SOD in situ are batch‐style benthic chambers, which generally require long measurement periods to resolve fluxes and thus do not capture the high temporal variability in SOD that can be driven by dynamic coastal processes. These techniques also preclude linking changes in SOD through time to specific features of the sediment, for example, shifts in sediment faunal activities which can vary on short time scales and can also be affected by ambient oxygen concentrations. Here we present an in situ semi‐flow through instrument to repeatedly measure SOD in discrete areas of sediment. The system isolates patches of sediment in replicate benthic chambers, and measures and records oxygen decrease for a short time before refreshing the overlying water in the chamber with water from the external environment. This results in a sawtooth pattern in which each tooth is an incubation, providing an automated method to produce direct measurements of in situ SOD that can be directly linked to an area of sediment and related to rapid shifts in environmental conditions. 
  3. Abstract Sensors that use ultraviolet (UV) light absorption to measure nitrate in seawater at in situ temperatures require a correction to the calibration coefficients if the calibration and sample temperatures are not identical. This is mostly due to the bromide molecule, which absorbs more UV light as temperature increases. The current correction applied to in situ ultraviolet spectrophotometer (ISUS) and submersible ultraviolet nitrate analyzer (SUNA) nitrate sensors generally follows Sakamoto et al. (2009, Limnol. Oceanogr. Methods 7, 132–143). For waters warmer than the calibration temperature, this correction model can lead to a 1–2 μmol kg⁻¹ positive bias in nitrate concentration. Here we present an updated correction model, which reduces this small but noticeable bias by at least 50%. This improved model is based on additional laboratory data and describes the temperature correction as an exponential function of wavelength and temperature difference from the calibration temperature. It is a better fit to the experimental data than the current model and the improvement is validated using two populations of nitrate profiles from Biogeochemical Argo floats navigating through tropical waters. One population is from floats equipped with ISUS sensors while the other arises from floats with SUNA sensors on board. Although this model can be applied to both ISUS and SUNA nitrate sensors, it should not be used for OPUS UV nitrate sensors at this time. This new approach is similar to that used for OPUS sensors (Nehir et al., 2021, Front. Mar. Sci. 8, 663800) with differing model coefficients. This difference suggests that there is an instrumental component to the temperature correction or that there are slight differences in experimental methodologies.
  4. In this study, we predicted the log returns of the top 10 cryptocurrencies based on market cap, using univariate and multivariate machine learning methods such as recurrent neural networks, deep learning neural networks, Holt’s exponential smoothing, autoregressive integrated moving average, ForecastX, and long short-term memory networks. The multivariate long short-term memory networks performed better than the univariate machine learning methods in terms of the prediction error measures. 
  5. Abstract. Sediment–water oxygen fluxes are widely used as a proxy for organic carbon production and mineralization at the seafloor. In situ fluxes can be measured non-invasively with the aquatic eddy covariance technique, but a critical requirement is that the sensors of the instrument are able to correctly capture the high-frequency variations in dissolved oxygen concentration and vertical velocity. Even small changes in sensor characteristics during deployment, as caused e.g. by biofouling, can result in erroneous flux data. Here we present a dual-optode eddy covariance instrument (2OEC) with two fast oxygen fibre sensors and document how erroneous flux interpretations and data loss can effectively be reduced by this hardware and a new data analysis approach. With deployments over a carbonate sandy sediment in the Florida Keys and comparison with parallel benthic advection chamber incubations, we demonstrate the improved data quality and data reliability facilitated by the instrument and associated data processing. Short-term changes in flux that are dubious in measurements with single oxygen sensor instruments can be confirmed or rejected with the 2OEC and in our deployments provided new insights into the temporal dynamics of benthic oxygen flux in permeable carbonate sands. Under steady conditions, representative benthic flux data can be generated with the 2OEC within a couple of hours, making this technique suitable for mapping sediment–water, intra-water column, or atmosphere–water fluxes.