Title: SVR Chemometrics to Quantify β-Lactoglobulin and α-Lactalbumin in Milk Using MIR
Variation in milk protein content can affect the quality and consistency of dairy products, which creates a need for in-line, real-time monitoring. Here, we present a chemometric approach for the qualitative and quantitative monitoring of β-lactoglobulin and α-lactalbumin using mid-infrared (MIR) spectroscopy. We used Hotelling's T² and Q-residuals for outlier detection, automated preprocessing with nippy, selected wavenumbers with genetic algorithms, and evaluated four chemometric models (partial least squares, support vector regression (SVR), ridge regression, and logistic regression) for predicting the concentrations of β-lactoglobulin and α-lactalbumin in milk. For the quantitative analysis of these two whey proteins, SVR performed best at predicting protein concentration from 197 MIR spectra originating from 42 Cornell University samples of preserved pasteurized modified milk. The R² values obtained for β-lactoglobulin and α-lactalbumin using leave-one-out cross-validation (LOOCV) are 92.8% and 92.7%, respectively, which is the highest correlation reported to date. Our approach combines preprocessing automation, genetic-algorithm-based wavenumber selection, and hyperparameter tuning of the chemometric models with Optuna, resulting in the best chemometric analysis of MIR data to quantitate β-lactoglobulin and α-lactalbumin to date.
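The LOOCV R² reported above can be illustrated with a small sketch. It is not the authors' pipeline: a one-feature least-squares line stands in for SVR so the example needs only the standard library, and R² is computed as 1 - SS_res/SS_tot over the held-out predictions, which is one common definition and may differ from the one used in the paper. The function names `fit_line` and `loocv_r2` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (stand-in for SVR)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def loocv_r2(xs, ys):
    """Leave-one-out cross-validated R^2: each sample is predicted by a
    model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Illustrative, made-up data (not from the study).
absorbance = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
concentration = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(f"LOOCV R^2 = {loocv_r2(absorbance, concentration):.3f}")
```

Because every prediction comes from a model that never saw the held-out sample, this score is usually lower than the R² of a fit to all the data, which is why it serves as a more conservative estimate of predictive performance.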
Award ID(s):
2345069
PAR ID:
10553692
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Foods
Volume:
13
Issue:
1
ISSN:
2304-8158
Page Range / eLocation ID:
166
Subject(s) / Keyword(s):
chemometrics; support vector regression; partial least squares; mid-infrared spectroscopy; whey proteins; Kennard-Stone
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Background: Maternal anemia has adverse consequences for the mother-infant dyad. To evaluate whether and how milk nutrient content may change in ways that could “buffer” infants against the conditions underlying maternal anemia, this study assessed associations between milk macronutrients and maternal iron-deficiency anemia (IDA), non-iron-deficiency anemia (NIDA), and inflammation. Methods: A secondary analysis of cross-sectional data and milk from northern Kenya was conducted (n = 204). The combination of hemoglobin and transferrin receptor defined IDA/NIDA. Elevated serum C-reactive protein defined acute inflammation. The effects of IDA, NIDA, and inflammation on milk macronutrients were evaluated in regression models. Results: IDA (β = 0.077, p = .022) and NIDA (β = 0.083, p = .100) predicted higher total protein (ln). IDA (β = −0.293, p = .002), NIDA (β = −0.313, p = .047), and inflammation (β = −0.269, p = .007) each predicted lower fat (ln); however, anemia accompanying inflammation predicted higher fat (β = 0.655, p = .007 for IDA and β = 0.468, p = .092 for NIDA). NIDA predicted higher lactose (β = 1.020, p = .003). Conclusions: Milk macronutrient content both increases and decreases in the presence of maternal anemia and inflammation, suggesting a more complicated and dynamic change than simple impairment of nutrient delivery during maternal stress. Maternal fat delivery to milk may be impaired under anemia. Mothers may buffer infant nutrition against adverse conditions or poor maternal health by elevating milk protein (mothers with IDA/NIDA), lactose (mothers with NIDA), or fat (mothers with anemia and inflammation). This study demonstrates the foundational importance of maternal micronutrient health and inflammation or infection for advancing the ecological understanding of human milk nutrient variation.
  2. Roman Bartak and Fazel Keshtkar and Michael Franklin (Ed.)
    This paper presents a novel method to automatically assess self-explanations generated by students during code comprehension activities. The self-explanations are produced in the context of an online learning environment that asks students to freely explain Java code examples line-by-line. We explored a number of models consisting of textual features in conjunction with machine learning algorithms such as Support Vector Regression (SVR), Decision Trees (DT), and Random Forests (RF). Support Vector Regression (SVR) performed best having a correlation score with human judgments of 0.7088. The best model used a combination of features such as semantic measures obtained using a Sentence BERT pre-trained model and from previously developed semantic algorithms used in a state-of-the-art intelligent tutoring system. 
  3. Abstract. Objectives: This study explored differing levels of macronutrients in breast milk in relation to maternal anemia and hemoglobin. Methods: Archived milk specimens and data from a cross-sectional sample of 208 breastfeeding mothers in northern Kenya, originally collected in 2006, were analyzed; data included milk fat, maternal hemoglobin concentration, and anemia status (anemia defined as hemoglobin <12 g/dL). Total protein and lactose were measured and energy was calculated. To explore the association between milk outcomes (fat, protein, lactose, and energy) and anemia, regression models were constructed with and without adjustment for maternal age, parity, and time (days) postpartum. The same models were constructed using hemoglobin as a continuous predictor in lieu of dichotomous anemia to explore the role of hemoglobin levels and anemia severity in predicting milk outcomes. Results: The group comparison indicated significantly higher milk protein and lower milk fat for anemic mothers relative to nonanemic counterparts. After adjustment for maternal age, parity, and time postpartum, maternal anemia was associated with significantly higher milk protein (P = 0.001) and significantly lower milk fat (P = 0.025). Hemoglobin had a significant inverse relationship with milk protein (P = 0.017) and a marginally significant positive relationship with milk fat (P = 0.060) after adjusting for the maternal variables. Neither anemia nor hemoglobin was significant in predicting lactose or milk energy. Conclusions: Maternal anemia and hemoglobin concentration may be associated with complex changes in milk macronutrients. Future research should clarify the impact of maternal anemia on a range of breast milk components while accounting for other maternal characteristics.
  4. Bogomolov, Sergiy; Parker, David (Ed.)
    Resiliency is the ability to quickly recover from a violation and avoid future violations for as long as possible. Such a property is of fundamental importance for Cyber-Physical Systems (CPS), and yet, to date, there is no widely agreed-upon formal treatment of CPS resiliency. We present an STL-based framework for reasoning about resiliency in CPS in which resiliency has a syntactic characterization in the form of an STL-based Resiliency Specification (SRS). Given an arbitrary STL formula φ and time bounds α and β, the SRS of φ, R_{α,β}(φ), is the STL formula ¬φ U_{[0,α]} G_{[0,β)} φ, specifying that recovery from a violation of φ occur within time α (recoverability), and subsequently that φ be maintained for duration β (durability). These R-expressions, which are atoms in our SRS logic, can be combined using STL operators, allowing one to express composite resiliency specifications, e.g., multiple SRSs must hold simultaneously, or the system must eventually be resilient. We define a quantitative semantics for SRSs in the form of a Resilience Satisfaction Value (ReSV) function r and prove its soundness and completeness w.r.t. STL’s Boolean semantics. The r-value for R_{α,β}(φ) atoms is a singleton set containing a pair quantifying recoverability and durability. The r-value for a composite SRS formula results in a set of non-dominated recoverability-durability pairs, given that the ReSVs of subformulas might not be directly comparable (e.g., one subformula has superior durability but worse recoverability than another). To the best of our knowledge, this is the first multi-dimensional quantitative semantics for an STL-based logic. Two case studies demonstrate the practical utility of our approach. https://doi.org/10.1007/978-3-031-15839-1_7
  5. Abstract. Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates. However, efficient and appropriate selection of α can be challenging. This becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and correlations across predictors, it is also not straightforwardly interpretable. Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for analysis of large complex datasets.
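The resiliency specification in item 4, "not φ until, within α, φ holds globally for β", can be checked on a sampled Boolean trace. The sketch below is illustrative only and is not the paper's implementation or its quantitative ReSV semantics. It assumes integer time steps, a trace given as a list of truth values of φ, an until operator that requires ¬φ at every step strictly before the recovery point, and that a trace too short to confirm durability counts as a failure. The names `holds_globally` and `is_resilient` are hypothetical.

```python
def holds_globally(trace, start, beta):
    """G_[0,beta) phi at `start`: phi is true at steps start .. start+beta-1."""
    end = start + beta
    if end > len(trace):
        # Assumption: durability that cannot be confirmed on the trace fails.
        return False
    return all(trace[start:end])


def is_resilient(trace, t, alpha, beta):
    """Boolean check of (not phi) U_[0,alpha] G_[0,beta) phi at step t."""
    last = min(t + alpha, len(trace) - 1)
    for recovery in range(t, last + 1):
        if holds_globally(trace, recovery, beta):
            return True  # recovered within alpha and stayed good for beta
        if trace[recovery]:
            # phi is true here but not durable, so "not phi" fails before
            # any later recovery point and the until can no longer hold.
            return False
    return False


# phi is violated for two steps, then holds for three steps.
trace = [False, False, True, True, True, False]
print(is_resilient(trace, t=0, alpha=2, beta=3))  # recovery at step 2
print(is_resilient(trace, t=0, alpha=1, beta=3))  # recovery comes too late
```

The two calls differ only in the recovery deadline α, which shows how the same trace can satisfy one resiliency requirement and violate a stricter one.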
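The reparameterization in item 5 can be sketched from a standard property of ridge regression: in the basis given by the singular value decomposition of the design matrix, each ridge coefficient equals the corresponding unregularized coefficient scaled by s²/(s² + α), where s is the singular value. The fraction γ is therefore a decreasing function of α and can be inverted numerically. The sketch below uses bisection, which is a stand-in chosen for brevity and is not claimed to be the algorithm in the fracridge software. It assumes the singular values (all positive) and the rotated unregularized coefficients are already available. The function names are hypothetical.

```python
import math


def shrinkage_fraction(alpha, singular_values, ols_coefs):
    """gamma(alpha) = ||ridge coefficients|| / ||unregularized coefficients||,
    computed in the SVD-rotated basis."""
    num = sum(((s * s / (s * s + alpha)) * b) ** 2
              for s, b in zip(singular_values, ols_coefs))
    den = sum(b * b for b in ols_coefs)
    return math.sqrt(num / den)


def alpha_for_fraction(target, singular_values, ols_coefs, tol=1e-10):
    """Find the alpha whose ridge solution has the requested fraction gamma."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target fraction must lie in (0, 1]")
    if target == 1.0:
        return 0.0  # no shrinkage: the unregularized solution
    lo, hi = 0.0, 1.0
    # Grow the upper bound until it shrinks the coefficients enough.
    while shrinkage_fraction(hi, singular_values, ols_coefs) > target:
        hi *= 2.0
    # Bisection: gamma decreases monotonically as alpha increases.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if shrinkage_fraction(mid, singular_values, ols_coefs) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


# Illustrative values only.
s = [2.0, 1.0]
b = [1.0, 1.0]
alpha = alpha_for_fraction(0.5, s, b)
print(alpha, shrinkage_fraction(alpha, s, b))
```

Asking for a fraction such as γ = 0.5 gives a request that means the same thing across datasets, whereas the α that achieves it depends on the scale of the data and the singular values.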