Contamination by per- and polyfluoroalkyl substances (PFAS) poses a significant environmental and public health challenge because these compounds are ubiquitous. Adsorption has emerged as a promising remediation technique, yet optimizing adsorption efficiency remains complex due to the diverse physicochemical properties of PFAS and the wide range of adsorbent materials. Traditional modeling approaches, such as response surface methodology (RSM), struggle to capture nonlinear interactions, while standalone machine learning (ML) models require extensive datasets. This study addressed these limitations by developing hybrid RSM-ML models to improve the prediction and optimization of PFAS adsorption. A comprehensive dataset was constructed from experimental adsorption data, integrating key parameters such as pH, pHpzc, surface area, temperature, and PFAS molecular properties. RSM was employed to model adsorption behavior, while gradient boosting (GB), random forest (RF), and extreme gradient boosting (XGB) were used to enhance predictive performance. Four types of hybrid model (linear, RMSE-based, multiplicative, and meta-learning) were developed and evaluated. The meta-learning HOP-RSM-GB model achieved near-perfect accuracy (R² = 1.00, RMSE = 10.59), outperforming all other models. Surface plots revealed that low pH and high pHpzc maximized adsorption, while increasing log Kow consistently enhanced PFAS adsorption. These findings establish hybrid RSM-ML modeling as a powerful framework for optimizing PFAS remediation strategies. The integration of statistical and machine learning approaches significantly improves predictive accuracy, reduces experimental costs, and provides deeper insights into adsorption mechanisms. This study underscores the importance of data-driven approaches in environmental engineering and highlights future opportunities for integrating ML-driven modeling with experimental adsorption research.
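As an illustration of what an RMSE-based hybrid might look like, the sketch below blends the predictions of two models with weights inversely proportional to each model's RMSE. This is an assumption about the general idea, not the scheme used in the study: the abstract does not specify the weighting formula, and the function names, model labels, and data values here are all hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def inverse_rmse_weights(y_true, preds_by_model):
    """Assumed weighting rule: weight each model by 1/RMSE, normalized to sum to 1."""
    inverse = {
        name: 1.0 / max(rmse(y_true, preds), 1e-12)  # guard against a zero RMSE
        for name, preds in preds_by_model.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(preds_by_model, weights):
    """Weighted average of the per-model predictions, point by point."""
    n_points = len(next(iter(preds_by_model.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in preds_by_model.items())
        for i in range(n_points)
    ]


# Hypothetical adsorption values, for illustration only.
observed = [10.0, 20.0, 30.0]
predictions = {
    "rsm": [12.0, 22.0, 32.0],  # RMSE = 2
    "gb": [11.0, 19.0, 31.0],   # RMSE = 1
}

weights = inverse_rmse_weights(observed, predictions)
hybrid = combine(predictions, weights)
```

In practice the weights would be estimated on a validation split rather than on the data used for the final evaluation, so that the blend is not tuned to the points it is scored on.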
Data Driven Machine Learning for Estimation of PFAS Partitioning on Various Surface Materials
In this study, a data-driven machine learning model was developed to estimate the partitioning of per- and polyfluoroalkyl substances (PFAS) during aqueous adsorption on various adsorbent materials, with a view to potentially replacing time-consuming and labor-intensive adsorption experiments. Various regression models were trained and tested using previously published data. A total of 290 data points for activated carbon and 170 data points for mineral adsorbents were mined for training the models, and 10 data points were used to test the trained models. Statistical parameters, such as root-mean-square error (RMSE), R-squared, mean absolute error (MAE), and mean squared error (MSE), were used to compare the regression models. The rational quadratic Gaussian process regression (GPR) model (R-squared = 0.9966) and the fine regression tree model (R-squared = 0.9427) had the highest estimation accuracy for carbon-based and mineral-based adsorbents, respectively. These models were then validated for prediction accuracy using 10 data points from previous studies as an outer test set. The rational quadratic GPR achieved 99% prediction accuracy for carbon-based adsorbents, while the fine tree regression model achieved 94% prediction accuracy. Despite such high estimation accuracy, the data mining process revealed a shortage of data and the need for more research on PFAS adsorption before real-world models can be presented. This study is among the first to shed light on the determination of key parameters in aquatic chemistry using data mining and machine learning approaches.
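For readers unfamiliar with the comparison metrics named in the abstract above, the following minimal sketch computes MSE, RMSE, MAE, and R-squared from paired observed and predicted values using their standard definitions. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        # R-squared is undefined when the observations have no variance.
        "R2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
```

With a test set as small as 10 points, any of these metrics will carry wide uncertainty, which is worth keeping in mind when comparing models by R-squared alone.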
- Award ID(s):
- 2216148
- PAR ID:
- 10447093
- Date Published:
- Journal Name:
- AEESP Research and Education Conference 2023
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
With drinking water regulations forthcoming for per- and polyfluoroalkyl substances (PFAS), the need for cost-effective treatment technologies has become urgent. Adsorption is a key process for removing or concentrating PFAS from water; however, conventional adsorbents operated in packed beds suffer from mass transfer limitations. The objective of this study was to assess the mass transfer performance of a porous polyamide adsorptive membrane for removing PFAS from drinking water under varying conditions. We conducted batch equilibrium and dynamic adsorption experiments for perfluorooctanesulfonic acid, perfluorooctanoic acid, perfluorobutanesulfonic acid, and undecafluoro-2-methyl-3-oxahexanoic acid (i.e., GenX). We assessed various operating and water quality parameters, including flow rate (pore velocity), pH, ionic strength (IS), and presence of dissolved organic carbon. Outcomes revealed that the porous adsorptive membrane was a mass transfer-efficient platform capable of achieving dynamic capacities similar to equilibrium capacities at fast interstitial velocities. The adsorption mechanism of PFAS to the membrane was a mixture of electrostatic and hydrophobic interactions, with pH and IS controlling which interaction was dominant. The adsorption capacity of the membrane was limited by its surface area, but its site density was approximately five times higher than that of granular activated carbon. With advances in molecular engineering to increase the capacity, porous adsorptive membranes are well suited as alternative adsorbent platforms for removing PFAS from drinking water.
-
High-throughput molecular simulations and machine learning (ML) have been implemented to adequately screen a large number of metal-organic frameworks (MOFs) for applications involving adsorption. Grand canonical Monte Carlo (GCMC) simulations have proven effective in calculating the adsorption capacity at given pressures and temperatures, but they can require expensive computational resources. While they can be resource-efficient, ML models can require large datasets, creating a need for algorithms that can efficiently characterize adsorption; active learning (AL) can play a very important role in this regard. In this work, we make use of Gaussian process regression (GPR) to model pure component adsorption of nitrogen at 77 K from 10⁻⁵ to 1 bar, methane at 298 K from 10⁻⁵ to 100 bar, carbon dioxide at 298 K from 10⁻⁵ to 100 bar, and hydrogen at 77 K from 10⁻⁵ to 100 bar on PCN-61, MgMOF-74, DUT-32, DUT-49, MOF-177, NU-800, UiO-66, ZIF-8, IRMOF-1, IRMOF-10, and IRMOF-16. The GPR model requires an initial training of the model with an initial dataset, the prior one, and, in this study of evaluating AL, we make use of three different prior selection schemes. Each prior scheme is updated with a sampling point resulting from the GP model uncertainties. This protocol continues until a maximum GPR relative error of 2% is attained. We make a recommendation on the best prior selection scheme for the total 44 adsorbate-adsorbent pairs primarily making use of the mean absolute error and the total amount of points required for convergence of the model. To further evaluate the AL framework, we apply the BET consistency criteria on the simulated and GP nitrogen isotherms and compare the resulting surface areas.
-
An intelligent sensing framework using Machine Learning (ML) and Deep Learning (DL) architectures to precisely quantify dielectrophoretic force invoked on microparticles in a textile electrode-based DEP sensing device is reported. The prediction accuracy and generalization ability of the framework were validated using experimental results. Images of pearl chain alignment at varying input voltages were used to build deep regression models using modified ML and CNN architectures that can correlate pearl chain alignment patterns of Saccharomyces cerevisiae (yeast) cells and polystyrene microbeads to DEP force. Various ML models such as K-Nearest Neighbor, Support Vector Machine, Random Forest, Neural Networks, and Linear Regression along with DL models such as Convolutional Neural Network (CNN) architectures of AlexNet, ResNet-50, MobileNetV2, and GoogLeNet have been analyzed in order to build an effective regression framework to estimate the force induced on yeast cells and microbeads. The efficiencies of the models were evaluated using Mean Absolute Error, Mean Absolute Relative, Mean Squared Error, R-squared, and Root Mean Square Error (RMSE) as evaluation metrics. ResNet-50 with RMSPROP gave the best performance, with a validation RMSE of 0.0918 on yeast cells, while AlexNet with the ADAM optimizer gave the best performance, with a validation RMSE of 0.1745 on microbeads. This provides a baseline for further studies in the application of deep learning in DEP-aided Lab-on-Chip devices.
-
Against the backdrop of the ever-evolving IT industry, this comparative study explores the differences among various project management methods, highlighting key distinctions between Agile and traditional approaches by evaluating the benefits of Agile and the drawbacks of not adopting Agile methods. Agile practices have gained recognition for their adaptability and efficiency in addressing dynamic industry demands. Our multifaceted approach, which examines the pros and cons of Agile methodologies across various industries, employs different machine learning algorithms: logistic regression, linear regression, and decision tree regressor. The study quantitatively measures Agile’s impact compared to other methodologies using prediction probabilities, classifications, confusion metrics, R-squared, and Mean Squared Error (MSE) for performance analysis. Results highlight that linear regression outperforms other models with 71% accuracy and 82% precision. These findings offer valuable insights into understanding Agile’s impact on IT industries, encouraging further exploration and refinements to make informed decisions on project management strategies and fostering future research to enhance IT project success rates.

