Article

A robust machine learning framework for predicting CO2 solubility in polyethylene glycol using advanced optimization strategies

Fadhel F. Sead, Department of Dentistry, College of Dentistry, The Islamic University, Najaf, Iraq
Dharmesh Sur, Marwadi University Research Center, Department of Chemical Engineering, Faculty of Engineering & Technology, Marwadi University, Rajkot-360003, Gujarat, India
Anupam Yadav, Department of Computer engineering and Application, GLA University Mathura-281406, India
Suhas Ballal, Department of Chemistry and Biochemistry, School of Sciences, JAIN (Deemed to be University), Bangalore, Karnataka, India
Abhayveer Singh, Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, 140401, Punjab, India
T. Krithiga, Department of CHEMISTRY, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
Satvik Vats, Department of Computer Science and Engineering, Graphic Era Hill University, Dehradun, India
Farrukh Yuldashev, Department of Informatics and Its Teaching Methods, Tashkent State Pedagogical University, Tashkent, Uzbekistan
Irfan Ahmad, Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Khalid University, Abha, Saudi Arabia
Samim Sherzod, Faculty of Engineering, Nangarhar University, Nangarhar, Afghanistan

Abstract

Accurate estimation of carbon dioxide (CO2) solubility in polyethylene glycol (PEG) is essential for optimizing carbon capture and storage processes, yet the complex thermodynamic behavior of these systems challenges conventional modeling approaches. This study develops a robust intelligent framework by systematically evaluating Gradient Boosting Decision Trees (GBDT) coupled with four hyperparameter optimization algorithms: Bayesian Batch Optimizer (BBO), Bayesian Probability Improvement (BPI), Gaussian Process Optimizer (GPO), and Evolutionary Strategies (ES). A dataset of 161 experimental data points spanning diverse temperature, pressure, and PEG molar mass conditions was curated and screened for outliers using the Hat matrix method. Generalization was assessed with 5-fold cross-validation, using metrics such as the coefficient of determination (R²) and mean squared error (MSE). The comparison showed that the evolutionary and batch-based strategies achieved near-perfect training accuracy but overfit significantly. In contrast, the GBDT-GPO hybrid model was the most robust, achieving the highest testing R² (0.849) and the lowest average absolute relative error (15.74%), effectively balancing bias and variance. SHAP analysis further showed that pressure is the dominant governing factor, followed by molar mass and temperature, consistent with Henry's law and thermodynamic principles. The proposed GPO-optimized framework offers a reliable computational tool for predicting gas solubility in polymer solvents.
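
As a reading aid, the evaluation metrics named in the abstract (R², MSE, and average absolute relative error) and a 5-fold split can be sketched in plain Python. This is a minimal sketch under assumptions, not the authors' code: the abstract does not give the formulas, so the textbook definitions are used; the function names are hypothetical; the fold splitter is contiguous and unshuffled; and the sample values are made up for illustration.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def aare_percent(y_true, y_pred):
    """Average absolute relative error in percent (assumed definition)."""
    return 100.0 * mean(abs((p - t) / t) for t, p in zip(y_true, y_pred))


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Made-up values for illustration only; not data from the study.
y_exp = [0.10, 0.20, 0.40, 0.50]
y_hat = [0.11, 0.18, 0.44, 0.45]
print(mse(y_exp, y_hat), r2(y_exp, y_hat), aare_percent(y_exp, y_hat))
print([len(test) for _, test in k_fold_indices(161, k=5)])
```

With 161 samples and five folds, this splitter yields one test fold of 33 points and four of 32. Because the relative error divides by the experimental value, points with very low solubility can dominate the average, which is one reason a model can show a moderate R² alongside a double-digit relative error.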
