Accurate Data-Driven Frameworks to Estimate Solubility of Ammonia in Ionic Liquids
Abstract
The goal of modeling NH3 solubility in ionic liquids (ILs) is to develop reliable algorithms that accurately capture the complex relationships between the key input parameters (temperature, pressure, and chemical descriptors) and the target variable, NH3 solubility. The solubility depends on temperature, pressure, and the chemical properties of the IL, represented here by the descriptors S1 to S6. This study builds a range of data-driven models using machine learning methods: linear regression, ridge regression, lasso regression, elastic net, support vector machines (SVMs), k-nearest neighbors (KNN), decision trees (DTs), random forests (RFs), gradient boosting machines (GBMs), extreme gradient boosting (XGBoost), light gradient boosting machines (LightGBM), categorical boosting (CatBoost), Gaussian processes, artificial neural networks (ANNs), and convolutional neural networks (CNNs). Model performance is evaluated with several statistical metrics and graphical methods. A Monte Carlo-based outlier detection algorithm confirms that the majority of the dataset, which consists of 785 data points, is suitable for model development. The results show that RF, ANN, CatBoost, LightGBM, and XGBoost are the most robust and accurate models for predicting NH3 solubility, with the lowest error metrics and the highest R-squared values. Sensitivity analysis further shows that every input variable is related to the target variable. Finally, a game-theoretic analysis using SHAP (SHapley Additive exPlanations) indicates that pressure, temperature, and S1 are the most influential factors for NH3 solubility in ILs.
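The error metrics and R-squared values used to rank the models can be illustrated with a minimal sketch. The abstract does not list the exact metrics, so the choice of RMSE and MAE alongside R-squared is an assumption, the function name `regression_metrics` is hypothetical, and the sample values are invented for illustration and are not taken from the 785-point dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R-squared for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical solubility values, for illustration only (not from the study).
observed = [0.10, 0.25, 0.40, 0.55, 0.70]
predicted = [0.12, 0.24, 0.38, 0.57, 0.69]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under these metrics, a model ranks higher when its RMSE and MAE are lower and its R-squared is closer to 1, which is the sense in which the abstract calls the best models the most accurate.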