Article

Data-driven prediction of CIMS signal intensity for pesticide detection via dibromomethane reagent using machine learning techniques

Chaoqun Wen, Changde College
Shaxnoza Saydaxmetova, Chongqing Institute of Engineering, Computing and Internet of Things University, Chongqing, China
Suranjana V. Mayani, Marwadi University Research Center, Department of Chemistry, Faculty of Science, Marwadi University, Rajkot, Gujarat, India
Suhas Ballal, JAIN (Deemed to be University)
Harshit Gupta, Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
Subhashree Ray, Department of Biochemistry, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India
Aashna Sinha, Uttaranchal University
Vikasdeep Singh Mann, Department of Mechanical Engineering, Chandigarh Engineering College, Chandigarh Group of Colleges-Jhanjeri, Mohali, Punjab, India
Ahamad Alewedee, The Islamic University
Issa Mohammed Kadhim, Al-Nisour University College
Aseel Smerat, Al-Ahliyya Amman University
Zarghuna Hekmatyar, Nangarhar University

Journal: European Journal of Mass Spectrometry. Year: 2026. Language: English.

Abstract

This study presents a machine learning model for predicting Chemical Ionization Mass Spectrometry (CIMS) signal intensity in pesticide detection with dibromomethane (DBrMe) as the reagent. Accurate pesticide detection is crucial for agricultural safety and regulatory compliance. The model relates signal intensity to ten input features (molar mass, COO, N-O, N-N, N-S, C-C, S, Cl, P, and pesticide concentration in DBrMe, in ppm) using Decision Tree, AdaBoost, Random Forest, and Ensemble Learning algorithms. A dataset of 2460 samples was used for training and validation. Among the features, pesticide concentration had the strongest influence, followed by N-O, COO, and molar mass. SHAP analysis confirmed these trends, and a leverage-based method was used to identify and remove outliers, improving model reliability. Random Forest outperformed the other models, achieving the highest R² (0.401) and the lowest error, whereas Decision Tree and AdaBoost showed signs of overfitting. Sensitivity analysis showed that all variables contribute to the prediction, which the authors take as evidence of the model's robustness. The approach offers a cost-effective alternative to traditional experimental methods for estimating CIMS signal intensity across pesticides and conditions, supporting faster and more efficient chemical analysis in agricultural monitoring.
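The abstract does not say how the leverage-based outlier screen was carried out. As a minimal sketch of one common form of the technique, the code below computes the diagonal of the hat matrix H = X(XᵀX)⁻¹Xᵀ and flags samples whose leverage exceeds a warning threshold h* = 3(p + 1)/n. The threshold convention, the function names `leverage_values` and `warning_leverage`, and the synthetic data are all assumptions made for illustration; none of them come from the article.

```python
import numpy as np

def leverage_values(X):
    """Return the hat-matrix diagonal for X with an intercept column added."""
    X = np.asarray(X, dtype=float)
    Xd = np.column_stack([np.ones(len(X)), X])  # design matrix with intercept
    gram_inv = np.linalg.pinv(Xd.T @ Xd)        # pseudo-inverse tolerates collinear columns
    # h_i = x_i^T (X^T X)^-1 x_i for every row i
    return np.einsum("ij,jk,ik->i", Xd, gram_inv, Xd)

def warning_leverage(n_samples, n_features):
    """Assumed convention: h* = 3 (p + 1) / n."""
    return 3.0 * (n_features + 1) / n_samples

# Synthetic stand-in data (NOT the article's dataset): 200 samples, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[0] = 25.0  # plant one sample far outside the feature space of the rest

h = leverage_values(X)
h_star = warning_leverage(*X.shape)
flagged = np.where(h > h_star)[0]  # indices of high-leverage samples
```

In a full screen of this kind, leverage is usually paired with standardized residuals from a fitted model so that a sample can be judged both on its position in feature space and on how poorly it is predicted; this sketch covers only the leverage part.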

Topics

Pesticide Residue Analysis and Safety; Water Quality Monitoring and Analysis; Spectroscopy and Chemometric Analyses

Identifiers

DOI: 10.1177/14690667251411711

Citations and references

Cited by: 0. References: 27.