Abstract
Predicting the efficiency of catalytic reactions is one of the most intriguing and challenging applications of machine learning (ML) in chemistry. In this study, we demonstrate a strategy that applies ML protocols to Quantum Theory of Atoms in Molecules (QTAIM) parameters to predict the ability of the A17 L47K catalytic antibody to covalently capture organophosphate pesticides. We found that the novel “composite” DFT functional B97-3c can be used for fast and accurate initial geometry optimization, which makes it well suited to creating the input dataset. QTAIM descriptors proved well suited to describing the examined dataset with density-based and hierarchical clustering algorithms, and the resulting clusters correlated with the chemical classes of the input compounds. Because QTAIM properties have a precise physical interpretation, the impact of individual features is straightforward to explain in both supervised and unsupervised ML protocols; this interpretability can also accelerate the search for entries with desired properties in large databases. Furthermore, Ridge Regression with a Laplacian kernel and the CatBoost Regressor performed well on small datasets with non-trivial dependencies and predicted the actual reaction barrier values with high accuracy. Additionally, the CatBoost Classifier reliably discriminated between “active” and “inactive” compounds.