TY - JOUR
T1 - Accurate & simple open-sourced no-code machine learning and CDFT predictive models for the antioxidant activity of phenols
AU - Halabi Diaz, Andrés
AU - Galdames, Franco
AU - Velásquez, Patricia
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/9
Y1 - 2024/9
N2 - Phenolic compounds (PC) are important antioxidant biomolecules for the medical and food industries. The DPPH test is used to measure antioxidant capacity. A fully No-Code methodology is presented for building QSPR models for the anti-DPPH activity of 202 PC. Machine learning (ML) algorithms were used for dimensionality reduction (PCA, InfoGain, GainRatio, CfsSubset) and predictive model training (J48, RandomTree, JCHAID*). Conceptual Density Functional Theory (CDFT) descriptors are calculated at the GFN1-xTB and GFN2-xTB levels of theory and the resulting global reactivity descriptors are used to train the ML models. The obtained Decision Tree (DT) models all present over 85% accuracy and Substantial Agreement with Reality, both for the internal and external validation. All the developed models adhere to the OECD guidelines for regulatory QSPR developments and are discussed in a mechanistic context. This research presents a novel, simple and codeless methodology for developing highly precise predictive models for the anti-DPPH activity of PC, successfully bridging the gap between experimental chemistry, theoretical physical chemistry, and ML.
AB - Phenolic compounds (PC) are important antioxidant biomolecules for the medical and food industries. The DPPH test is used to measure antioxidant capacity. A fully No-Code methodology is presented for building QSPR models for the anti-DPPH activity of 202 PC. Machine learning (ML) algorithms were used for dimensionality reduction (PCA, InfoGain, GainRatio, CfsSubset) and predictive model training (J48, RandomTree, JCHAID*). Conceptual Density Functional Theory (CDFT) descriptors are calculated at the GFN1-xTB and GFN2-xTB levels of theory and the resulting global reactivity descriptors are used to train the ML models. The obtained Decision Tree (DT) models all present over 85% accuracy and Substantial Agreement with Reality, both for the internal and external validation. All the developed models adhere to the OECD guidelines for regulatory QSPR developments and are discussed in a mechanistic context. This research presents a novel, simple and codeless methodology for developing highly precise predictive models for the anti-DPPH activity of PC, successfully bridging the gap between experimental chemistry, theoretical physical chemistry, and ML.
KW - Antioxidant mechanism
KW - Artificial intelligence
KW - CDFT
KW - DPPH
KW - ML
KW - Phenolic compounds
KW - XAI
UR - http://www.scopus.com/inward/record.url?scp=85199519968&partnerID=8YFLogxK
U2 - 10.1016/j.comptc.2024.114782
DO - 10.1016/j.comptc.2024.114782
M3 - Article
AN - SCOPUS:85199519968
SN - 2210-271X
VL - 1239
JO - Computational and Theoretical Chemistry
JF - Computational and Theoretical Chemistry
M1 - 114782
ER -