Machine learning approaches to study the structure-activity relationships of LpxC inhibitors

Authors

  • Tianshi Yu Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand https://orcid.org/0000-0003-2381-2856
  • Li Chuin Chong Beykoz Institute of Life Sciences and Biotechnology, Bezmialem Vakif University, Beykoz, Istanbul, Türkiye https://orcid.org/0000-0002-3574-1365
  • Chanin Nantasenamat Streamlit Open Source, Snowflake Inc., San Mateo, California 94402, United States https://orcid.org/0000-0003-1040-663X
  • Nuttapat Anuwongcharoen Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand https://orcid.org/0000-0001-5361-5314
  • Theeraphon Piacham Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, Phone: +66 2 441 4371; Fax: +66 2 441 4380, E-mail: theeraphon.pia@mahidol.ac.th https://orcid.org/0000-0001-8975-9520

DOI:

https://doi.org/10.17179/excli2023-6356

Keywords:

antimicrobial resistance, LpxC, QSAR, machine learning, cheminformatics, activity cliff, chemotype

Abstract

Antimicrobial resistance (AMR) has emerged as one of the major global threats to human health in the 21st century. Discovering inhibitors against novel bacterial targets, rather than conventional ones, is considered a necessary strategy for countering the growing threat of AMR infections. In this study, we applied quantitative structure-activity relationship (QSAR) modeling to LpxC inhibitors to predict their inhibitory activity. In addition, we performed several cheminformatics analyses: exploration of the chemical space, identification of chemotypes, analysis of the structure-activity landscape and activity cliffs, and construction of a Structure-Activity Similarity (SAS) map. We built a total of 24 QSAR classification models by combining two fingerprint types (PubChem and MACCS) with 12 machine learning algorithms. The best model with the PubChem fingerprint was the Extreme Gradient Boosting model (accuracy on the training set: 0.937; accuracy on the 10-fold cross-validation set: 0.795; accuracy on the test set: 0.799). The best model with the MACCS fingerprint was the Random Forest model (accuracy on the training set: 0.955; accuracy on the 10-fold cross-validation set: 0.803; accuracy on the test set: 0.785). We also identified eight consensus activity cliff generators that are highly informative for further SAR investigations. We hope that the findings presented here can guide further lead optimization of LpxC inhibitors.
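
The SAS map and activity cliff analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fingerprints are toy sets of on-bit indices, the activity values are made-up pIC50-like numbers, and the cutoffs (`sim_cutoff=0.8`, `diff_cutoff=2.0`) are assumptions chosen only for illustration. Each compound pair is placed on the map by its Tanimoto similarity and absolute activity difference, and pairs that are highly similar yet differ strongly in activity are flagged as activity cliffs.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def sas_pairs(compounds):
    """Yield (id_i, id_j, similarity, activity difference) for every compound pair.

    These are the two coordinates of a point on a SAS map.
    """
    for (id_i, fp_i, act_i), (id_j, fp_j, act_j) in combinations(compounds, 2):
        yield id_i, id_j, tanimoto(fp_i, fp_j), abs(act_i - act_j)


def activity_cliffs(compounds, sim_cutoff=0.8, diff_cutoff=2.0):
    """Return pairs with high structural similarity but a large activity gap.

    Both cutoffs are illustrative assumptions, not values from the study.
    """
    return [
        (i, j)
        for i, j, sim, diff in sas_pairs(compounds)
        if sim >= sim_cutoff and diff >= diff_cutoff
    ]


# Hypothetical toy data: (compound id, fingerprint on-bits, pIC50-like activity)
compounds = [
    ("A", frozenset({1, 2, 3, 4, 5}), 8.5),
    ("B", frozenset({1, 2, 3, 4, 5, 6}), 5.9),
    ("C", frozenset({7, 8, 9}), 6.0),
]

cliffs = activity_cliffs(compounds)
print(cliffs)
```

In this toy set, only the pair A and B is both similar enough and different enough in activity to be flagged. A compound that recurs across many such pairs is what is usually called an activity cliff generator.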

Published

2023-09-05

How to Cite

Yu, T., Chong, L. C., Nantasenamat, C., Anuwongcharoen, N., & Piacham, T. (2023). Machine learning approaches to study the structure-activity relationships of LpxC inhibitors. EXCLI Journal, 22, 975–991. https://doi.org/10.17179/excli2023-6356

Section

Original articles
