Empirical comparison and analysis of machine learning-based approaches for druggable protein identification

Authors

  • Watshara Shoombuatong Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700, Phone: +66 2 441 4371; Fax: +66 2 441 4380; E-mail: watshara.sho@mahidol.ac.th https://orcid.org/0000-0002-3394-8709
  • Nalini Schaduangrat Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700 https://orcid.org/0000-0002-0842-8277
  • Jaru Nikom Research Methodology and Data Analytics Program, Faculty of Science & Technology, Prince of Songkla University, Pattani, Thailand, 94000 https://orcid.org/0009-0002-8147-6782

DOI:

https://doi.org/10.17179/excli2023-6410

Keywords:

druggable proteins, sequence analysis, bioinformatics, machine learning, deep learning, ensemble learning

Abstract

Efficient and precise identification of drug targets is crucial for drug discovery and development. Conventional experimental approaches can pinpoint these targets accurately, but they are time-consuming and not easily adapted to high-throughput settings. Computational approaches, particularly those based on machine learning (ML), offer an efficient way to accelerate the prediction of druggable proteins from their primary sequences alone. Several state-of-the-art computational methods have recently been developed for predicting and analyzing druggable proteins. These methods differ widely in their benchmark datasets, feature extraction schemes, ML algorithms, evaluation strategies, and webserver/software usability. Our objective is therefore to reexamine these approaches and assess their strengths and weaknesses across these aspects. In this study, we deliver the first comprehensive survey of state-of-the-art computational approaches for in silico prediction of druggable proteins. First, we describe the existing benchmark datasets and the types of ML methods employed. Second, we investigate the effectiveness of these methods in druggable protein identification on each benchmark dataset. Third, we summarize the important features used in this field and the existing webservers and software. Finally, we discuss the current limitations of the existing methods and offer guidance to the scientific community on designing and developing novel prediction models. We anticipate that this review will provide crucial information for the development of more accurate and efficient druggable protein predictors.

Published

2023-08-29

How to Cite

Shoombuatong, W., Schaduangrat, N., & Nikom, J. (2023). Empirical comparison and analysis of machine learning-based approaches for druggable protein identification. EXCLI Journal, 22, 915–927. https://doi.org/10.17179/excli2023-6410

Section

Review articles
