Large-scale comparative review and assessment of computational methods for phage virion proteins identification

Authors

  • Muhammad Kabir School of Systems and Technology, Department of Computer Science, University of Management and Technology, Lahore, Pakistan, 54770 https://orcid.org/0000-0002-2488-1653
  • Chanin Nantasenamat Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700 https://orcid.org/0000-0003-1040-663X
  • Sakawrat Kanthawong Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand, 40002 https://orcid.org/0000-0003-4068-3646
  • Phasit Charoenkwan Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, Thailand, 50200 https://orcid.org/0000-0002-8161-6856
  • Watshara Shoombuatong Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700. Phone: +66 2 441 4371; Fax: +66 2 441 4380; E-mail: watshara.sho@mahidol.ac.th https://orcid.org/0000-0002-3394-8709

DOI:

https://doi.org/10.17179/excli2021-4411

Keywords:

phage virion protein, bioinformatics, classification, machine learning, feature representation, feature selection

Abstract

Phage virion proteins (PVPs) are effective at recognizing and binding to host cell receptors while having no deleterious effects on human or animal cells. Understanding their functional mechanisms is regarded as a critical goal that will aid rational antibacterial drug discovery and development. Although high-throughput experimental methods for identifying PVPs are considered the gold standard for exploring crucial PVP features, these procedures are frequently time-consuming and labor-intensive. Thus far, more than ten sequence-based predictors have been established for the in silico identification of PVPs in conjunction with traditional experimental approaches. As a result, a revised and more thorough assessment is highly desirable. With this purpose in mind, we first survey and evaluate 13 state-of-the-art PVP predictors. These predictors can be classified into three groups according to the type of machine learning (ML) algorithm employed (i.e., traditional ML-based methods, ensemble-based methods and deep learning-based methods). We then examine which factors are important for building more accurate and stable predictors, including training/independent datasets, feature encoding algorithms, feature selection methods, core algorithms, performance evaluation metrics/strategies and web servers. Finally, we provide insights and future perspectives for the design and development of new and more effective computational approaches for the detection and characterization of PVPs.

Published

2022-01-03

How to Cite

Kabir, M., Nantasenamat, C., Kanthawong, S., Charoenkwan, P., & Shoombuatong, W. (2022). Large-scale comparative review and assessment of computational methods for phage virion proteins identification. EXCLI Journal, 21, 11–29. https://doi.org/10.17179/excli2021-4411

Section

Review articles
