Machine Learning Techniques for Protein Structure Prediction in Bioinformatics
Abstract
This work analyses the use of machine learning methods for protein structure prediction in bioinformatics. The proposed methodology comprises two steps: acquiring and cleaning the relevant protein sequence datasets, and building and training deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). Adding data augmentation and transfer learning raises the performance of the models to a mean accuracy of 87%. In the predictor evaluation, the model obtained an accuracy of 6% and a Matthews correlation coefficient (MCC) of 0.74. The generality and efficiency of the proposed model are tested and compared against results on other, similar standard datasets. The results show that the model is more reliable than traditional machine learning algorithms, especially in classifying the secondary and tertiary structure of varied protein types. The research demonstrates the applicability and feasibility of large-scale, cost-efficient approaches to protein structure prediction using machine learning, while emphasizing the drawbacks of experimental tools. Recommendations for future work are to improve the interpretability of the resulting models, to better integrate hybrid schemes that combine learned models with physical models, and to use more extensive databases to improve accuracy. This research highlights the profound impact that machine learning can have in bioinformatics as a tool for deepening knowledge of protein structures and for advancing the understanding of life processes and drug design.
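The two evaluation metrics named in the abstract, accuracy and the Matthews correlation coefficient, can be illustrated with a minimal sketch. It assumes a binary, per-residue labelling (for example 1 = helix, 0 = non-helix); the abstract does not state how the metrics were computed or whether a multiclass variant was used, so the function names `binary_mcc` and `accuracy` and the toy labels below are hypothetical and are not taken from the study.

```python
import math


def binary_mcc(y_true, y_pred):
    """Matthews correlation coefficient for two equal-length 0/1 label sequences.

    Assumption: binary labels only; this is an illustrative sketch, not the
    evaluation code of the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("label sequences must not be empty")
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy labels for eight residues (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_true, toy_pred))
    print("MCC:", binary_mcc(toy_true, toy_pred))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class dominates, which is common for per-residue structure labels.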