Volume 429 - The 6th International Workshop on Deep Learning in Computational Physics (DLCP2022) - Track3. Machine Learning in Natural Sciences
Short-length peptides contact map prediction using Convolution Neural Networks
A.D. Maminov
Pre-published on: November 14, 2022
Abstract
This article considers an approach to predicting the contact matrix (contact map) of short peptides. A contact matrix is a two-dimensional representation of a protein. It can be used to reconstruct the tertiary structure or as a starting approximation in energy minimization models. For this work, peptides with chain lengths from 15 to 30 were chosen to test the model and simplify the calculations. Convolutional neural networks (CNNs) were used as the prediction tool because the feature space of each peptide is represented as a two-dimensional matrix. The SCRATCH tool was used to generate the secondary structure, solvent accessibility, and profile matrix (PSSM) for each peptide. The CNN was implemented in Python using the Keras library. The BioPython module was used to work with the common PDB format, which stores protein structure information. Training, validation and test samples were generated, and a multilayer multi-output convolutional neural network was constructed, trained and validated. Experiments were conducted on the test sample to predict the contact matrix and compare it with the native one. To assess prediction quality, conjunction matrices were formed for thresholds of 8 and 12 Å, and the F1-score, recall and precision metrics were calculated. The F1-score shows that even a small neural network can achieve quite good results. In the final step, the FT-COMAR tool was used to reconstruct the tertiary structure of the proteins from their contact matrices. The results show that the RMSD is better for structures reconstructed from the contact matrix with the 12 Å threshold.
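As an illustration of the evaluation described in the abstract, the following minimal sketch builds a binary contact matrix from residue coordinates at a given distance threshold and scores a predicted matrix against a native one with precision, recall and F1-score. It is not the author's code. The function names `contact_map` and `precision_recall_f1`, the use of one representative atom per residue, the strict `<` comparison, and the `min_separation` parameter are all assumptions made for this example. In practice the coordinates would be read from a PDB file, for example with BioPython.

```python
import numpy as np


def contact_map(coords, threshold):
    """Return a binary contact matrix from per-residue coordinates.

    coords: array of shape (n_residues, 3), one representative atom per
            residue (assumption: the source does not say which atom is used).
    threshold: distance cutoff in angstroms, e.g. 8.0 or 12.0.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumption: two residues are in contact if strictly closer than the cutoff.
    return (dist < threshold).astype(int)


def precision_recall_f1(native, predicted, min_separation=1):
    """Score a predicted contact matrix against the native one.

    Only residue pairs (i, j) with j - i >= min_separation are counted,
    so the trivial diagonal is excluded (hypothetical parameter).
    """
    native = np.asarray(native)
    predicted = np.asarray(predicted)
    upper = np.triu_indices(native.shape[0], k=min_separation)
    y_true = native[upper].astype(bool)
    y_pred = predicted[upper].astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy chain: four residues on a line, 5 angstroms apart.
    toy = [[0, 0, 0], [5, 0, 0], [10, 0, 0], [15, 0, 0]]
    cm8 = contact_map(toy, 8.0)
    cm12 = contact_map(toy, 12.0)
    print(cm8)
    print(precision_recall_f1(cm12, cm8))
```

The toy chain only exercises the functions. A looser threshold marks more pairs as contacts, so the 8 Å and 12 Å matrices of the same structure generally differ.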
DOI: https://doi.org/10.22323/1.429.0016

Open Access
Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.