Short-length peptides contact map prediction using Convolution Neural Networks
November 14, 2022
December 06, 2022
This article presents an approach for predicting the contact matrix (contact map) of short peptides. A contact matrix is a two-dimensional representation of a protein; it can be used to reconstruct the tertiary structure or as a starting approximation in energy minimization models. Peptides with chain lengths from 15 to 30 residues were chosen to test the model and simplify the calculations. Convolutional neural networks (CNNs) were used as the prediction tool because the feature space of each peptide is represented as a two-dimensional matrix. The SCRATCH tool was used to generate the secondary structure, solvent accessibility, and profile matrix (PSSM) for each peptide. The CNN was implemented in Python using the Keras library. The BioPython module was used to work with the common PDB format, which stores protein structure information. Training, validation, and test samples were generated, and a multilayer multi-output convolutional neural network was constructed, trained, and validated. Experiments were conducted on the test sample to predict the contact matrix and compare it with the native one. To assess prediction quality, confusion matrices were formed for thresholds of 8 and 12 Å, and the F1-score, recall, and precision metrics were calculated. The F1-scores show that even a small neural network can achieve quite good results. In the final step, the FT-COMAR tool was used to reconstruct the tertiary structure of the proteins from their contact matrices. The results show that the RMSD is better for structures reconstructed from contact matrices with the 12 Å threshold.
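The evaluation step described in the abstract (thresholding at 8 and 12 Å, then computing precision, recall, and F1-score) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy coordinates, the use of one representative atom per residue, the strict `<` comparison, and the `min_separation` parameter are all assumptions made for illustration.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between per-residue coordinates.

    Assumption: one representative atom (e.g. C-alpha) per residue.
    """
    return [[math.dist(a, b) for b in coords] for a in coords]

def contact_map(dist, threshold):
    """Binary contact matrix: 1 where the distance is below the threshold (in angstroms)."""
    return [[1 if d < threshold else 0 for d in row] for row in dist]

def contact_metrics(native, predicted, min_separation=1):
    """Precision, recall and F1 over residue pairs (i, j) with j - i >= min_separation."""
    tp = fp = fn = 0
    n = len(native)
    for i in range(n):
        for j in range(i + min_separation, n):
            if predicted[i][j] and native[i][j]:
                tp += 1          # contact predicted and present
            elif predicted[i][j]:
                fp += 1          # contact predicted but absent
            elif native[i][j]:
                fn += 1          # contact present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy data (hypothetical): four residues on a line, native vs. "predicted" positions.
native_xyz = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)]
predicted_xyz = [(0, 0, 0), (5, 0, 0), (14, 0, 0), (15, 0, 0)]

native_dist = distance_matrix(native_xyz)
predicted_dist = distance_matrix(predicted_xyz)

# Evaluate at the two thresholds mentioned in the abstract.
results = {}
for threshold in (8.0, 12.0):
    results[threshold] = contact_metrics(
        contact_map(native_dist, threshold),
        contact_map(predicted_dist, threshold),
    )
    print(threshold, results[threshold])
```

Only the upper triangle of the matrix is counted, since a contact map is symmetric and the diagonal is trivially in contact. Raising `min_separation` would exclude sequence-adjacent residues, which are almost always in contact and can inflate the scores.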