Protein Similarity Comparison Based on Linear Neural Network and Multiple Parameters

Special Issues Editor (Nottingham Trent University, United Kingdom (Great Britain))

The aim of the present study was to develop a new algorithm for protein structure similarity that considers several aspects of a protein and operates on the protein's PDB data file obtained from NMR (Nuclear Magnetic Resonance) rather than on its sequence. Nine parameters (S1 to S9) were selected for predicting protein similarity. They are, respectively, the similarities of spatial structure (density), number of atoms, number of amino acids, amino acid type, proportion of C element, proportion of N element, proportion of O element, spatial position of P atoms and spatial position of S atoms in the protein. Assuming a linear relationship between the overall similarity (S) and S1 to S9, a linear neural network was used to optimize the coefficients of this linear model. Data for more than 500 protein pairs collected from the RCSB PDB were used to train the model. Finally, the performance of the model was evaluated and compared with BLAST. The coefficients of the variables obtained by the neural network give the following formula for the overall similarity of two proteins: S = 0.3198S1 + 0.0343S2 + 0.0279S3 + 0.0618S4 + 0.0653S5 + 0.1062S6 + 0.1032S7 + 0.1477S8 + 0.1480S9 - 0.0142, where S1 to S9 are the nine similarities listed above. The study thus presents a new algorithm for protein structure similarity based on multiple parameters and a linear neural network, which can be used to compute the structural similarity of any two proteins under conditions where BLASTp is not applicable.
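The linear model in the abstract can be illustrated with a minimal Python sketch. The coefficients and constant term are taken from the formula above. The function name `overall_similarity`, the input layout (a sequence of nine values ordered S1 to S9) and the expectation that each partial similarity lies between 0 and 1 are assumptions made for illustration, not details given in the abstract. How the nine partial similarities are computed from PDB files is not described here and is not reproduced.

```python
# Coefficients for S1..S9 as reported in the abstract, in that order.
WEIGHTS = (0.3198, 0.0343, 0.0279, 0.0618, 0.0653,
           0.1062, 0.1032, 0.1477, 0.1480)
# Constant term of the reported linear model.
BIAS = -0.0142


def overall_similarity(partial_similarities):
    """Combine nine partial similarities (S1..S9) into the overall score S.

    Assumption: each value is a similarity for one protein pair, ordered
    S1 (spatial structure/density) through S9 (position of S atoms).
    """
    if len(partial_similarities) != len(WEIGHTS):
        raise ValueError("expected exactly nine values, S1 to S9")
    weighted = sum(w * s for w, s in zip(WEIGHTS, partial_similarities))
    return weighted + BIAS


if __name__ == "__main__":
    # Hypothetical partial similarities for one protein pair.
    example = [0.9, 0.8, 0.8, 0.7, 0.95, 0.9, 0.9, 0.5, 0.6]
    print(overall_similarity(example))
```

The nine coefficients sum to 1.0142, so the constant term of -0.0142 makes a pair with all partial similarities equal to 1 score an overall similarity of 1.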

Journal: International Journal of Simulation: Systems, Science & Technology, IJSSST V17

Published: Jul 14, 2016

DOI: 10.5013/IJSSST.a.17.26.04