Abstract
Predicting the structural classes of low-homology proteins is a challenging task in bioinformatics. We propose a dual-layer fuzzy support vector machine (FSVM) network for this purpose. Each protein sample is represented by nine feature vectors: one pair-coupled amino acid composition vector (210-D) and eight pseudo amino acid composition (PseAAC) vectors. Eight physicochemical properties of amino acids, extracted from the AAindex databank, are used to calculate the low-frequency components of the power spectrum density of the sequence-order correlation in the protein sequence. In the first layer of the FSVM network, nine FSVM classifiers are established, each trained on a different protein feature vector. The outputs of the first layer are then reclassified by an FSVM classifier in the second layer of the network. The performance of the proposed method is validated on a low-homology dataset (average 25%) covering 1673 proteins. The promising results indicate that the new method may become a useful tool for predicting not only the structural classes of proteins but also their other attributes.
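The feature pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes that the 210 dimensions correspond to unordered pairs of the 20 standard amino acids (20 × 21 / 2 = 210) counted over adjacent residues, that the power spectrum is taken from a plain discrete Fourier transform of a property-encoded sequence, and that the second layer receives the concatenated first-layer outputs. All function names and the `scale` argument are hypothetical.

```python
import cmath
import math
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical
# Assumption: 210-D means unordered residue pairs, 20 * 21 / 2 = 210.
PAIRS = ["".join(p) for p in combinations_with_replacement(AMINO_ACIDS, 2)]
PAIR_INDEX = {p: i for i, p in enumerate(PAIRS)}


def pair_composition(seq):
    """Relative frequency of each unordered adjacent residue pair (210-D)."""
    counts = [0.0] * len(PAIRS)
    total = 0
    for a, b in zip(seq, seq[1:]):
        key = "".join(sorted((a, b)))
        if key in PAIR_INDEX:  # skip pairs with non-standard residues
            counts[PAIR_INDEX[key]] += 1
            total += 1
    return [c / total for c in counts] if total else counts


def encode(seq, scale):
    """Map residues to numbers using a caller-supplied property scale."""
    return [scale[a] for a in seq if a in scale]


def low_frequency_psd(signal, n_freqs):
    """Power of the first n_freqs non-zero DFT frequencies of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the zero-frequency offset
    psd = []
    for k in range(1, n_freqs + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        psd.append(abs(coeff) ** 2 / n)
    return psd


def second_layer_input(first_layer_outputs):
    """Concatenate the per-classifier output vectors into one vector."""
    return [m for out in first_layer_outputs for m in out]
```

In use, `encode` would be called once per physicochemical property with a scale taken from AAindex, and each resulting signal passed to `low_frequency_psd`. The classifiers themselves are left out, since an FSVM needs a dedicated implementation.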
Keywords
- Power Spectrum Density
- Jackknife Test
- Fuzzy Support Vector Machine
- Biophysical Research Communication
- Pseudo Amino Acid Composition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, T., Wei, R., Ding, Y. (2007). Using Fuzzy Support Vector Machine Network to Predict Low Homology Protein Structural Classes. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science, vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_10
DOI: https://doi.org/10.1007/978-3-540-75286-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science, Computer Science (R0)