Identifying protein post-translational modifications (PTMs) is an important step in studies of disease-associated mutations. Although a protein undergoes many chemical alterations after translation, the addition of a succinyl group to a lysine residue plays a vital role in regulating cellular metabolism and, in turn, disease. Applying a classification algorithm to features derived from structural, physicochemical, or biochemical information about the protein has become a common approach, and it yields satisfactory results up to a point. Researchers have already developed many computational methods to predict whether a lysine residue is modified with a succinyl group after translation, but most of them focus on improving a single decision made by a single method, on enriching the feature set, or on developing a benchmark dataset. There is therefore still scope to improve the characterisation of lysine residues in a protein sequence by considering multiple predictors at once. In this study, an ensemble-based approach called DV-iSucLys was designed to characterise lysine residues by adopting three well-known and conceptually different classifiers and combining their decisions. In addition, a benchmark succinylation dataset was compiled from existing benchmark datasets and recently updated succinylation data from the UniProt Consortium, both to evaluate the proposed approach and to support further research. Rigorous cross-validation results show that DV-iSucLys characterises succinylated lysine residues better than existing predictors.
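The idea of combining the decisions of three classifiers can be illustrated with a plain majority vote. The sketch below is only an assumption about what such a voting step might look like; the abstract does not specify the voting rule, and the function name `majority_vote` and the example decisions are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(decisions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label lists (one list per classifier) by majority vote.

    Each inner sequence holds one classifier's labels for the same candidate
    lysine sites: 1 = predicted succinylated, 0 = predicted not succinylated.
    """
    if not decisions:
        raise ValueError("need the decisions of at least one classifier")
    n_sites = len(decisions[0])
    if any(len(d) != n_sites for d in decisions):
        raise ValueError("all classifiers must label the same sites")

    combined = []
    for site_votes in zip(*decisions):
        # Pick the label with the most votes for this site.
        # With three binary voters a tie cannot occur.
        label, _ = Counter(site_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical decisions from three classifiers for five candidate lysines.
clf_a = [1, 0, 1, 0, 1]
clf_b = [1, 1, 0, 0, 1]
clf_c = [0, 0, 1, 0, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble)  # [1, 0, 1, 0, 1]
```

Using an odd number of binary voters keeps the rule unambiguous; a weighted or probability-based vote would need the classifiers' confidence scores, which this sketch does not model.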
Published in | American Journal of Biomedical and Life Sciences (Volume 5, Issue 6) |
DOI | 10.11648/j.ajbls.20170506.15 |
Page(s) | 135-143 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2017. Published by Science Publishing Group |
Keywords | Lysine Succinylation, AAC, CKSAAP, Binary Encoding, PSAAP, AAindex, Ensemble Classifier |
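Three of the sequence encodings named in the keywords (AAC, binary encoding, and CKSAAP) have widely used textbook definitions, sketched below for a short peptide window. This is a minimal illustration under stated assumptions, not the paper's feature pipeline: the function names, the handling of non-standard characters, and the normalisation of CKSAAP by the number of valid pairs are choices made here for the example. PSAAP and AAindex features are not covered.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(peptide: str) -> List[float]:
    """Amino acid composition: fraction of each standard residue in the peptide."""
    residues = [r for r in peptide if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(a) / total for a in AMINO_ACIDS]


def binary_encoding(peptide: str) -> List[int]:
    """One-hot encoding: 20 bits per position, all zeros for non-standard characters."""
    vec: List[int] = []
    for r in peptide:
        vec.extend(1 if r == a else 0 for a in AMINO_ACIDS)
    return vec


def cksaap(peptide: str, k: int) -> List[float]:
    """Composition of k-spaced amino acid pairs (residues with k positions between them)."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    n_pairs = 0
    for i in range(len(peptide) - k - 1):
        pair = peptide[i] + peptide[i + k + 1]
        if pair in counts:  # skip pairs that contain a non-standard character
            counts[pair] += 1
            n_pairs += 1
    n_pairs = n_pairs or 1
    return [counts[p] / n_pairs for p in pairs]


# Hypothetical three-residue window centred on a lysine (K).
window = "AKA"
print(len(aac(window)), len(binary_encoding(window)), len(cksaap(window, 1)))
```

Real predictors typically use longer windows around the candidate lysine and concatenate CKSAAP vectors for several values of `k`; the window length and the range of `k` are tuning choices not stated in the abstract.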
APA Style
Md. Khaled Ben Islam, Md. Nazrul Islam Mondal, Julia Rahman, Md. Al Mehedi Hassan. (2017). DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. American Journal of Biomedical and Life Sciences, 5(6), 135-143. https://doi.org/10.11648/j.ajbls.20170506.15
ACS Style
Md. Khaled Ben Islam; Md. Nazrul Islam Mondal; Julia Rahman; Md. Al Mehedi Hassan. DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. Am. J. Biomed. Life Sci. 2017, 5(6), 135-143. doi: 10.11648/j.ajbls.20170506.15
AMA Style
Md. Khaled Ben Islam, Md. Nazrul Islam Mondal, Julia Rahman, Md. Al Mehedi Hassan. DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. Am J Biomed Life Sci. 2017;5(6):135-143. doi: 10.11648/j.ajbls.20170506.15
@article{10.11648/j.ajbls.20170506.15,
  author = {Md. Khaled Ben Islam and Md. Nazrul Islam Mondal and Julia Rahman and Md. Al Mehedi Hassan},
  title = {DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data},
  journal = {American Journal of Biomedical and Life Sciences},
  volume = {5},
  number = {6},
  pages = {135-143},
  doi = {10.11648/j.ajbls.20170506.15},
  url = {https://doi.org/10.11648/j.ajbls.20170506.15},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajbls.20170506.15},
  year = {2017}
}
TY - JOUR
T1 - DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data
AU - Md. Khaled Ben Islam
AU - Md. Nazrul Islam Mondal
AU - Julia Rahman
AU - Md. Al Mehedi Hassan
Y1 - 2017/11/30
PY - 2017
N1 - https://doi.org/10.11648/j.ajbls.20170506.15
DO - 10.11648/j.ajbls.20170506.15
T2 - American Journal of Biomedical and Life Sciences
JF - American Journal of Biomedical and Life Sciences
JO - American Journal of Biomedical and Life Sciences
SP - 135
EP - 143
PB - Science Publishing Group
SN - 2330-880X
UR - https://doi.org/10.11648/j.ajbls.20170506.15
VL - 5
IS - 6
ER -