Comparative analysis of machine learning based methods for the prediction of NLR protein
Keywords:
NLR, machine learning, SVM, SMO, random forest, cross-validation
Abstract
Nucleotide-binding domain leucine-rich repeat-containing (NLR) proteins play a fundamental role in intestinal tissue repair and innate immunity. The NLR protein family is a recent addition to the effector molecules of innate immunity. It also plays an important role in the intestinal microbiota and has recently emerged as crucial in the development of colitis-associated cancer (CAC) and ulcerative colitis (UC). We have developed a machine learning based method for the prediction of NLR proteins. This paper presents a comparative analysis of three supervised machine learning algorithms, namely Sequential Minimal Optimization (SMO), Library for Support Vector Machines (LIBSVM), and Random Forest (RF), for the prediction of NLR proteins. The dataset used for this work was created by extracting features with the ProtR package. The models were trained on compositional input features such as dipeptide composition and amino acid composition. The training dataset consists of 390 proteins: a positive set of 103 sequences from the NLR family, and a negative set of the remaining 287 sequences, comprising random protein sequences and several transporter family protein sequences retrieved from NCBI and UniProt.
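To make the compositional features concrete, the following is a minimal Python sketch of amino acid composition (20 values) and dipeptide composition (400 values) for a single sequence. It is an illustration only, not the authors' pipeline, which used the ProtR package. The function names, the use of fractions rather than percentages, and the choice to skip non-standard residues are all assumptions made for this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue among the standard residues in seq."""
    seq = seq.upper()
    counts = {aa: seq.count(aa) for aa in AMINO_ACIDS}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: n / total for aa, n in counts.items()}


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among all adjacent pairs in seq."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        # Assumption: pairs containing a non-standard residue are skipped.
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return {pair: n / total for pair, n in counts.items()}


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    aac = amino_acid_composition(seq)
    dpc = dipeptide_composition(seq)
    return [aac[aa] for aa in AMINO_ACIDS] + [dpc[p] for p in sorted(dpc)]
```

Each protein then becomes one fixed-length row regardless of its sequence length, which is what allows classifiers such as SMO, LIBSVM, and RF to be trained on the 390-sequence dataset.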
Copyright (c) 2022 International Journal of Health Sciences
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.