Comparative analysis of machine learning based methods for the prediction of NLR protein


  • Nadia Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat- 173234, Solan (HP), India
  • Ekta Gandotra Department of Computer Science Engineering & Information Technology, Jaypee University of Information Technology, Waknaghat- 173234, Solan (HP), India
  • Narendra Kumar Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat- 173234, Solan (HP), India


NLR, machine learning, SVM, SMO, random forest, cross-validation


Nucleotide-binding domain leucine-rich repeat-containing (NLR) proteins play a fundamental role in intestinal tissue repair and innate immunity. The NLR protein family is a recent addition to the known effector molecules of innate immunity. It also plays an important role in the intestinal microbiota and has recently emerged as a crucial factor in the development of colitis-associated cancer (CAC) and ulcerative colitis (UC). We have developed a machine learning based method for the prediction of NLR proteins. This paper presents a comparative analysis of three supervised machine learning algorithms, namely Sequential Minimal Optimization (SMO), the Library for Support Vector Machines (LIBSVM), and Random Forest (RF), for the prediction of NLR proteins. The dataset used for this work was created by extracting features with the ProtR package. The models are trained on compositional input features, including amino acid composition and dipeptide composition, among others. The training dataset consists of 390 proteins: a positive set of 103 sequences from the NLR family and a negative set of the remaining 287 sequences, comprising random protein sequences and several transporter family protein sequences retrieved from NCBI and UniProt.
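To make the two named compositional features concrete, the following is a minimal Python sketch of the standard definitions of amino acid composition (20 residue fractions) and dipeptide composition (400 adjacent-pair fractions). It is an illustration only, not the authors' pipeline: the abstract states that features were extracted with the ProtR package, and the function names, the handling of non-standard residues, and the example sequence here are all assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    # Assumption: non-standard symbols (e.g. X, B, Z) are simply ignored.
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each ordered adjacent residue pair (400 values)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: keep only pairs in which both residues are standard.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(valid)
    return {
        a + b: counts[a + b] / len(valid)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    aac = amino_acid_composition(seq)
    dpc = dipeptide_composition(seq)
    return list(aac.values()) + list(dpc.values())


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVK"  # hypothetical toy sequence, not from the dataset
    vec = feature_vector(example)
    print(len(vec))
```

Applied to every sequence in a labelled set, such vectors form a feature matrix with one row per protein, which is the kind of input a classifier such as SMO, LIBSVM, or Random Forest is trained on.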







How to Cite

Nadia, N., Gandotra, E., & Kumar, N. (2022). Comparative analysis of machine learning based methods for the prediction of NLR protein. International Journal of Health Sciences, 6(S8), 5303–5318. Retrieved from


