Analysis of microarray data by genetic algorithm
Keywords:
genetic algorithm, grouping genes, classification of microarray samples, candidate genes for disease, microarray data, entropy, mutual information, optimization
Abstract
Microarray gene expression data are used to study the activity of thousands of genes at once. Only a few of these genes have a significant impact on any given cancer process, and finding such defective genes by experiment alone is impractical, so computational techniques are required. This work develops a method for identifying cancer candidate genes from microarray data. Similar genes are clustered to find genes that are co-expressed across different biological conditions, and the search for the few candidate genes is treated as an optimization problem. A genetic algorithm, which applies the evolutionary principles of selection, recombination, and mutation, is used to solve it. Mutual information is used to measure the dependency between genes. Two genes are considered similar if their expression levels are comparable, and both this similarity and the positive and negative correlations between genes are taken into account during clustering. An interdependence measure quantifies how strongly genes are correlated. Genes responsible for a disease state have higher interdependence measures; these are the defective genes that carry cancer diagnostic information. Publicly available microarray gene expression datasets for gastric cancer and colon cancer are considered.
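The abstract does not give formulas, so the following is only a minimal sketch of how the described pieces could fit together, not the authors' implementation. It assumes that expression levels have already been discretized, that the interdependence measure is mutual information normalized by joint entropy, and that the genetic algorithm maximizes above-average interdependence within clusters. The names `interdependence`, `fitness`, and `cluster_genes` and all parameter values are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discretized expression profiles."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interdependence(x, y):
    """Assumed measure: mutual information divided by joint entropy, in [0, 1]."""
    joint = entropy(list(zip(x, y)))
    return mutual_information(x, y) / joint if joint > 0 else 0.0


def fitness(labels, r, mean_r):
    """Assumed objective: reward above-average interdependence inside clusters."""
    return sum(r[i][j] - mean_r
               for i, j in combinations(range(len(labels)), 2)
               if labels[i] == labels[j])


def cluster_genes(genes, k=2, pop_size=30, generations=60, p_mut=0.1, seed=0):
    """Toy genetic algorithm: one cluster label per gene, evolved by
    tournament selection, one-point recombination, and point mutation."""
    rng = random.Random(seed)
    n = len(genes)
    r = [[interdependence(a, b) for b in genes] for a in genes]
    pairs = list(combinations(range(n), 2))
    mean_r = sum(r[i][j] for i, j in pairs) / len(pairs)

    def score(labels):
        return fitness(labels, r, mean_r)

    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = [max(population, key=score)]  # elitism: keep the current best
        while len(next_gen) < pop_size:
            # Selection: best of three randomly drawn individuals, done twice.
            mother = max(rng.sample(population, 3), key=score)
            father = max(rng.sample(population, 3), key=score)
            # Recombination: one-point crossover of the two label vectors.
            cut = rng.randrange(1, n)
            child = mother[:cut] + father[cut:]
            # Mutation: relabel each gene with a small probability.
            child = [rng.randrange(k) if rng.random() < p_mut else g for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return best, score(best)


# Made-up toy data: 6 genes, 8 samples, levels discretized to 0 (low) / 1 (high).
toy_genes = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],  # inverse of the first pattern
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],  # inverse of the second pattern
]
labels, best_score = cluster_genes(toy_genes)
print(labels, best_score)
```

Because the measure is built on mutual information, a gene and its exact inverse receive the same interdependence as two identical genes, which is one way to treat positive and negative correlations alike during clustering. Real microarray data would first need a discretization step, which this sketch leaves out.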
Copyright (c) 2022 International Journal of Health Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.