Scope of generative adversarial networks (GANs) in image processing
A review
Keywords:
image-to-image translation, style transfer, neural networks, image generation
Abstract
Generative adversarial networks (GANs) are a topic of keen interest in current image processing and computer vision research. The basic GAN model was introduced by Ian Goodfellow et al. in 2014; since then, advances in GAN research have largely been application specific. In computer vision and image-to-image translation, GANs play a very effective role, whether for face detection and recognition or for image resolution enhancement and image augmentation. This paper presents a concise overview of various GAN models along with their features and applications. Pix2Pix and conditional GAN models are trained on paired datasets, while models such as CycleGAN, DiscoGAN, DualGAN, InfoGAN and deep convolutional GAN (DCGAN) work with unpaired data. The image datasets commonly used to train generator and discriminator networks are also discussed. Because partial mode collapse is a common problem during the training of all these models, various normalization techniques are typically applied while training the generator and discriminator networks. Given the rapid pace of advances in GAN models, they are likely to be adopted in commercial applications soon.
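To make the adversarial training scheme summarized above concrete, the following minimal sketch in PyTorch shows a generator and a discriminator trained against each other. It is illustrative only, not code from any of the reviewed papers: the toy two-dimensional "real" data, the network sizes and the hyperparameters are assumptions chosen for brevity. The generator uses batch normalization, one of the stabilization techniques mentioned in the abstract, and the two networks are updated in alternation with the standard binary cross-entropy objective of Goodfellow et al. (2014).

import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 2, 128

# Generator G: maps latent noise z to a fake sample.
# Batch normalization here is one common training-stabilization technique.
G = nn.Sequential(
    nn.Linear(latent_dim, 64), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Linear(64, data_dim),
)

# Discriminator D: maps a sample to a real-vs-fake logit.
D = nn.Sequential(
    nn.Linear(data_dim, 64), nn.LeakyReLU(0.2),
    nn.Linear(64, 1),
)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(batch, data_dim) * 0.5 + 2.0  # toy stand-in for real images
    fake = G(torch.randn(batch, latent_dim))

    # Discriminator update: classify real samples as 1, generated samples as 0.
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    loss_d.backward()
    opt_d.step()

    # Generator update: fool the discriminator into scoring fakes as real.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()

Each discriminator step pushes D(real) toward 1 and D(fake) toward 0; each generator step updates G alone so that D(G(z)) moves toward 1. The fake.detach() call in the discriminator step keeps that update from flowing back into the generator.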
References
I. J. Goodfellow, Y. Bengio, and A. Courville, “Deep Learning”, MIT Press, 2016. http://www.deeplearningbook.org.
I. J. Goodfellow, “On distinguishability criteria for estimating generative models”, In International Conference on Learning Representations, ICLR’2015.
I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples”, ICLR 2015, arXiv:1412.6572v3 [stat.ML], 20 March 2015.
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets”, in Advances in Neural Information Processing Systems (NIPS 2014), pp. 2672-2680.
Y. Ganin and V. Lempitsky, “Unsupervised Domain Adaptation by Backpropagation”, arXiv:1409.7495v2 [stat.ML], 27 Feb 2015.
A. Aggarwal, M. Mittal, and G. Battineni, “Generative adversarial network: An overview of theory and applications”, International Journal of Information Management Data Insights, Elsevier, 2021.
“Generative Adversarial Networks”, in Deep Learning with Azure, August 2018, pp. 187-208.
V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning”, arXiv:1603.07285v2 [stat.ML], 11 June 2018.
A. Odena, V. Dumoulin and C. Olah, “Deconvolution and Checkerboard Artifacts”, Distill, 2016.
Y. Bengio, E. Thibodeau-Laufer, G. Alain, and J. Yosinski, “Deep generative stochastic networks trainable by backprop”, in ICML 2014.
D. Kingma and J. Ba, “Adam: A method for stochastic optimization”, arXiv preprint arXiv:1412.6980 (2014).
T. Salimans, I. J. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved Techniques for Training GANs”, arXiv:1606.03498v1 [cs.LG], 10 June 2016.
H. Alqahtani, M. Kavakli-Thorne, and G. Kumar, “Applications of Generative Adversarial Networks (GANs): An Updated Review”, Springer, 2019.
V. Ringu and A. Eenaja, “A Proposal to use GAN for Speech Recognition in Natural Language Processing”, International Journal of Creative Research Thoughts (IJCRT), volume 8, June 2020.
R. Anand, B. Singh, and N. Sindhwani, “Speech Perception and Analysis of Fluent Digits’ String using Level-By-Level Time Alignment”, International Journal of Information Technology and Knowledge Management, volume 2, issue 1, 2009, pp. 65-68.
J. Y. Zhu, P. Krähenbühl, E. Shechtman, and A. A. Efros, “Generative Visual Manipulation on the Natural Image Manifold”, in European Conference on Computer Vision (ECCV), 2016.
C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham et al., “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, arXiv:1609.04802 [cs.CV], 2016.
A. Brock, T. Lim, J. M. Ritchie, and N. Weston, “Neural Photo Editing with Introspective Adversarial Networks”, arXiv:1609.07093 [cs.LG].
J. Gui, Z. Sun, Y. Wen, D. Tao, and J. Ye, “A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications”, arXiv:2001.06937 [cs.LG], 2020.
G. Bakshi, R. Shukla, N. Sindhwani et al., “An Optimized Approach for Feature Extraction in Multi-Relational Statistical Learning”, Journal of Scientific and Industrial Research (JSIR), volume 80, issue 6, 2021, pp. 537-542.
Y. Skandarani, P.-M. Jodoin, and A. Lalande, “GANs for Medical Image Synthesis: An Empirical Study”, arXiv:2105.05318v2 [eess.IV], 19 Jul 2021.
K. Aggarwal, M. Singh Bhamrah, and H. S. Ryait, “Detection of cirrhosis through ultrasound imaging by intensity difference technique”, EURASIP Journal on Image and Video Processing, 2019.
K. Aggarwal, M. Singh Bhamrah, and H. S. Ryait, “The identification of liver cirrhosis with modified LBP gray scaling and Otsu binarization”, SpringerPlus, 2016.
A. Islam, C. Long, A. Basharat, and A. Hoogs, “DOA-GAN: Dual-Order Attention Generative Adversarial Network for Image Copy-Move Forgery Detection and Localization”, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4676-4685.
R. Mehta, K. Aggarwal, D. Koundal et al., “Markov features based DTCWS algorithms for online image forgery detection using ensemble classifier in the pandemic”, Expert Systems with Applications, December 2021.
I. J. Goodfellow, “NIPS 2016 Tutorial: Generative Adversarial Networks”, arXiv:1701.00160v4 [cs.LG].
D. P. Kingma and M. Welling, “An Introduction to Variational Autoencoders”, arXiv:1906.02691v3 [cs.LG], 2019.
D. P. Kingma, “Fast gradient-based inference with continuous latent variable models in auxiliary form”, technical report, arXiv:1306.0733, 2013.
D. J. Rezende, S. Mohamed, and D. Wierstra, “Stochastic backpropagation and approximate inference in deep generative models”, in ICML 2014, preprint arXiv:1401.4082.
B. J. Frey, “Graphical models for machine learning and digital communication”, MIT Press, 1998.
B. J. Frey, G. E. Hinton, and P. Dayan, “Does the wake-sleep algorithm learn good density estimators?”, in Proceedings of the 8th International Conference on Neural Information Processing Systems (NIPS'95), pp. 661-670.
A. V. D. Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, “WaveNet: A generative model for raw audio”, arXiv preprint arXiv:1609.03499, 2016.
A. V. D. Oord, N. Kalchbrenner, O. Vinyals, L. Espeholt, A. Graves, and K. Kavukcuoglu, “Conditional Image Generation with PixelCNN Decoders”, arXiv:1606.05328 [cs.CV], 2016.
L. Kunhua, Z. Peisi, Y. Zheng, Y. Kaige, M. Liu, “P_VggNet: A convolutional neural network (CNN) with pixel-based attention map”, PLOS ONE, https://doi.org/10.1371/journal.pone.0208497, December 12, 2018.
X. Huang, M. Liu, S. Belongie, and J. Kautz, “Multimodal unsupervised image-to-image translation”, in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 172-189.
M. Egmont-Petersen, D. de Ridder, and H. Handels, “Image processing with neural networks - a review”, Pattern Recognition, Elsevier Science Ltd, 2002, pp. 2279-2301.
L. A. Gatys, A. S. Ecker, and M. Bethge, “Image Style Transfer Using Convolutional Neural Networks”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2414-2423.
L. Karacan, Z. Akata, A. Erdem and E. Erdem, “Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts”, arXiv:1612.00215v1 [cs.CV], 1 Dec 2016.
P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-Image Translation with Conditional Adversarial Networks”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125-1134.
T. Kim, M. Cha et al., “Learning to Discover Cross-Domain Relations with Generative Adversarial Networks”, arXiv:1703.05192v2 [cs.CV], 2017.
Z. Yi, H. Zhang, P. Tan, and M. Gong, “DualGAN: Unsupervised Dual Learning for Image-to-Image Translation”, arXiv:1704.02510 [cs.CV], 2018.
J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks”, in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2223-2232.
H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang, and D. Metaxas, “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks”, arXiv:1612.03242 [cs.CV], 2017.
X. Chen, Y. Duan, R. Houthooft et al., “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets”, arXiv:1606.03657 [cs.LG], 2016.
A. Radford, L. Metz, and S. Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, arXiv:1511.06434v2 [cs.LG], 7 Jan 2016.
E. Denton, S. Chintala, A. Szlam, and R. Fergus, “Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks”, arXiv:1506.05751 [cs.CV], 2015.
A. Nguyen, J. Clune, Y. Bengio et al., “Plug and Play Generative Networks: Conditional Iterative Generation of Images in Latent Space”, arXiv:1612.00005 [cs.CV], 2017.
M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN”, arXiv:1701.07875 [stat.ML], 2017.
Y. Mroueh, T. Sercu, and V. Goel, “McGan: Mean and Covariance Feature Matching GAN”, arXiv:1702.08398v2 [cs.LG], 2017.
Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, and J. Choo, “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8789-8797.
X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, “Least Squares Generative Adversarial Networks”, arXiv:1611.04076v3 [cs.CV], 5 April 2017.
J. Li, X. Liang, Y. Wei, T. Xu, J. Feng, and S. Yan, “Perceptual Generative Adversarial Networks for Small Object Detection”, arXiv:1706.05274v2 [cs.CV], 20 June 2017.
H. Liu, P. N. Michelini, and D. Zhu, “Artsy-GAN: A style transfer system with improved quality, diversity and performance”, in 24th International Conference on Pattern Recognition (ICPR), China, August 20-24, 2018.
Y. Chen, Y.-K. Lai, and Y.-J. Liu, “CartoonGAN: Generative Adversarial Networks for Photo Cartoonization”, in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 9465-9474.
L. Yuan, D. Chen, and H. Hu, “Unsupervised Object-Level Image-to-Image Translation Using Positional Attention Bi-Flow Generative Network”, IEEE Access, volume 7, 2019.
H. Tang, D. Xu, N. Sebe, and Y. Yan, “Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation”, arXiv:1903.12296v3 [cs.CV], 27 Aug 2019.
F. Xiong, Q. Wang, and Q. Gao, “Consistent Embedded GAN for Image-to-Image Translation”, IEEE Access, volume 7, 2019.
T. Karras, S. Laine, and T. Aila, “A Style-Based Generator Architecture for Generative Adversarial Networks”, arXiv:1812.04948v3 [cs.NE], 29 March 2019.
J. Lin, Y. Pang, Y. Xia, Z. Chen, and J. Luo, “TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images”, arXiv:2004.04634v1 [cs.CV], 9 Apr 2020.
K. Rinartha, W. Suryasa, and L. G. S. Kartika, “Comparative Analysis of String Similarity on Dynamic Query Suggestions”, in 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar (EECCIS), IEEE, 2018, pp. 399-404.
I. W. Suryasa, M. Rodríguez-Gámez, and T. Koldoris, “Get vaccinated when it is your turn and follow the local guidelines”, International Journal of Health Sciences, 5(3), 2021, pp. x-xv. https://doi.org/10.53730/ijhs.v5n3.2938