Hybrid multi-document text summarization via categorization based on BERT deep learning models
Keywords:
Text Summarization, Category_id Score based categorization, BERT, Deep Learning
Abstract
Text summarization is the process of using a system to condense a document, or a collection of documents, into brief paragraphs or sentences. This paper applies text categorization with BERT, a state-of-the-art deep learning language model reported to perform significantly better than earlier language models, to improve the summarization task. Multi-document summarization (MDS) is bottlenecked by a lack of training data and by the varied categories of the documents involved. To address this, the proposed hybrid framework B-HEATS (BERT-based Hybrid Extractive Abstractive Text Summarization) combines an extractive summary obtained via categorization with an abstractive summary produced by fine-tuning BERT with an RNN-LSTM-CNN deep learning architecture. The output layer of BERT is replaced with the RNN-LSTM-CNN architecture for fine-tuning, which improves the summarization model, yields a high-quality summary for multiple documents, and overcomes the out-of-vocabulary (OOV) problem. Compared with existing models on ROUGE metrics, the proposed automatic text summarization achieves scores of R1 43.61, R2 22.64, R3 44.95, and RL 44.27 on the benchmark DUC datasets.
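The ROUGE-N scores quoted in the abstract measure n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of that calculation, not the evaluation code used in the paper: the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the lowercasing are illustrative assumptions, and published ROUGE scores are normally produced with a standard toolkit that also applies stemming and other preprocessing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N recall, precision and F1 for two plain-text summaries.

    Assumption: tokens are lowercased and split on whitespace, which is
    simpler than the preprocessing used by standard ROUGE toolkits.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(generated, gold, n=1))
    print(rouge_n(generated, gold, n=2))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, so recall falls as n grows, which is the usual pattern for ROUGE-N.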
Copyright (c) 2022 International Journal of Health Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.