A technical review on knowledge intensive NLP for pre-trained language development
Keywords:
pre-trained NLP, NLP, knowledge-based NLP
Abstract
In today’s world, where data plays a very important role, there are various sources of data for pre-training, such as online books, equation analysis, encyclopedias, common-sense reasoning, and common-sense knowledge. The increasing capacity of pre-trained language models has given knowledge-intensive natural language processing (KI-NLP) a new boost, enabling advanced functionality for building stable, flexible, robust, and efficient models. Pre-trained models nevertheless have their own drawbacks in handling KI-NLP tasks, and we discuss the challenges faced in this field. A wide variety of pre-trained language models enhanced with external knowledge sources have been proposed and are in rapid development to meet this difficulty. We also discuss the challenges in NLP in terms of generating knowledge-intensive models, and we define a mathematical model and its framework dependability for pre-training different languages in NLP. Finally, drawing on a variety of literature reviews, we describe the present progress of pre-trained language model-based knowledge-enhanced models (PLMKEs) by deconstructing their three key elements: information sources, knowledge-intensive NLP tasks, and knowledge fusion methods.
Copyright (c) 2022 International journal of health sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.