Machine learning approaches for sentiment analysis: A survey

P Monika; Chaitanya Kulkarni; N Harish Kumar; S Shruthi; V Vani

doi:10.53730/ijhs.v6nS4.6119

Authors

Monika P (monika.mel@bmsce.ac.in), BMS College of Engineering, Bangalore, India
Chaitanya Kulkarni, Dayananda Sagar College of Engineering, Bangalore, India
Harish Kumar N, Dayananda Sagar College of Engineering, Bangalore, India
Shruthi S, Dayananda Sagar College of Engineering, Bangalore, India
Vani V, Nitte Meenakshi Institute of Technology, Bangalore, India

Keywords:

sentiment analysis, natural language processing, machine learning, artificial intelligence

Abstract

Sentiment Analysis, or Opinion Mining, is a popular Natural Language Processing (NLP) task performed on user-generated textual data to determine the orientation or sentiment of the text. Performing Sentiment Analysis requires an accurate and precise model, and machine learning techniques are heavily used to build one. Deep learning and transfer learning techniques are increasingly used and have produced better results, making them among the most popular research areas worldwide. Hotel and restaurant industries analyze reviews to obtain a deeper understanding of their clients’ needs, likes, and dislikes, whereas specialists use Twitter data and stock market news items to forecast stock market trends. Because Machine Learning algorithms are the most essential part of a Sentiment Analysis model, this survey paper analyzes all the widely used Machine Learning approaches for Sentiment Analysis. A brief introduction to the methodology for Sentiment Analysis is given, along with a conclusion and the future scope of the field.
