SENTIMENT ANALYSIS WITH WORD EMBEDDING: THE CASE OF DOUBLE-TRACK EDUCATION SYSTEM IN GHANA

OSCAR BLESSED DEHO, WILLIAM AKOTAM AGANGIBA

Abstract


This paper applies sentiment analysis to determine the polarity (positive, neutral or negative) of the views expressed by Ghanaians on the newly introduced double-track system in Second Cycle Schools in Ghana. These views are sourced from tweets (Twitter posts). Accurate sentiment analysis depends largely on the context in which words are used. Most sentiment analysis approaches, however, ignore context when predicting sentiment, and so lose contextual information. In this paper, that loss is avoided by using word embedding, a context-preserving technique that encodes the contextual information of the data as vectors before sentiment analysis is carried out. An overall model accuracy of 76% was achieved using this technique. This accuracy exceeds that reported in similar work such as Garg (2016), which achieved 72%. The results may help the Government of Ghana understand how citizens reacted to the reform of the educational system, and may help those at the helm of affairs decide how to roll out policies in the near future.
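
The idea of turning text into vectors before classifying it can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: count-based co-occurrence vectors stand in for Word2Vec, a nearest-centroid rule with cosine similarity stands in for the classifier, and the tweets, labels and all function names are hypothetical and invented for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical labelled tweets, invented for illustration (not data from the paper).
TWEETS = [
    ("the double track system is a great idea", "positive"),
    ("free education with double track helps poor students", "positive"),
    ("double track is a bad policy", "negative"),
    ("double track will ruin the quality of education", "negative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it.

    This is a simple count-based stand-in for a learned embedding such as Word2Vec.
    """
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[w][index[s[j]]] += 1.0
    return vectors


def embed(tokens, vectors):
    """Average the vectors of known words; unknown words are skipped."""
    dim = len(next(iter(vectors.values())))
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def train_centroids(tweets, vectors):
    """Compute one mean tweet vector per sentiment label."""
    grouped = defaultdict(list)
    for text, label in tweets:
        grouped[label].append(embed(tokenize(text), vectors))
    return {
        label: [sum(col) / len(vs) for col in zip(*vs)]
        for label, vs in grouped.items()
    }


def predict(text, vectors, centroids):
    """Return the label whose centroid is most similar to the tweet vector."""
    v = embed(tokenize(text), vectors)
    return max(centroids, key=lambda label: cosine(v, centroids[label]))


vectors = cooccurrence_vectors([tokenize(t) for t, _ in TWEETS])
centroids = train_centroids(TWEETS, vectors)
print(predict("double track is a great policy", vectors, centroids))
```

With only four toy tweets the prediction itself carries no weight. The sketch is meant to show the order of operations: build word vectors from context, average them into a tweet vector, then classify that vector.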


Keywords


Word2Vec, Word Embedding, Classifier, Sentiment Analysis


References


Anim-Appau, F. (2018), “Double-track system: Disadvantages outweigh advantages”, https://www.myjoyonline.com/news/2018/July-26th/double-track-system-disadvantages-outweigh-advantages-educationist.php. Accessed: October 1, 2018.

Anon. (2004), “White Paper on the Report of the Education Reform Review Committee”, Ministry of Education Youth and Sports, Accra, Ghana, pp. 1-2.

Anon. (2016), “Digital in 2016”, https://wearesocial.com/special-reports/digital-in-2016. Accessed: October 16, 2018.

Anon. (2018a), “Internet Users Statistics for Africa - Africa Internet Usage, 2018 Population Stats and Facebook Subscribers”, https://www.internetworldstats.com/stats1.htm. Accessed: October 16, 2018.

Anon. (2018b), “Digital in 2018”, https://digitalreport.wearesocial.com/. Accessed: October 16, 2018.

Chopra, A., Prasha, A. and Sain, C. (2013), “Natural Language Processing”, International Journal of Technological Enhancement and Emerging Engineering & Research, Vol. 1, No. 4, pp. 131-134.

Garg, P. (2016), “Sentiment Analysis of Twitter Data using NLTK in Python”, Thapar University, Patiala, India, 50 pp.

Hu, M. and Liu, B. (2004), “Mining and summarizing customer reviews”, University of Illinois at Chicago, Illinois, USA, pp. 1-5.

Hutto, C. and Gilbert, E. E. (2014), “VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text”, ICWSM-14, Eighth International AAAI Conference on Weblogs and Social Media, Michigan, USA, pp. 1-10.

Mikolov, T., Chen, K., Corrado, G. and Dean, J. (2013), “Efficient Estimation of Word Representations in Vector Space”, Google Inc, Mountain View, USA, 12 pp.

Partey, A. P. (2018), “Implications of a double-track school calendar on SHS”, https://www.myjoyonline.com/opinion/2018/July-23rd/implications-of-a-double-track-school-calendar-on-shs.php/. Accessed: October 1, 2018.

Pang, B. and Lee, L. (2002), “Thumbs up? Sentiment Classification using Machine Learning Techniques”, EMNLP 2002, Proceedings of the ACL-02 Conference on Empirical methods in natural language processing, Stroudsburg, USA, Vol. 10, pp. 79-86.

Pang, B. and Lee, L. (2008), “Opinion mining and sentiment analysis”, Foundations and Trends in Information Retrieval, Hanover, USA, 135 pp.

Pennington, J., Socher, R. and Manning, C. D. (2013), “GloVe: Global Vectors for Word Representation”, Stanford University, Stanford, USA, 12 pp.

Shamseera, S. P. and Sreekanth, E. S. (2016), “Word Vectors in Sentiment Analysis”, International Journal of Current Trends in Engineering & Research (IJCTER), Vol. 2, No. 5, pp. 594-598.

Sunil, R. (2017), “An Intuitive Understanding of Word Embeddings: From Count Vectors to Word2Vec”, https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/. Accessed: January 10, 2018.

Turney, P. (2002), “Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews”, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Stroudsburg, USA, pp. 417-424.

Turney, P. and Pantel, P. (2010), “From Frequency to Meaning: Vector Space Models of Semantics”, Journal of Artificial Intelligence Research, Vol. 37, No. 1, pp. 141-188.

Wilson, T., Wiebe, J. and Hoffmann, P. (2005), “Recognizing contextual polarity in phrase-level sentiment analysis”, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Vancouver, Canada, pp. 347-354.

Rong, X. (2016), “Word2Vec Parameter Learning Explained”, arXiv:1411.2738v4, 21 pp.

Sahlgren, M. (2008), “The Distributional Hypothesis”, Italian Journal of Linguistics, Vol. 20, No. 1, pp. 20-21.

