Prediction of N-Gram Language Models Using Sentiment Analysis on Social Campaigning Programme

Abstract

Sentiment analysis is a branch of Natural Language Processing. Its core idea is to extract insights from texts or sentences that express reviews or opinions about a product or service. These opinions are collected from platforms such as social media, online surveys, online shopping applications, and blogs. The process broadly consists of collecting the reviews or opinions, pre-processing the text, and classifying it to determine whether its polarity is positive, negative, or neutral. The main objective of this research work is to apply sentiment analysis to an e-learning review dataset. To attain this objective, we predict which n-gram model is best suited to feature extraction with machine learning algorithms.

Keywords: E-learning, Sentiment Analysis, Feature Selection, Social

I. INTRODUCTION

Written sentiment expresses the mood of its author or the author's view of a specific entity. The web is rich in such expressions, as Web 2.0 technologies allow ordinary users to comment and post their views about anything online. Sentiment analysis over social media enables the extraction of essential conclusions about average mass opinion on a diversity of topics, though it brings severe technical challenges [1]. This is due to the sparse, noisy, multilingual content that social media users post online. In e-learning, sentiment analysis applies automatic text analysis to extract views and recognize the wide array of sentiments expressed in e-learning blogs and forums, where learners discuss their individual opinions and assessments of the services offered. Current techniques depend on recognizing patterns in free text that express an emotion. These patterns include character n-grams, which are grounded in the n-gram model. This paper seeks to predict the best-fit n-gram model for finding out how people choose their e-learning providers, as reflected in their positive opinions, and what the e-learning providers should improve, as reflected in negative reviews, by applying concepts from Natural Language Processing.
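To make the notion of an n-gram concrete, the following minimal sketch extracts character n-grams and word n-grams from a short review. The function names char_ngrams and word_ngrams, the whitespace tokenization, and the sample sentence are assumptions made for illustration; they are not taken from the experiments reported in this paper.

```python
def char_ngrams(text, n):
    """Return every overlapping run of n characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return every overlapping run of n words in text.

    Tokenization is a plain lowercase whitespace split (an assumption);
    n=1 gives unigrams, n=2 bigrams, n=3 trigrams.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    review = "not very helpful course"
    print(char_ngrams("good", 2))
    print(word_ngrams(review, 1))
    print(word_ngrams(review, 2))
```

A bigram such as ("not", "very") keeps local word order that a unigram model discards, which is one reason the choice of n can matter for polarity detection.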


II. RELATED WORKS

E-learning is closely linked with distance learning. E-learning can be defined [2] as a form of education in which students work at home and communicate with teachers and other learners through e-mail, videoconferencing, electronic forums, bulletin boards, and other computer-based means of communication.

For any e-learning system to thrive, it has to satisfy particular conditions and exhibit certain characteristics: flexibility, which allows the system to adapt to the capabilities and goals of each user, including tutor and learner [3]; interactivity [4]; and a solid infrastructure that supports the system and offers learners easy and fast access to it.

Pang et al. [5] presented pertinent work grounded in classic topic categorization approaches. The proposed approach tests whether a chosen group of machine learning algorithms can produce good results when sentiment analysis is treated as document topic classification with two topics: negative and positive. [5] reports the results of experiments with the Maximum Entropy, Naïve Bayes, and Support Vector Machine algorithms. The tests demonstrated results comparable to other solutions, ranging from 71% to 85% depending on the technique and test dataset.

Similarly, [6] applied the same classifiers to categorize film reviews and blog documents covering movies and vehicles, based on particular features (unigrams, unigrams + subjectivity, bigrams, and adjectives). The highest precision was attained by Maximum Entropy with the unigrams + subjectivity features, while Naïve Bayes (NB) proved to be the fastest. [7] evaluates the polarity of a film review as either negative or positive using the following features: unigrams, bigrams, trigrams, dependency relationships (subject-verb, verb-object), and the polarity of adjectives. It indicated that, in addition to bigrams and trigrams, incorporating the subjective features enhances accuracy. However, filtering out the objective features does not improve the results. [8] reported experiments with distinct machine learning algorithms on a variety of online product reviews. The results demonstrated that a discriminative classifier combined with high-order n-grams as features could attain better performance.

In contrast, [9] combines rule-based classification (for instance, rule generation algorithms and statistics) with supervised learning using support vector machines in a new combined technique. This method was tested on film reviews, product reviews, and content from the MySpace site. The results demonstrated that hybrid classification can improve classification effectiveness. [10] presents a new technique for sentiment classification based on the integration of Hidden Markov Models (HMM) and Support Vector Machines (SVM). The experimental results demonstrated that the combined approaches, with different combination rules, outperform the individual classifiers. [11] applied three supervised machine learning algorithms, Naïve Bayes, SVM, and the feature-based N-gram model, to online reviews of seven famous travel destinations in the world. [11] demonstrated that the SVM model and the Naïve Bayes method [12] attained lower performance than the feature-based N-gram model [13].

III. PROPOSED WORK

This paper proposes a learning-based sentiment classification algorithm that categorizes learner opinions about the e-learning system service as positive or negative in order to enhance the system's performance. In this work, three traditional feature selection methods, Information Gain (IG), Mutual Information (MI), and the chi-square statistic (CHI), are explored and extended alongside a hybrid learning method based on HMM and SVM [14] [15] [16]. Experiments were carried out on an e-learning corpus of 2000 documents.
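As an illustration of one of these feature selection methods, the sketch below computes the CHI score of a term for a category from a two-by-two contingency table, using the standard form of the statistic for text categorization described in [15]: with a documents of the category containing the term, b documents outside the category containing it, c documents of the category without it, d documents outside the category without it, and N = a + b + c + d, the score is N(ad - cb)^2 / ((a + c)(b + d)(a + b)(c + d)). The helper names contingency_counts and chi_square and the toy corpus are assumptions for illustration, not the implementation used in this work.

```python
def contingency_counts(docs, labels, term, category):
    """Count documents by (contains term?, belongs to category?).

    docs is a list of token lists and labels the matching class labels.
    Returns (a, b, c, d) as defined in the accompanying text.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_category = label == category
        if has_term and in_category:
            a += 1
        elif has_term:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a, b, c, d):
    """Chi-square score of a term for a category; 0.0 if undefined."""
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator


if __name__ == "__main__":
    # Toy corpus (hypothetical data, for illustration only).
    docs = [["great", "course"], ["great", "tutor"],
            ["poor", "course"], ["poor", "support"]]
    labels = ["pos", "pos", "neg", "neg"]
    for term in ("great", "course"):
        counts = contingency_counts(docs, labels, term, "pos")
        print(term, chi_square(*counts))
```

Terms whose presence is independent of the class score zero, so ranking terms by this score and keeping the top ones is one simple way to reduce the feature set before training.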

IV. METHODOLOGY

Sentiment classification aims to assign a document to a category from a predefined set of categories. Here the predefined categories are the sentiment classes negative and positive, a key distinction from topic-oriented text classification. In supervised machine learning, a trained statistical classifier is deployed for sentiment classification [17]. The trained classifier predicts the sentiment orientation of input documents. A standard bag-of-features framework was used to apply the machine learning algorithms. The work sought to clarify whether a feature selection method is effective for sentiment classification of e-learning reviews and whether the assessment of e-learning blogs is a difficult task for sentiment classification.
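The bag-of-features idea and the train-then-predict workflow can be sketched as follows. Note that this sketch swaps in a multinomial Naïve Bayes classifier with add-one smoothing as a simple stand-in; it is not the HMM and SVM hybrid proposed in this paper. The class name NaiveBayesSentiment, the whitespace tokenization, and the four training reviews are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def bag_of_features(text):
    """Map a document to word counts, ignoring word order."""
    return Counter(text.lower().split())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            bag = bag_of_features(doc)
            self.word_counts[label].update(bag)
            self.vocab.update(bag)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        bag = bag_of_features(doc)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word, count in bag.items():
                if word not in self.vocab:
                    continue  # words never seen in training are skipped
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical training reviews, for illustration only.
    train_docs = ["great course clear lectures", "helpful tutor great content",
                  "boring course poor support", "poor content confusing lectures"]
    train_labels = ["positive", "positive", "negative", "negative"]
    model = NaiveBayesSentiment().fit(train_docs, train_labels)
    print(model.predict("great helpful lectures"))
    print(model.predict("poor confusing support"))
```

Replacing bag_of_features with a bigram or trigram extractor changes only the feature representation, which is the kind of comparison between n-gram models that this work is concerned with.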

Learning algorithms cannot handle texts directly, which is why a preliminary phase known as preprocessing is required. Text must be processed before classification: preprocessing transforms documents into a representation suitable for the classification stage. The preprocessing steps compared are those pertinent to stemming and feature selection [18]. The training and test corpora are pre-processed in the same manner.

Stop words, regarded as noise words, are of little importance in the context of sentiment classification. Here, a stop list has been established consisting primarily of English pronouns [19], special characters, and numbers.
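A minimal sketch of this filtering step is given below. The pronoun list is a small illustrative subset chosen here, not the stop list actually used in the experiments, and stemming is omitted. Keeping only alphabetic runs is a simplification that discards digits and special characters but also splits contractions such as "don't" into two tokens.

```python
import re

# Illustrative subset of English pronouns (an assumption, not the paper's list).
STOP_WORDS = {
    "i", "me", "my", "we", "us", "our", "you", "your",
    "he", "him", "his", "she", "her", "it", "its",
    "they", "them", "their",
}


def preprocess(text):
    """Lowercase, drop numbers and special characters, remove stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("I liked it: 10/10, they explained everything!"))
```

The same function would be applied to both the training and the test documents so that the two corpora share one representation.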

V. CONCLUSION

As shown, employing sentiment analysis to assess the state and structure of web forums and e-learning blogs is a significant endeavor. The present accuracy is promising for effective assessment of the sentiment of forum conversations. Such an assessment can help in understanding users' opinions about the e-learning system with a view to improving it. The study has shown the promise of integrating sentiment classification into e-learning. There are specific challenges in mining e-learning reviews and examining e-learning blogs, which make the task difficult and add to its complexity. This factor was at the origin of the loss of precision. Ultimately, the present paper opens doors for future research. In the future, data scientists should combine some of these feature selection methods, refine pre-processing, handle misspelled phrases, and apply distinct linguistic approaches [20] in the classification process.

VI. REFERENCES

[1] “Sentiment Analysis – A Review”, International Journal of Science and Research (IJSR), vol. 4, no. 12, pp. 1842-1845, 2015. Available: 10.21275/v4i12.nov152437.

[2] Z. Pozgaj, B. Knezevic, “E-Learning: Survey on Students’ Opinions” Information Technology Interfaces, 2007. ITI 2007. pp: 381 – 386

[3] L. C. Seng, T. T. Hok; “Humanizing E-learning” 2003 International Conference on Cyberworlds, 2003; Singapore.

[4] H. Giroire, F. Le Calvez, G. Tisseau; “Benefits of knowledge-based interactive learning environments: A case in combinatorics”; Proceedings of the Sixth International Conference on Advanced Learning Technologies, 2006. pp:285-289.

[5] B. Pang, L. Lee and S. Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), Philadelphia, PA, USA, (2002), pp:79–86.

[6] E. Boiy, P. Hens, K. Deschacht and M.F. Moens. 2007. Automatic Sentiment Analysis in On-line Text. In Proceedings of ELPUB2007 Conference on Electronic Publishing – Vienna, Austria (June 2007).

[7] V. Ng, S. Dasgupta, and S.M. Niaz Arifin. 2006. Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews. Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, Sydney, (July 2006), pp:611–618

[8] H. Cui, V. Mittal and M. Datar. 2006. Comparative experiments on sentiment classification for online product reviews. In Proceedings of AAAI-2006.

[9] R. Prabowo and M. Thelwall. 2009. Sentiment analysis: A combined approach. Journal of Informetrics Volume 3, Issue 2, (April 2009), pp: 143-157.

[10] Z. Kechaou, A. Wali, M. Ben Ammar and A. M. Alimi, “Novel Hybrid Method for Sentiment Classification of movie reviews,” in The 6th International Conference on Data Mining, 12-15 July 2010, Las Vegas, Nevada USA, pp: 415-421.

[11] Q. Ye, Z. Zhang, R. Law: Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst. Appl. 36(3): 6527-6535 (2009).

[12] H. Binali, V. Potdar, and C. Wu, “A State Of The Art Opinion Mining And Its Application Domains,” in Proceedings of ICIT09, 2009.

[13] D. Song; H. Lin; Z. Yang; “Opinion Mining in e-Learning System”. 2007 IFIP International Conference on Network and Parallel Computing – Workshops. pp: 788 – 792

[14] M. F. Porter. 1980. An Algorithm for Suffix Stripping. Program, vol 14, (1980), pp. 130-137.

[15] Y. Yang, J. O. Pedersen (1997). A comparative study on feature selection in text categorization. ICML, 412–420.

[16] L. Galavotti, F. Sebastiani, M. Simi (2000). Feature selection and negative evidence in automated text categorization. In Proceedings of KDD

[17] “Feature Extraction for Sentiment Classification on Twitter Data”, International Journal of Science and Research (IJSR), vol. 5, no. 2, pp. 2183-2189, 2016. Available: 10.21275/v5i2.nov161677.

[18] “Sentiment Classification using Machine Learning Techniques”, International Journal of Science and Research (IJSR), vol. 5, no. 4, pp. 819-821, 2016. Available: 10.21275/v5i4.nov162724.

[19] Zheng and G. Li, “Identifying Negative Sentiment with Sentiment Based LDA and Support Vector Machine Classification”, International Journal of Control and Automation, vol. 9, no. 9, pp. 331-342, 2016. Available: 10.14257/ijca.2016.9.9.32.

[20] Z. Kechaou, M. Benammar and M. A. Alimi. 2010. A new linguistic approach to sentiment automatic processing. The 9th IEEE International Conference on Cognitive Informatics, 7-9 July, Beijing pp:265-272
