A comparative evaluation of sentiment analysis techniques on Facebook data using three machine learning algorithms: Naïve Bayes, maximum entropy and support vector machines
The rapid growth and popularity of social networks have led to the creation of vast amounts of textual data, often in an unstructured, fragmented and informal form. Huge volumes of electronic data in the form of reviews, customer feedback, elicited surveys, unsolicited comments, suggestions and criticisms are generated daily, making it difficult for institutions, government bodies, companies and other organizations to react to feedback quickly, as they lack the capacity to handle such volumes. While recent NLP-based sentiment analysis has centered on Twitter and on product or service reviews, we believe the emotion in Facebook status messages can be classified more accurately owing to their nature: status messages are more concise than reviews yet, unlike tweets, are not restricted to a small number of characters, which allows for better writing and a more accurate portrayal of emotions. In this study, we perform sentiment analysis on Facebook by fetching posts and extracting their content. We then tokenize the data to extract keyword combinations and perform feature selection to retain only the n-grams that are important for the classification problem. Finally, we train our classifiers to identify the polarity of each post, i.e. whether it is positive, negative or neutral. We analyze the suitability of various approaches to NLP sentiment analysis by comparing the performance of the Naïve Bayes classifier, the maximum entropy classifier and support vector machines. We observe that the feature selection technique has a significant impact on performance: the presence of trigram and bigram information produced better results with all three algorithms than unigrams did. We attribute this to the fact that trigrams and bigrams are better at capturing sentiment patterns, whereas unigrams merely provide good coverage of the data.
Trigrams achieved the highest overall performance in all instances, with an accuracy of 82.6%, while unigrams achieved the lowest accuracy of 73.8%. However, as statements became long-winded and contained contradictory phrases, the classifiers performed poorly. This shows that the feature selection method alone does not determine the performance of an algorithm; more advanced NLP techniques may be required to address this shortcoming.