Harnessing Twitter for Automatic Sentiment Identification

Dash, Amiya Kumar (2015) Harnessing Twitter for Automatic Sentiment Identification. MTech thesis.



Sentiment analysis is an active area of research because of its applications in many fields. Gathering people’s opinions about products, social and political events, and problems through the web is becoming increasingly common. These opinions are valuable to the public and to stakeholders when making decisions. Opinion mining retrieves such information from search engines, web blogs, micro-blogs, Twitter, and social networks. User-generated content on Twitter provides an ample source for gathering individuals’ opinions. Because of the enormous number of tweets, which take the form of unstructured text, it is difficult to summarize the information manually. Accordingly, efficient computational methods are required for mining and summarizing tweets from corpora, which in turn requires knowledge of sentiment-bearing words. Many computational methods, models, and algorithms exist for identifying sentiment in unstructured text. Most of them rely on machine-learning techniques that use a Bag-of-Words (BoW) representation as their basis. In this study, we use a lexicon-based approach for automatic identification of the sentiment of tweets collected from the Twitter public domain. We also apply three machine-learning algorithms (Naive Bayes (NB), Maximum Entropy (ME), and Support Vector Machines (SVM)) to sentiment identification of tweets in order to examine the effectiveness of various feature combinations. Our experiments demonstrate that both NB with Laplace smoothing and SVM are effective in classifying the tweets. The features used for NB are unigrams and Part-of-Speech (POS) tags, whereas only unigrams are used for SVM.
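To make the role of Laplace smoothing concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over unigram Bag-of-Words features. It is an illustration under stated assumptions, not the thesis's implementation: the whitespace tokenizer, the function names `tokenize`, `train_nb`, and `predict_nb`, the smoothing constant `alpha=1.0`, and the four toy tweets are all hypothetical, and POS features are omitted.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization. A real tweet pipeline
    # would also handle hashtags, mentions, URLs, and emoticons.
    return text.lower().split()


def train_nb(labeled_tweets):
    """Collect class counts and per-class unigram counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text, alpha=1.0):
    """Return the label with the highest log-posterior under add-alpha smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        # Smoothed denominator: tokens seen in this class plus alpha per vocabulary word.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training are ignored
            # Laplace smoothing keeps unseen (class, word) pairs from zeroing the product.
            score += math.log((word_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
toy_data = [
    ("i love this phone", "pos"),
    ("great battery and great screen", "pos"),
    ("i hate this phone", "neg"),
    ("terrible battery and awful screen", "neg"),
]
model = train_nb(toy_data)
```

Without smoothing, a word such as "great" that never appears in the negative class would give that class a probability of zero regardless of the other words in the tweet. Adding `alpha` to every count avoids this while still favoring the class in which the word was actually observed.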

Item Type: Thesis (MTech)
Uncontrolled Keywords: Bag-of-Words (BoW), Lexicon, Machine Learning Algorithms, Laplace Smoothing, Part-of-Speech (POS)
Subjects: Engineering and Technology > Computer and Information Science > Data Mining
Divisions: Engineering and Technology > Department of Computer Science
ID Code: 7746
Deposited By: Mr. Sanat Kumar Behera
Deposited On: 29 May 2016 11:38
Last Modified: 29 May 2016 11:38
Supervisor(s): Jena, S K
