Review Spam Detection Using Machine Learning Techniques

Narayan, Rohit (2016) Review Spam Detection Using Machine Learning Techniques. MTech thesis.

PDF (Full text is restricted up to 19/08/2019)
Restricted to Repository staff only

980Kb

Abstract

With the growing popularity of the internet, online marketing is becoming increasingly common, because many products and services are readily available online. Reviews of these products and services are therefore very important to customers and organizations alike. Unfortunately, driven by the desire for profit or promotion, fraudsters produce fake reviews. These fake reviews prevent customers and organizations from reaching accurate conclusions about the products. Hence, fake reviews, or review spam, must be detected and eliminated so that potential customers are not deceived. In our work, supervised and semi-supervised learning techniques have been applied to detect review spam. The most appropriate data sets in the research area of review spam detection have been used in the proposed work. For supervised learning, we try to obtain feature sets from different automated approaches, such as LIWC, POS tagging, and n-grams, that can best distinguish spam from non-spam reviews. Along with these features, sentiment analysis, data mining, and opinion mining techniques have also been applied. For semi-supervised learning, the PU-learning algorithm is used with six different classifiers (Decision Tree, Naive Bayes, Support Vector Machine, k-Nearest Neighbor, Random Forest, Logistic Regression) to detect review spam in the available data set. Finally, the proposed technique is compared with some existing review spam detection techniques.
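
The semi-supervised step can be illustrated with a minimal sketch of PU-learning (learning from positive and unlabeled examples). The abstract does not say which PU-learning variant the thesis uses, so this sketch assumes one common iterative scheme: treat all unlabeled reviews as non-spam, train a classifier, drop the unlabeled reviews it scores as spam, and repeat until the set of reliable negatives stops shrinking. The classifier here is a small hand-written Naive Bayes over word unigrams and bigrams. The names `ngrams`, `NaiveBayes`, `pu_learn` and the toy reviews are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams (unigrams and bigrams by default), lowercased."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.counts = {0: Counter(), 1: Counter()}  # 1 = spam, 0 = non-spam
        self.doc_totals = Counter(labels)
        for doc, y in zip(docs, labels):
            self.counts[y].update(ngrams(doc))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def log_odds(self, doc):
        """Log-odds of spam versus non-spam; a value above 0 means spam."""
        n_docs = sum(self.doc_totals.values())
        score = {}
        for y in (0, 1):
            total = sum(self.counts[y].values()) + len(self.vocab)
            s = math.log((self.doc_totals[y] + 1) / (n_docs + 2))
            for f in ngrams(doc):
                if f in self.vocab:  # features unseen in training are ignored
                    s += math.log((self.counts[y][f] + 1) / total)
            score[y] = s
        return score[1] - score[0]


def pu_learn(positives, unlabeled, max_iter=10):
    """Assumed PU-learning loop: shrink `unlabeled` to reliable negatives."""
    negatives = list(unlabeled)
    for _ in range(max_iter):
        docs = positives + negatives
        labels = [1] * len(positives) + [0] * len(negatives)
        model = NaiveBayes().fit(docs, labels)
        kept = [d for d in negatives if model.log_odds(d) <= 0]
        if len(kept) == len(negatives) or not kept:
            break  # converged, or nothing left to treat as negative
        negatives = kept
    return model, negatives


# Toy data, invented for illustration: known spam plus unlabeled reviews.
positives = [
    "best hotel ever amazing amazing stay",
    "amazing hotel best deal ever",
]
unlabeled = [
    "amazing best hotel ever",
    "room was small and noisy",
    "staff was polite but room was cold",
]

model, reliable_negatives = pu_learn(positives, unlabeled)
print(reliable_negatives)
print(model.log_odds("amazing best hotel ever") > 0)
```

Any of the six classifiers named in the abstract could stand in for `NaiveBayes` here, provided it exposes a score that can be thresholded. The other feature sets the abstract mentions (LIWC, POS tags) would replace or extend `ngrams`.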

Item Type:Thesis (MTech)
Uncontrolled Keywords:Review spam; Opinion mining; Sentiment analysis; Machine learning
Subjects:Engineering and Technology > Computer and Information Science > Information Security
Divisions: Engineering and Technology > Department of Computer Science
ID Code:8587
Deposited By:Mr. Sanat Kumar Behera
Deposited On:20 Aug 2017 15:20
Last Modified:20 Aug 2017 15:20
Supervisor(s):Jena, Sanjay Kumar
