Singleton Review Spam Detection Using Semantic Similarity

Hembram, Punya Prava (2016) Singleton Review Spam Detection Using Semantic Similarity. MTech thesis.

PDF (Full text is restricted up to 03.04.2020)

Abstract

Online reviews play a vital role in today’s online shopping world. Consumers continuously review products online. As a result, websites containing user reviews are targeted by opinion (review) spammers. Because spam reviews are spread widely across e-commerce websites, consumers can easily be misled into buying low-quality products. Conversely, reputable stores can be denigrated by false negative reviews. It has been observed that, in real life, a very large portion of consumers, more than 90%, write only a single review. A review written by a person who has written only one review is known as a singleton review. Singleton reviews are so numerous that they can determine the rating and impression of a store.
So far, existing methods have generally ignored these singleton reviewers. To address this problem, we observe that the same user writes many fake reviews under different profile names, and we aim to use semantic similarity metrics to compute the relatedness between these one-time review writers. This offers a new point of view on opinion spam detection. The method is meant to capture more subtle information than a simple text similarity measure can. It focuses on spam reviewers who use different anonymous profiles to post reviews by reusing a review they have already written, replacing the main feature words with synonyms while keeping the overall sentiment of the review the same.
The method is based on an obvious but important assumption: the imagination of any human being, spammers included, is limited. Eventually they will run out of ideas and will be unable to describe an imaginary experience differently in every review. Spammers are therefore very likely to reuse the contents of their previous reviews.
In this thesis we discuss a completely different solution for detecting review spam written by one-time review writers using semantic similarity. We also propose a modified version of the above-mentioned semantic similarity measure, test this hypothesis on real-life reviews, and compare the proposed method with existing vectorial similarity models.
Experimental results show that semantic similarity can outperform the vectorial model in detecting fraudulent reviews by capturing finer textual clues. The precision of the review classifier was high enough to make the method viable for integration into a production detection system.
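
The contrast between vectorial and semantic similarity described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's method: the synonym table `SYNONYMS`, the function names, and the two sample reviews are all hypothetical, and a real system would draw on a lexical resource such as WordNet instead of a hand-written map. The sketch only shows why a plain bag-of-words cosine can miss a review that was reposted with its feature words swapped for synonyms.

```python
import math
from collections import Counter

# Hypothetical toy synonym table mapping each word to a canonical form.
# Assumption for illustration only; not taken from the thesis.
SYNONYMS = {
    "great": "good", "excellent": "good", "good": "good",
    "quick": "fast", "speedy": "fast", "fast": "fast",
    "shipping": "delivery", "delivery": "delivery",
    "shop": "store", "store": "store",
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (word.strip(".,!?").lower() for word in text.split())
    return [t for t in tokens if t]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vectorial_similarity(review_a, review_b):
    """Plain vector-space similarity: only exact word matches count."""
    return cosine(tokenize(review_a), tokenize(review_b))


def semantic_similarity(review_a, review_b):
    """Map each word to its canonical synonym before comparing."""
    def canonical(tokens):
        return [SYNONYMS.get(t, t) for t in tokens]
    return cosine(canonical(tokenize(review_a)), canonical(tokenize(review_b)))


# Two hypothetical singleton reviews: the second rewrites the first with synonyms.
review_a = "Great store, fast delivery!"
review_b = "Excellent shop, quick shipping!"

print(vectorial_similarity(review_a, review_b))
print(semantic_similarity(review_a, review_b))
```

In this toy case the two reviews share no exact words, so the vectorial score is zero, while the synonym-aware score treats them as the same review. A threshold on the second score would flag the pair as likely written by one author under two profiles.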

Item Type:	Thesis (MTech)
Uncontrolled Keywords:	Opinion Spam; Singleton Reviews; Fake Review Detection; Vectorial Similarity; Semantic Similarity
Subjects:	Engineering and Technology > Computer and Information Science
Divisions:	Engineering and Technology > Department of Computer Science
ID Code:	9106
Deposited By:	Mr. Sanat Kumar Behera
Deposited On:	04 Apr 2018 20:34
Last Modified:	04 Apr 2018 20:34
Supervisor(s):	Babu, Korra Sathya
