Plagiarism Detection using Enhanced Relative Frequency Model

Reddy, Kotha Dinesh (2013) Plagiarism Detection using Enhanced Relative Frequency Model. MTech thesis.

Abstract

As technology advances, it is becoming harder to protect data from being copied, so detecting copied content is often more practical than trying to prevent copying. Here, content covers digital documents such as scientific research papers, newspaper and journal articles, and assignments submitted by students. Many tools and algorithms exist to detect plagiarism, but the time complexity of the algorithm matters greatly when a document must be compared against a very large data set. Vector-based methods are frequently used in plagiarism detection, but the existing ones have drawbacks. In the SCAM approach, the choice of the 'e' (epsilon) value is a drawback, because 'e' determines the closeness set, and the Daniel approach fails to identify plagiarism when terms are repeated in a sentence. We propose a new algorithm, developed using the concepts of the Relative Frequency Model, that overcomes the drawbacks of these existing methods. Our implementation of the proposed method employs a sentence splitter, stop-word removal, and word stemming.
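
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of a sentence splitter, stop-word removal, a stemming step, and a relative-frequency comparison in the style of the original SCAM model. It is not the thesis's enhanced algorithm, which the abstract does not specify. The stop-word list, the suffix-stripping stemmer, the function names, and the default epsilon are all illustrative assumptions, and term weights are taken to be 1.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and",
              "on", "for", "with", "was", "were", "it", "this", "that"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def crude_stem(word):
    """Very rough suffix stripping, standing in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def term_frequencies(sentence):
    """Lowercase, tokenize, drop stop words, stem, and count the remaining terms."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


def subset_measure(f1, f2, epsilon):
    """Asymmetric SCAM-style subset measure with all term weights set to 1."""
    # Closeness set: shared terms whose frequency ratios sum to less than epsilon.
    closeness = [w for w in f1
                 if w in f2 and epsilon - (f1[w] / f2[w] + f2[w] / f1[w]) > 0]
    denom = sum(c * c for c in f1.values())
    if denom == 0:
        return 0.0
    return sum(f1[w] * f2[w] for w in closeness) / denom


def rfm_similarity(s1, s2, epsilon=2.5):
    """Similarity in [0, 1]: the larger of the two subset measures, capped at 1."""
    f1, f2 = term_frequencies(s1), term_frequencies(s2)
    return min(1.0, max(subset_measure(f1, f2, epsilon),
                        subset_measure(f2, f1, epsilon)))
```

In this sketch, comparing "data data data data model" with "data model" gives 0.5 at epsilon = 2.5, because the repeated term falls outside the closeness set, and 1.0 at epsilon = 5.0. This illustrates how strongly the result can depend on the choice of epsilon, the drawback the abstract attributes to SCAM.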

Item Type: Thesis (MTech)
Uncontrolled Keywords: Information Retrieval; Plagiarism Detection; Copy Detection; Relative Frequency Model; Enhanced Relative Frequency Model
Subjects: Engineering and Technology > Computer and Information Science > Data Mining
Divisions: Engineering and Technology > Department of Computer Science
ID Code: 5401
Deposited By: Hemanta Biswal
Deposited On: 19 Dec 2013 11:08
Last Modified: 20 Dec 2013 10:31
Supervisor(s): Babu, K S
