Data preparation and pattern discovery for web usage mining

Bhalla, Karan and Prasad, Deepak (2007) Data preparation and pattern discovery for web usage mining. BTech thesis.


Abstract

The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and simply navigating through a Web site has increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis includes straightforward statistics, such as page access frequency, as well as more sophisticated forms of analysis, such as finding the common traversal paths through a Web site. Web Usage Mining is the application of data mining techniques to the usage logs of large Web data repositories in order to produce results that can be used in the design tasks mentioned above. However, these server logs cannot be used directly for pattern discovery and analysis. Several preprocessing tasks must be performed before data mining algorithms can be applied to the data collected from server logs. The objective of this paper is to discuss several data preparation techniques for identifying unique users and user sessions. New heuristics for identifying user sessions are proposed. The data mining algorithms that can be applied to the processed data to discover patterns and rules are also discussed. Based on implementations of these algorithms, a comparative analysis of some of them is plotted on a two-dimensional graph.
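The session identification step described above can be illustrated with a minimal sketch. It is not the heuristic proposed in the thesis: it assumes the commonly used approach of treating each (IP address, user agent) pair as one user and starting a new session whenever the gap between consecutive requests exceeds a fixed timeout. The 30-minute threshold, the `sessionize` function name, and the sample records are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity gap ends a session. This is a
# conventional default, not a value taken from the thesis.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Split (ip, agent, timestamp, url) records into per-user sessions.

    A user is approximated by the (ip, agent) pair. Returns a list of
    (user, [url, ...]) tuples, one per session.
    """
    # Group hits by the approximated user identity.
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        by_user[(ip, agent)].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda hit: hit[0])  # order each user's hits by time
        last_ts, first_url = hits[0]
        current = [first_url]
        for ts, url in hits[1:]:
            if ts - last_ts > timeout:
                # Gap too long: close the current session, start a new one.
                sessions.append((user, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((user, current))
    return sessions


# Hypothetical, already-parsed log records.
t0 = datetime(2007, 5, 1, 10, 0)
records = [
    ("10.0.0.1", "Mozilla", t0, "/index.html"),
    ("10.0.0.1", "Mozilla", t0 + timedelta(minutes=5), "/products.html"),
    ("10.0.0.1", "Mozilla", t0 + timedelta(minutes=50), "/index.html"),
    ("10.0.0.2", "Opera", t0 + timedelta(minutes=1), "/index.html"),
]
sessions = sessionize(records)
```

With these sample records, the first user's third request arrives 45 minutes after the second, so it opens a new session. Real server logs would additionally need parsing and cleaning (for example, dropping image and robot requests) before this step.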

Item Type: Thesis (BTech)
Uncontrolled Keywords: WWW, Web site, Web data
Subjects: Engineering and Technology > Computer and Information Science
Divisions: Engineering and Technology > Department of Computer Science
ID Code: 4208
Deposited By: Hemanta Biswal
Deposited On: 25 Jun 2012 15:48
Last Modified: 25 Jun 2012 15:48
Supervisor(s): Jena, S K
