Study of Distance-Based Outlier Detection Methods

Sethi, Jyoti Ranjan (2013) Study of Distance-Based Outlier Detection Methods. BTech thesis.

Abstract

An outlier is an observation that differs from the others in a sample. Anomalies occur in most datasets, usually because of measurement error. Anomaly detection identifies the data in a given dataset that do not show normal behavior. It can be classified into three categories: unsupervised, supervised and semi-supervised anomaly detection. Anomaly detection is used in a variety of domains, such as fault detection, fraud detection, health monitoring systems and intrusion detection. Outlier detection methods can be grouped into five main categories: statistical-based, depth-based, clustering-based, distance-based and density-based approaches. The distance-based methods discussed are the Index-based algorithm, the Nested-loop algorithm and LDOF. To reduce the false positive error in LDOF, we proposed the MLDOF algorithm. We tested LDOF and MLDOF by implementing them on several large, high-dimensional real datasets obtained from the UCI Machine Learning Repository. The experiments show that MLDOF improves the accuracy of anomaly detection with respect to LDOF and reduces the false positive error.
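
For readers unfamiliar with LDOF, the following is a minimal sketch of the score as it is commonly defined in the literature: the average distance from a point to its k nearest neighbours, divided by the average pairwise distance among those neighbours. It is not the thesis implementation, and it does not attempt to reproduce MLDOF, whose details are not given in this abstract. The function name `ldof`, the brute-force neighbour search and the toy data are assumptions made for illustration only.

```python
import math
from itertools import combinations


def ldof(points, k):
    """Return an LDOF score for every point (illustrative brute-force version)."""
    n = len(points)
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < len(points)")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        knn = others[:k]
        # Average distance from p to its k nearest neighbours.
        d_knn = sum(d for d, _ in knn) / k
        # Average pairwise distance among those k neighbours.
        pairs = list(combinations([j for _, j in knn], 2))
        d_inner = sum(math.dist(points[a], points[b]) for a, b in pairs) / len(pairs)
        # Coincident neighbours give a zero denominator; treat as infinite score.
        scores.append(d_knn / d_inner if d_inner > 0 else math.inf)
    return scores


# Toy data (made up): a tight square of four points plus one distant point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = ldof(data, k=3)
most_outlying = scores.index(max(scores))
```

On this toy data the distant point receives by far the largest score, while the points inside the square score close to 1. In practice a threshold or a top-n cut on the scores decides which points are reported as outliers, and that choice is what drives the false positive rate the abstract refers to.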

Item Type:Thesis (BTech)
Uncontrolled Keywords:Anomaly; LDOF; Index-based; Nested-Loop.
Subjects:Engineering and Technology > Computer and Information Science > Data Mining
Divisions: Engineering and Technology > Department of Computer Science
ID Code:5182
Deposited By:Hemanta Biswal
Deposited On:10 Dec 2013 16:11
Last Modified:20 Dec 2013 15:51
Supervisor(s):Patra, B K
