A Framework for Vision-based Static Hand Gesture Recognition

Ghosh, Dipak Kumar (2016) A Framework for Vision-based Static Hand Gesture Recognition. PhD thesis.



Efficient human-computer interaction (HCI) and human alternative and augmentative communication (HAAC) have become essential in everyday life. Hand gesture recognition is one of the most important techniques for building a gesture-based interface for HCI or HAAC applications. A suitable gesture recognition method is therefore necessary to design advanced hand gesture recognition systems for applications such as robotics, assistive systems, sign language communication, and virtual reality. However, variation in the illumination, rotation, position, and size of gesture images, together with efficient feature representation and classification, are the main challenges in developing a real-time gesture recognition system. The aim of this work is to develop a framework for vision-based static hand gesture recognition that overcomes the challenges of illumination, rotation, size, and position variation in gesture images. The framework developed in this thesis consists of preprocessing, feature extraction, feature selection, and classification stages. The preprocessing stage involves the following sub-stages: image enhancement, which compensates for illumination variation; segmentation, which separates the hand region from the background and transforms it into a binary silhouette; image rotation, which makes the segmented gesture rotation invariant; and filtering, which removes background noise and object noise from the binary image and provides a well-defined segmented hand gesture. This work proposes an image rotation technique that aligns the first principal component of the segmented hand gesture with the vertical axis to make it rotation invariant.
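The rotation step described above can be illustrated with a small sketch. This is not the thesis implementation: it assumes the segmented hand is given as a list of (x, y) foreground pixel coordinates, uses the closed-form orientation of a 2x2 covariance matrix to find the first principal component, and rotates the coordinates rather than resampling an image. The function names are hypothetical. A principal axis has no preferred direction, so a 180-degree ambiguity remains and would need separate handling.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def principal_axis_angle(points):
    """Angle in radians of the first principal component, measured from the x-axis."""
    mx, my = centroid(points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form orientation of the dominant eigenvector of a 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def align_to_vertical(points):
    """Rotate points about their centroid so the principal axis lies along the y-axis."""
    mx, my = centroid(points)
    phi = math.pi / 2 - principal_axis_angle(points)  # rotation that brings the axis to vertical
    c, s = math.cos(phi), math.sin(phi)
    return [
        ((x - mx) * c - (y - my) * s + mx, (x - mx) * s + (y - my) * c + my)
        for x, y in points
    ]

# Usage: a diagonal streak of pixels becomes a vertical one.
diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
print(align_to_vertical(diagonal))
```

Rotating about the centroid keeps the gesture in place, so this step affects only orientation and leaves position normalization to other stages.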
In the feature extraction stage, this work extracts localized contour sequence (LCS) and block-based features, and proposes a combined feature set formed by concatenating LCS features with block-based features to represent static hand gesture images. A discrete wavelet transform (DWT) and Fisher ratio (F-ratio) based feature set is also proposed for better representation of static hand gesture images. To extract this feature set, the DWT is applied to the resized and enhanced grayscale image, and the important DWT coefficient matrices are then selected as features using the proposed F-ratio based coefficient matrix selection technique. Next, a modified radial basis function neural network (RBF-NN) classifier based on the k-means and least mean square (LMS) algorithms is proposed. In the proposed RBF-NN classifier, the centers are selected automatically using the k-means algorithm, and the estimated weight matrix is updated using the LMS algorithm for better recognition of hand gesture images. A sigmoidal activation function based RBF-NN classifier is also proposed to further improve recognition performance. Its activation function is formed from a set of composite sigmoidal functions. Finally, the extracted features are applied as input to the classifier to recognize the class of static hand gesture images. A feature vector optimization technique based on a genetic algorithm (GA) is also proposed to remove redundant and irrelevant features. The proposed algorithms are tested on three static hand gesture databases, which include grayscale images with uniform background (Database I and Database II) and color images with non-uniform background (Database III). Database I is a repository database that consists of hand gesture images of 25 Danish/international sign language (D/ISL) hand alphabets.
Databases II and III were developed in-house using a VGA Logitech Webcam (C120) and contain 24 American Sign Language (ASL) hand alphabets.
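The general idea of an RBF-NN whose centers come from k-means and whose output weights are trained with LMS can be sketched as follows. This is a minimal illustration, not the modified classifier proposed in the thesis: the Gaussian basis function, the fixed width `sigma`, the one-hot targets, the deterministic center initialization, and all function names are assumptions made for the example, and the toy 2-D points stand in for real gesture feature vectors.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(data, k, iters=20):
    """Plain k-means; returns k centers."""
    # Deterministic start (an assumption): evenly spaced samples as initial centers.
    centers = [list(data[i * len(data) // k]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: sq_dist(x, centers[i]))
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers

def hidden_layer(x, centers, sigma):
    """Gaussian RBF activations for one input, plus a constant bias term."""
    return [math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2)) for c in centers] + [1.0]

def train_lms(data, labels, centers, n_classes, sigma=1.0, lr=0.1, epochs=100):
    """Train the output weights sample by sample with the LMS rule."""
    weights = [[0.0] * (len(centers) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, target in zip(data, labels):
            h = hidden_layer(x, centers, sigma)
            for c in range(n_classes):
                output = sum(w * a for w, a in zip(weights[c], h))
                error = (1.0 if c == target else 0.0) - output  # one-hot target
                weights[c] = [w + lr * error * a for w, a in zip(weights[c], h)]
    return weights

def predict(x, centers, weights, sigma=1.0):
    """Return the index of the class with the largest output."""
    h = hidden_layer(x, centers, sigma)
    scores = [sum(w * a for w, a in zip(row, h)) for row in weights]
    return scores.index(max(scores))

# Usage on two well-separated toy clusters.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
centers = kmeans(data, k=2)
weights = train_lms(data, labels, centers, n_classes=2)
print([predict(x, centers, weights) for x in data])
```

Only the output weights are adapted here; the centers stay fixed after clustering, which keeps the LMS update a simple linear problem.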

Item Type:Thesis (PhD)
Uncontrolled Keywords:American Sign Language (ASL) hand alphabet, combined features, discrete wavelet transform (DWT), Danish/international sign language (D/ISL) hand alphabet, Fisher Ratio (F-ratio), gesture recognition, genetic algorithm (GA), k-means algorithm, least-mean-square algorithm (LMS), localized contour sequences (LCS), multilayer perceptron back propagation neural network (MLP-BP-NN), radial basis function neural network (RBF-NN), static hand gesture.
Subjects:Engineering and Technology > Electronics and Communication Engineering > Image Processing
Engineering and Technology > Electronics and Communication Engineering > Artificial Neural Networks
Divisions: Engineering and Technology > Department of Electronics and Communication Engineering
ID Code:8052
Deposited By:Mr. Sanat Kumar Behera
Deposited On:03 Nov 2016 21:24
Last Modified:03 Nov 2016 21:24
Supervisor(s):Ari, S
