Samantaray, Amiya Kumar (2014) Development of a Real-time Embedded System for Speech Emotion Recognition. BTech thesis.
Speech emotion recognition is one of the latest challenges in speech processing and Human-Computer Interaction (HCI), driven by the operational needs of real-world applications. Besides human facial expressions, speech has proven to be one of the most promising modalities for automatic human emotion recognition: it is a spontaneous medium for perceiving emotion and carries in-depth information about a speaker's cognitive state. In this context, we introduce a novel approach that combines prosody features (pitch, energy, zero-crossing rate), quality features (formant frequencies, spectral features, etc.), derived features (Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC)), and dynamic features (Mel-Energy spectrum Dynamic Coefficients (MEDC)) for robust automatic recognition of a speaker's emotional state. A multilevel SVM classifier identifies seven discrete emotional states (angry, disgust, fear, happy, neutral, sad, and surprise) in five native Assamese languages. Experimental results from MATLAB simulation show that the combined-feature approach achieves an average accuracy of 82.26% in the speaker-independent case. A real-time implementation of the algorithm is realized on an ARM Cortex-M3 board.
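Two of the prosody features named in the abstract, short-time energy and zero-crossing rate, can be sketched as follows. This is a minimal illustration assuming 16 kHz audio with 25 ms frames and a 10 ms hop; the helper names (`frame_signal`, etc.) are illustrative, not taken from the thesis, and the actual feature pipeline in the work also includes pitch, formants, MFCC, LPCC, and MEDC.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def short_time_energy(frames):
    """Sum of squared samples per frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)

# Example: 1 s of a 440 Hz tone sampled at 16 kHz (a stand-in for speech).
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

frames = frame_signal(x)
energy = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
```

For a pure 440 Hz tone the per-sample zero-crossing rate is roughly 2 * 440 / 16000 ≈ 0.055; voiced speech shows similarly low ZCR and high energy, while unvoiced segments show the reverse, which is why these two features are commonly paired in emotion recognition front-ends.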
|Item Type:||Thesis (BTech)|
|Uncontrolled Keywords:||LPCC; MFCC; Prosody features; Quality features; Speech Emotion Recognition; Support Vector Machine|
|Subjects:||Engineering and Technology > Electronics and Communication Engineering > Signal Processing|
|Divisions:||Engineering and Technology > Department of Biotechnology and Medical Engineering|
|Deposited By:||Hemanta Biswal|
|Deposited On:||26 Aug 2014 15:49|
|Last Modified:||26 Aug 2014 15:49|