Analysis of Audio and Video in an AudioVisual Scene for Feature Extraction

Barla, Abhilasha (2017) Analysis of Audio and Video in an AudioVisual Scene for Feature Extraction. MTech thesis.

PDF (Fulltext is restricted up to 18.01.2020)
Restricted to Repository staff only

1538Kb

Abstract

Audio and visual signals arriving from a common source are detected using a signal-level fusion technique. For effective communication, humans can extract the speech signals they need to understand from a mixture of background noise, interfering sound sources, and reverberation. A speaker can be identified from audio information alone, but visual information is also considered for more efficient speaker detection. Visual cues allow a speaker's voice activity to be detected by locating and observing lip movement. Similarly, voice activity can be detected from audio information alone. Intuition therefore suggests that using audio and video information together allows better voice activity detection than either modality alone. We wish to solve a conversational audiovisual correspondence problem: given sets of audio and visual signals, decide which audiovisual pairs are consistent and could have come from a single speaker.
In this thesis, features of the audio coming from an audiovisual scene are extracted using Mel Frequency Cepstral Coefficients (MFCC). Sound source localization for that audio source is performed using Generalized Cross-Correlation with Phase Transform (GCC-PHAT). For video feature extraction, optical flow of the video sequence is computed, followed by a face detection algorithm.
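As an illustration of the GCC-PHAT step named above, the following is a minimal sketch of time-delay estimation between two microphone channels. It is not taken from the thesis: the two-microphone setup, the function name `gcc_phat`, its parameters, the 16 kHz sampling rate, and the synthetic test signal are all assumptions made for the example. The estimated delay is what a localization stage would typically convert into a direction of arrival.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper for illustration; a positive result means
    `sig` lags `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep phase, drop magnitude.
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12  # small constant avoids division by zero

    # Back to the time domain: the generalized cross-correlation.
    cc = np.fft.irfft(R, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / float(fs)

# Synthetic example (assumed data): white noise delayed by a few samples.
fs = 16000
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
delay_samples = 5
y = np.concatenate((np.zeros(delay_samples), x[:-delay_samples]))

tau = gcc_phat(y, x, fs)
print("estimated delay in samples:", round(tau * fs))
```

The PHAT weighting discards spectral magnitude and keeps only phase, which tends to sharpen the correlation peak in reverberant conditions; with a known microphone spacing, `max_tau` can be set to the spacing divided by the speed of sound.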

Item Type: Thesis (MTech)
Uncontrolled Keywords: Optical Flow Algorithm; Sound Source Localization; Voice Activity Detection; GCC-PHAT; MFCC; Speech
Subjects: Engineering and Technology > Electronics and Communication Engineering > Signal Processing
Engineering and Technology > Electronics and Communication Engineering > Image Processing
Divisions: Engineering and Technology > Department of Electronics and Communication Engineering
ID Code: 8833
Deposited By: Mr. Kshirod Das
Deposited On: 15 Mar 2018 16:42
Last Modified: 15 Mar 2018 16:42
Supervisor(s): Roy, Lakshi Prosad
