Fast and Efficient Foveated Video Compression Schemes for H.264/AVC Platform

Singh, Deepak (2016) Fast and Efficient Foveated Video Compression Schemes for H.264/AVC Platform. PhD thesis.

Abstract

This dissertation presents fast and efficient foveated video compression schemes for the H.264/AVC platform. The exponential growth of networking technologies and the widespread use of video-based multimedia content over the internet for mass-communication applications such as social networking, e-commerce and education have driven the development of video coding. Foveated image and video compression schemes are now in high demand because they match the perception of the human visual system (HVS) and also yield higher compression ratios: salient regions are coded at higher visual quality, while non-salient regions are compressed at a higher ratio. Among the foveated video compression developments of the last few years, saliency-detection-based schemes have been an area of intense research. With this in mind, we propose two multi-scale saliency detection schemes:
(1) Multi-scale phase spectrum based saliency detection (FTPBSD);
(2) Sign-DCT multi-scale pseudo-phase spectrum based saliency detection (SDCTPBSD).
In the FTPBSD scheme, a saliency map is determined from the phase spectrum of a given image or video frame, with the magnitude spectrum set to unity. The proposed SDCTPBSD method instead uses the sign information of the discrete cosine transform (DCT), also known as the sign-DCT (SDCT), which resembles the response of receptive field neurons of the HVS. A bottom-up spatio-temporal saliency map is obtained as a linear weighted sum of the spatial saliency map and the temporal saliency map.
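The two saliency ideas and the linear fusion step described above can be sketched roughly as follows. This is a minimal single-scale illustration in NumPy, not the thesis's implementation: the function names, the min-max normalisation, the default weight `w_spatial=0.5` and the omission of multi-scale processing and smoothing are all assumptions made for the example.

```python
import numpy as np

def phase_saliency(img):
    """Saliency from the Fourier phase spectrum with unit magnitude."""
    spectrum = np.fft.fft2(img)
    magnitude = np.abs(spectrum)
    # Keep only the phase; empty bins are left at zero instead of dividing by 0.
    unit = spectrum / np.where(magnitude == 0, 1.0, magnitude)
    return np.abs(np.fft.ifft2(unit)) ** 2

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat

def sign_dct_saliency(img):
    """Saliency from the sign of the 2D-DCT coefficients (sign-DCT)."""
    rows, cols = img.shape
    c_rows, c_cols = dct_matrix(rows), dct_matrix(cols)
    coeffs = c_rows @ img @ c_cols.T            # forward 2D DCT
    recon = c_rows.T @ np.sign(coeffs) @ c_cols  # inverse 2D DCT of the signs
    return recon ** 2

def normalise(sal_map):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    span = sal_map.max() - sal_map.min()
    if span == 0:
        return np.zeros_like(sal_map, dtype=float)
    return (sal_map - sal_map.min()) / span

def fuse(spatial, temporal, w_spatial=0.5):
    """Linear weighted sum of spatial and temporal saliency maps."""
    return w_spatial * normalise(spatial) + (1.0 - w_spatial) * normalise(temporal)
```

One plausible way to use these helpers is to compute the spatial map from the current frame and the temporal map from the difference between consecutive frames, then call `fuse(spatial, temporal)`; the abstract does not state how the temporal map or the weights are actually obtained.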
Based on these saliency detection techniques, foveated video compression (FVC) schemes (FVC-FTPBSD and FVC-SDCTPBSD) are developed to further improve compression performance.
The 2D discrete cosine transform (2D-DCT) is widely used in video coding standards for block-based transformation of spatial data. For blocks with directional features, however, the 2D-DCT performs sub-optimally: it may not be able to represent the video data efficiently with few coefficients, which lowers the compression ratio. Various directional transform schemes have been proposed in the literature to encode such blocks efficiently, but they suffer from issues such as the ‘mean weighting defect’, the use of a large number of DCTs and the need for several scanning patterns. We propose a directional transform scheme based on the direction-adaptive fixed length discrete cosine transform (DAFL-DCT) for intra- and inter-frame coding to achieve higher coding efficiency for blocks with directional features. The proposed DAFL-DCT has the following two encoding modes:
(1) Direction-adaptive fixed length - high efficiency (DAFL-HE) mode for higher compression performance;
(2) Direction-adaptive fixed length - low complexity (DAFL-LC) mode for low complexity with a fair compression ratio.
Motion estimation (ME) exploits the temporal correlation between video frames and yields a significant improvement in compression ratio while sustaining high visual quality. Block-matching motion estimation (BMME) is the most popular approach owing to its simplicity and efficiency. However, real-world video sequences may contain slow, medium and/or fast motion, and a single search pattern is not efficient at finding the best-matched block for all motion types. In addition, most BMME schemes assume a uni-modal error surface, whereas real-world video sequences may exhibit a large number of local minima within a search window and thus have a multi-modal error surface (MES). Hence, the following two motion estimation schemes, one based on the uni-modal error surface and one on the multi-modal error surface, are developed:
(1) Direction-adaptive motion estimation (DAME) scheme;
(2) Pattern-based modified particle swarm optimization motion estimation (PMPSO-ME) scheme.
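For orientation, the baseline that such schemes try to speed up can be sketched as an exhaustive block-matching search. The example below is a generic full search using the sum of absolute differences (SAD); it is neither DAME nor PMPSO-ME, and the function names, the 8x8 block size and the ±4 search range are assumptions chosen for illustration. It also counts the search points visited, the quantity that fast search patterns aim to reduce.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return int(np.abs(diff).sum())

def full_search(cur, ref, top, left, block=8, search=4):
    """Exhaustive block matching within +/- `search` pixels.

    Returns (dy, dx, cost, points): the best displacement into `ref`,
    its SAD cost, and the number of candidate positions evaluated.
    """
    target = cur[top:top + block, left:left + block]
    best_dy, best_dx = 0, 0
    best_cost = sad(target, ref[top:top + block, left:left + block])
    points = 1  # the zero-displacement candidate
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if dy == 0 and dx == 0:
                continue
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cost = sad(target, ref[y:y + block, x:x + block])
            points += 1
            if cost < best_cost:
                best_dy, best_dx, best_cost = dy, dx, cost
    return best_dy, best_dx, best_cost, points
```

A full search over a ±4 window evaluates up to 81 candidates per block. Pattern-based and swarm-based searches trade some of that exhaustiveness for speed, which is where a multi-modal error surface can trap a search that assumes a single minimum.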
Subsequently, various fast and efficient foveated video compression schemes are developed by combining these schemes to further improve video coding performance while maintaining high visual quality in salient regions.
All schemes are incorporated into the H.264/AVC video coding platform, and experiments are carried out on the H.264/AVC Joint Model (JM) reference software, version 18.6. The proposed schemes are compared with existing competitive schemes in terms of rate-distortion curves, Bjontegaard metrics (BD-PSNR, BD-SSIM and BD-bitrate), encoding time, number of search points and subjective evaluation to derive an overall conclusion.
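As a rough illustration of how a Bjontegaard bitrate figure is commonly computed, the sketch below fits a cubic polynomial to log-bitrate as a function of PSNR for each codec and averages the gap over the shared PSNR range. The abstract does not describe the exact procedure used in the thesis, so the function name and the cubic-fit choice are assumptions, and the rate and PSNR values in the usage note are made up.

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (percent) of `test` relative to `ref`.

    Negative values mean the test codec needs fewer bits on average
    for the same PSNR. Each curve needs at least four points.
    """
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    # Cubic fit of log-rate as a function of quality.
    fit_ref = np.polyfit(psnr_ref, log_ref, 3)
    fit_test = np.polyfit(psnr_test, log_test, 3)
    # Integrate both fits over the overlapping PSNR interval only.
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyint(fit_ref)
    int_test = np.polyint(fit_test)
    avg_ref = (np.polyval(int_ref, hi) - np.polyval(int_ref, lo)) / (hi - lo)
    avg_test = (np.polyval(int_test, hi) - np.polyval(int_test, lo)) / (hi - lo)
    return (np.exp(avg_test - avg_ref) - 1.0) * 100.0
```

With hypothetical reference points at 200, 400, 800 and 1600 kbit/s and PSNR values of 32, 35, 38 and 41 dB, a test codec that reaches the same PSNR values at 90% of each bitrate gives a BD-bitrate of about -10%.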

Item Type:	Thesis (PhD)
Uncontrolled Keywords:	Block matching motion estimation (BMME); Direction adaptive transform; Discrete cosine transform (DCT); Foveated video compression (FVC); Human vision system (HVS); Motion estimation (ME); Saliency detection; Video coding
Subjects:	Engineering and Technology > Electronics and Communication Engineering > Image Processing
Divisions:	Engineering and Technology > Department of Electronics and Communication Engineering
ID Code:	8474
Deposited By:	Mr. Sanat Kumar Behera
Deposited On:	09 Feb 2017 16:26
Last Modified:	09 Feb 2017 16:26
Supervisor(s):	Meher, S
