Fast depth coding in 3D-HEVC using deep learning

Wang, Zhen-xiang

Author:	Wang, Zhen-xiang
Title:	Fast depth coding in 3D-HEVC using deep learning
Advisors:	Chan, Yui-lam (EIE)
Degree:	M.Sc.
Year:	2018
Subject:	Hong Kong Polytechnic University -- Dissertations; Digital video; Video compression
Department:	Department of Electronic and Information Engineering
Pages:	ix, 68 pages : color illustrations
Language:	English
Abstract:	The 3D Extension of the High Efficiency Video Coding standard (3D-HEVC), which was finalized by the Joint Collaborative Team on Video Coding (JCT-VC) in February 2015, is the new industry standard for 3D applications. 3D-HEVC provides many advanced coding tools designed specifically for auto-stereoscopic video, which is represented as multiple texture views plus depth maps; the depth maps are used to synthesize intermediate views of sufficient quality for auto-stereoscopic display. These tools exploit the statistical redundancies among texture views and depth maps in the video sequences, as well as the unique characteristics of depth maps, to significantly reduce the bitrate while preserving the objective visual quality of the 3D video. However, this compression capability comes with high computational complexity: the encoder traverses many more mode candidates than under any previous standard, which makes encoding 3D video sequences much slower than before. Although the current encoding scheme in the 3D-HEVC standard is able to find the best intra mode candidate for each coding unit in depth maps, the encoding time is becoming a major obstacle to its use in commercial products. In this dissertation, we address this time cost with a new intra mode decision algorithm for depth maps, which uses deep learning to train neural network models that predict the best intra angular mode in depth map coding. The predicted intra angular mode is then used to determine the most probable wedgelet, which halves the number of wedgelet candidates. The size of the neural network has been carefully designed to balance the trade-off between complexity and accuracy in model prediction. Validation precision and the confusion matrix are used to monitor the model training process. A top-k metric is adopted to make use of the predictions from the learned models. We have integrated the learned models into the 3D-HEVC reference software for experiments. The compiled executables can exploit concurrent computation on the CPU as well as parallel computation on the GPU to accelerate the predictions. The simulation results show that the proposed fast depth coding algorithm provides a 64.6% time reduction on average, with only a trivial decrease in BD performance compared with the state-of-the-art 3D-HEVC standard.
Rights:	All rights reserved
Access:	restricted access

Files in This Item:

File	Description	Size	Format
991022144625303411.pdf	For All Users (off-campus access for PolyU Staff & Students only)	2.35 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9567