Wavelet-based super-resolution and video coding

Liu, Zhi Song

Author:	Liu, Zhi Song
Title:	Wavelet-based super-resolution and video coding
Advisors:	Siu, Wan-chi (EIE)
Degree:	M.Sc.
Year:	2016
Subject:	Hong Kong Polytechnic University -- Dissertations; Image processing -- Digital techniques; Digital video; Video compression
Department:	Department of Electronic and Information Engineering
Pages:	xvi, 87 pages : color illustrations
Language:	English
Abstract:	Image interpolation and super-resolution (SR) are fundamental problems in image processing, and they are essential to advanced image processing techniques such as image compression, image classification and video coding. For image interpolation, the goal is to predict the missing high-resolution (HR) pixels from the known neighboring ground-truth HR pixels. Image super-resolution, on the other hand, is a more complicated problem: each low-resolution (LR) pixel is modeled as the down-sampled result of convolving the HR pixels with a blurring kernel. Without knowing the ground-truth HR pixels, the goal is to generate a visually pleasing HR image that resembles the ground truth by preserving sharp edges and natural textures from the LR image. For video coding, advanced DCT-based hybrid coding techniques make it possible to encode video with good quality at a lower bitrate. However, devices with different display resolutions require the same video at different resolutions. Wavelet-based scalable video coding schemes have a multiresolution analysis property that solves this problem naturally. Combining them with state-of-the-art super-resolution methods to super-resolve LR videos has become an interesting research topic. The goal is to reduce the computational complexity at the encoder and further reduce the bitrate while still obtaining good results at the decoder through super-resolution. First, we propose an image super-resolution method based on a hybrid of NEDI and a wavelet-based scheme. For a DWT down-sampled LR image, we propose a shift-free NEDI (SF-NEDI) method to fix the pixel shift introduced during down-sampling. We then use the original LR image and the high-frequency information predicted by SF-NEDI to further improve the super-resolution quality. Extensive experimental results show that SF-NEDI achieves a 0.5 dB improvement in PSNR over NEDI.
The hybrid wavelet-based scheme achieves a 0.7 dB improvement over DWT zero-padding. We also propose a wavelet-based method that uses k-nearest neighbor (k-NN) search over external images to super-resolve the LR image. The basic idea is to select the k nearest patches as candidates for learning a regularized linear regression model that predicts the HR patches. Our proposed method uses 24 external images for dictionary preparation and partitions the LR image into overlapping patches for online up-sampling. During up-sampling, smooth areas are super-resolved by DWT zero-padding rather than by convolution with a universal filter. Our wavelet-based k-NN super-resolution method achieves a 1.2 dB improvement in PSNR over SF-NEDI. Our video super-resolution method uses a random forest model to super-resolve LR video. We consider two options. The first is to down-sample the HR video and then encode and decode it with non-scalable 3D-DWT coding. The second is to encode the HR video with scalable 3D-DWT coding and decode the corresponding LR video. Either way, a learned random forest dictionary is used at the decoder to super-resolve the LR video into an HR video. For direct LR video coding followed by super-resolution, we compare random forest super-resolution with simple interpolation and find that the learning-based method improves the PSNR significantly. As an extension, we discuss using scalable 3D-DWT coding to decode an LR video and then performing random forest super-resolution. We propose using a different training set for each decision tree within the random forest to produce a more robust and accurate estimate. Extensive experimental results show that this improves the super-resolution PSNR at low bitrates. We also reduce the size of the trained dictionary so that it is small enough for transmission and up-sampling is fast enough for potential real-time applications.
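As a rough illustration of the DWT zero-padding baseline named in the abstract, the sketch below treats the LR image as the low-pass subband of a single-level 2D transform, fills the three detail subbands with zeros, and inverts the transform. It assumes an orthonormal Haar wavelet and a gain factor of 2 on the low-pass band; this record does not state which wavelet the thesis uses, and the function names (`haar_dwt2`, `haar_idwt2`, `dwt_zero_padding_upsample`) are hypothetical.

```python
def haar_dwt2(img):
    """Single-level 2D orthonormal Haar DWT of an image with even dimensions.

    Returns (ll, lh, hl, hh): the low-pass subband and three detail subbands.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * cols for _ in range(rows)]
    lh = [[0.0] * cols for _ in range(rows)]
    hl = [[0.0] * cols for _ in range(rows)]
    hh = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            p, q = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            r, s = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (p + q + r + s) / 2  # local average (gain of 2)
            lh[i][j] = (p - q + r - s) / 2  # difference across columns
            hl[i][j] = (p + q - r - s) / 2  # difference across rows
            hh[i][j] = (p - q - r + s) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild each 2x2 block from its four coefficients."""
    rows, cols = len(ll), len(ll[0])
    out = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            a, h, v, d = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            out[2 * i][2 * j] = (a + h + v + d) / 2
            out[2 * i][2 * j + 1] = (a - h + v - d) / 2
            out[2 * i + 1][2 * j] = (a + h - v - d) / 2
            out[2 * i + 1][2 * j + 1] = (a - h - v + d) / 2
    return out


def dwt_zero_padding_upsample(lr):
    """Up-sample by 2x: use the LR image as the LL subband, zero the details.

    Assumption: LR pixels are scaled by 2 to match the Haar low-pass gain.
    """
    ll = [[2.0 * value for value in row] for row in lr]
    zeros = [[0.0] * len(lr[0]) for _ in lr]
    return haar_idwt2(ll, zeros, zeros, zeros)


if __name__ == "__main__":
    for row in dwt_zero_padding_upsample([[1, 2], [3, 4]]):
        print(row)
```

With the Haar wavelet this baseline reduces to pixel replication, since every HR pixel in a 2x2 block receives the same low-pass value. Longer wavelet filters would give smoother output, and the methods described in the abstract aim to improve on the baseline by predicting the high-frequency subbands rather than leaving them at zero.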
Rights:	All rights reserved
Access:	restricted access

Files in This Item:

File	Description	Size	Format
991022131147003411.pdf	For All Users (off-campus access for PolyU Staff & Students only)	6.58 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/9447