Author: | Liu, Zhisong |
Title: | Learning approach for image super-resolution |
Advisors: | Siu, Wan-chi (EIE); Chan, Yui-lam (EIE) |
Degree: | Ph.D. |
Year: | 2020 |
Subject: | Image processing -- Digital techniques; Image reconstruction; Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Electronic and Information Engineering |
Pages: | xi, 219 pages : color illustrations |
Language: | English |
Abstract: | Image interpolation and super-resolution (SR) are fundamental problems in image processing. They are essential techniques for advanced image processing tasks such as image compression, image classification, and video coding. For image interpolation, the goal is to predict the missing high-resolution (HR) pixels from known neighboring ground truth HR pixels. Image super-resolution, on the other hand, is a more complicated problem. The low-resolution (LR) pixel is modeled as the down-sampled result of convolving HR pixels with blurring or transform kernels. Without knowing the ground truth HR pixels, the goal is to generate a visually pleasing HR image that resembles the original by preserving sharp edges and natural textures from the LR image. First, we propose an image super-resolution method based on a hybrid NEDI and wavelet scheme. For DWT down-sampled LR images, we propose a shift-free NEDI (SF-NEDI) method to fix the pixel shift problem that occurs during the down-sampling process. We then use the original LR image and the high-frequency information predicted by SF-NEDI to further improve the super-resolution quality. We also propose a wavelet-based k-nearest neighbor (k-NN) search over external images for estimating the LR image. The basic idea is to select the k nearest patches as candidates for learning a regularized linear regression model that predicts the HR patches. Our proposed method uses 24 external images for dictionary preparation and partitions the LR image into overlapping patches for online up-sampling. During up-sampling, smooth areas are super-resolved by DWT zero-padding rather than by convolving with a universal filter. Our wavelet-based k-NN super-resolution method achieves a 1.2 dB gain in PSNR over SF-NEDI. The recently proposed random forest model has also proven to be one of the state-of-the-art methods: it improves image visual quality while significantly reducing computation time. 
To harvest the advantages of random forests, we propose several fast patch-based image super-resolution methods built on random forests and their variants, including Image Super-Resolution via Weighted Random Forests (SWRF), Image Super-Resolution via Randomized Multi-split Forests (SRRMF) and Cascaded Random Forests for Image Super-Resolution (CRFSR). SWRF uses a proposed weighting model to learn the bias of each decision tree within the random forests and adaptively reconstruct SR images. SRRMF further randomizes the patch classification process of random forests to increase model diversity. To involve more feature points in patch classification, CRFSR efficiently searches high-level features in a hierarchical manner to select sufficient feature points for image super-resolution. To further boost performance, an extra Gaussian Mixture Model (GMM) based layer is used for final refinement. Furthermore, benefiting from the great advances in Convolutional Neural Networks (CNNs), we propose several CNN-based learning approaches for image super-resolution that achieve much better performance both quantitatively and qualitatively. We first propose combining Back Projection and Residual Networks (BPRN) for efficient image super-resolution. BPRN combines residual learning with the back projection mechanism to learn deeper feature representations and achieve fast and accurate image super-resolution. We then further study back projection based residual learning and arrive at the Hierarchical Back Projection Network (HBPN) for image super-resolution. It is built on hierarchically stacked HourGlass models that process the LR and HR feature maps in both top-down and bottom-up directions to boost super-resolution quality. Under large down-sampling factors, the aforementioned methods focus on minimizing data distortion, so they tend to generate over-smoothed SR images. 
Subsequently, we propose a novel photo-realistic image Super-Resolution via Variational AutoEncoder (SR-VAE) to generate SR images with rich HR-like patterns. Finally, we also study domain-specific applications of image super-resolution. We propose Video Super-Resolution via Hierarchical Temporal Residual Networks to super-resolve LR video frames. By using the proposed Spatial-Temporal Convolution, it can save up to 60% of the computation and achieve a 0.5 dB improvement over other state-of-the-art video SR methods. We also propose combining Random Forests and CNN to perform 8x face SR, preserving facial identity while providing sharp visual quality. The core idea is to use the CNN for global facial reconstruction and random forests for facial feature refinement. Furthermore, we propose a novel idea for reference-based face SR (RefSR), which can form a new research direction in the future. It uses reference facial images to guide the super-resolution of LR images via a Variational AutoEncoder (RefSR-VAE). By encoding reference images and LR images to learn conditional latent parameters, we use the decoder to reconstruct the SR images. With the guidance of reference images, our RefSR-VAE achieves state-of-the-art SR performance. |
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/11074