Towards lightweight and efficient network design for image super-resolution

Zhang, Xindong

Author:	Zhang, Xindong
Title:	Towards lightweight and efficient network design for image super-resolution
Advisors:	Zhang, Lei (COMP)
Degree:	Ph.D.
Year:	2023
Subject:	High resolution imaging; Image processing -- Digital techniques; Hong Kong Polytechnic University -- Dissertations
Department:	Department of Computing
Pages:	xvii, 129 pages : color illustrations
Language:	English
Abstract:	Deep neural network-based image super-resolution (SR) models have become prevalent for their strong capability in recovering high-frequency details. Because hardware resources are constrained in many practical applications, lightweight and efficient network design for SR models is in high demand. However, most existing studies focus on reducing the number of parameters and FLOPs of SR models, which does not necessarily lead to faster running speed on target devices. In this thesis, we study network design and evaluation for lightweight and efficient SR models in two typical and distinct scenarios, i.e., mobile devices and GPU servers. Designing lightweight and efficient SR models for mobile devices is very challenging because of their small amount of RAM (≤ 8 GB), limited computation (around 4 TOPS), and limited battery capacity (around 4000 mAh). By analyzing the computational architecture of recent lightweight SR designs on commodity mobile devices, in Chapter 2 we propose a re-parameterizable block as a drop-in replacement for the 3 × 3 convolution, namely the Edge-oriented Convolution Block (ECB), which can be trained with multiple branches and folded back into a single convolution kernel for fast inference. Furthermore, to free researchers from the laborious design and deployment of SR models on mobile devices, in Chapter 3 we propose an efficient hardware-aware neural architecture search (EHANAS) method, which supports a large network design space and provides a good balance between SR model design quality and efficiency at the cost of one GPU day. In the GPU-server scenario, efficiency remains an important requirement even though more hardware resources are available. In Chapter 4, we propose an efficient long-range attention network (ELAN) for image SR.
A highly efficient long-range attention block (ELAB) is developed by simply cascading two shift-conv layers with a group-wise multi-scale self-attention (GMSA) module, which is further accelerated by a shared attention mechanism. Finally, in Chapter 5, we develop an online benchmark to automatically evaluate the performance of SR models on mobile devices. Based on the benchmark, comprehensive studies of current SR models are presented. In summary, this thesis presents three lightweight and efficient SR network designs, as well as an online benchmark for evaluating the performance of SR models on mobile devices. Among them, ECB provides a real-time network design for SR tasks. EHANAS frees researchers from the laborious trade-off in network design between PSNR/SSIM indices and inference speed. ELAN offers insight into efficiently modeling long-range dependencies among image features. Finally, the online benchmark provides a common platform on which researchers can easily evaluate and compare the practical performance of their SR models on mobile devices, freeing them from labour-intensive SR model deployment work.
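The re-parameterization idea mentioned in the abstract (train with several branches, then fold them into a single convolution kernel) can be illustrated with a minimal single-channel sketch. The branch set used here (a 3 × 3 convolution, a 1 × 1 convolution, and an identity path) is an assumption chosen for simplicity and is not the actual branch set of ECB; the function names `conv3x3`, `multi_branch`, and `fold` are likewise hypothetical.

```python
def conv3x3(image, kernel, bias=0.0):
    """Single-channel 3x3 cross-correlation, stride 1, zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_branch(image, k3, b3, k1, b1):
    """Training-time form (illustrative): 3x3 branch + 1x1 branch + identity."""
    y3 = conv3x3(image, k3, b3)
    h, w = len(image), len(image[0])
    return [[y3[y][x] + (k1 * image[y][x] + b1) + image[y][x]
             for x in range(w)] for y in range(h)]


def fold(k3, b3, k1, b1):
    """Merge the three branches into one 3x3 kernel and one bias."""
    merged = [row[:] for row in k3]  # copy so k3 is not modified
    # The 1x1 weight and the identity path both act only on the centre tap.
    merged[1][1] += k1 + 1.0
    return merged, b3 + b1


# Arbitrary example values (not taken from the thesis).
image = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0], [2.0, 1.0, 1.0]]
k3 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.1, -0.3]]
b3, k1, b1 = 0.05, 0.7, -0.02

reference = multi_branch(image, k3, b3, k1, b1)
folded_kernel, folded_bias = fold(k3, b3, k1, b1)
single = conv3x3(image, folded_kernel, folded_bias)

# Largest absolute difference between the two forms (should be ~0).
max_diff = max(abs(a - b)
               for row_a, row_b in zip(reference, single)
               for a, b in zip(row_a, row_b))
```

Because every branch is linear, the folded kernel reproduces the multi-branch output up to floating-point rounding, which is why inference can use a single convolution.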
Rights:	All rights reserved
Access:	open access

Files in This Item:

File	Description	Size	Format
6752.pdf	For All Users	46.04 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12305