|Title:||Adaptive integer kernels and dyadic approximation error analysis for state-of-the-art video codecs|
|Subject:||Hong Kong Polytechnic University -- Dissertations|
|Department:||Department of Electronic and Information Engineering|
|Pages:||xiii, 86 leaves : ill. (some col.) ; 30 cm.|
|Abstract:||In this thesis, new integer kernels are found and adaptive transform coding techniques are proposed to improve the coding efficiency of state-of-the-art video codecs, supported by detailed analyses. The nonorthogonality error analysis is extended and improved. An error caused by the dyadic fraction approximation that arises from the integerization of transform coding is defined and investigated in depth. The desire to remove the mismatch between encoder and decoder has been ever increasing. In the state-of-the-art video coding standard, H.264/AVC, the transform coding stage was therefore integerized to meet this need. One of our objectives is to improve the coding efficiency of video codecs based on this "integer framework". We propose a new DCT-like integer kernel IK(5,7,3) and revitalize another DCT-like integer kernel IK(13,17,7) for the transform coding process of hybrid video coding. Making use of one of these kernels together with the H.264/AVC kernel IK(1,2,1), we are able to design new multiple-kernel schemes that give better coding performance than the conventional approaches. All these schemes make use of the Adaptive Kernel Mechanism (AKM) at macroblock level, which requires heavy computation during the encoding process. We subsequently discovered that a rate-distortion feature extracted from a pair of kernels gives an intrinsic property that can be used to select the better kernel for a two-kernel macroblock-level AKM system. This is a powerful tool of both theoretical interest and practical use. In order to reduce computation substantially, we use this tool to analyze and design a frame-level AKM and arrive at a simple solution: the kernel IK(1,2,1) is used for I- and P-frames and the kernel IK(5,7,3) is used for B-frames. The proposed frame-level AKM performs similarly to, or even better than, the proposed macroblock-level AKM.
Furthermore, it substantially reduces computation and improves both PSNR and bitrate compared with the H.264/AVC default arrangement and other macroblock-level AKM schemes available in the literature.|
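The IK(a,b,c) notation above can be illustrated with a minimal sketch. The abstract does not spell out the matrix layout, so the 4×4 arrangement below is an assumption, chosen because it reproduces the well-known H.264/AVC core transform when the parameters are (1,2,1). The function names `integer_kernel`, `gram`, `is_orthogonal`, and `squared_row_norms` are hypothetical helpers, not taken from the thesis.

```python
def integer_kernel(a, b, c):
    """Build a 4x4 DCT-like integer kernel IK(a, b, c).

    Assumed layout: with (a, b, c) = (1, 2, 1) it equals the
    H.264/AVC 4x4 forward core transform matrix.
    """
    return [
        [a,  a,  a,  a],
        [b,  c, -c, -b],
        [a, -a, -a,  a],
        [c, -b,  b, -c],
    ]


def gram(kernel):
    """Return K * K^T; off-diagonal entries are the row inner products."""
    return [[sum(x * y for x, y in zip(r, s)) for s in kernel] for r in kernel]


def is_orthogonal(kernel):
    """True when every pair of distinct rows has a zero inner product."""
    g = gram(kernel)
    n = len(kernel)
    return all(g[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def squared_row_norms(kernel):
    """Squared Euclidean norm of each row (the diagonal of K * K^T)."""
    g = gram(kernel)
    return [g[i][i] for i in range(len(kernel))]


if __name__ == "__main__":
    for params in [(1, 2, 1), (5, 7, 3), (13, 17, 7)]:
        k = integer_kernel(*params)
        print(params, is_orthogonal(k), squared_row_norms(k))
```

Under this assumed layout all three kernels have mutually orthogonal rows, but the row norms differ between kernels, so each kernel needs its own position-dependent scaling in the quantization stage.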
The demand for large (e.g. 16×16) integer transform kernels is increasing because of the rapid growth in video resolution. However, the orthogonality constraint for designing 16×16 integer kernels is much stronger than that for designing 4×4 kernels. Hence, several kernel designs have been proposed that violate the constraint in a controlled manner while keeping the kernel approximately orthogonal. An error analysis by Dong et al. showed that well-controlled nonorthogonality noise is approximately negligible compared to the quantization noise. In this thesis, we enhance the original analysis by pointing out three problems found in its derivations and by giving two comments. These problems are minor defects, however, and do not affect the overall justification of the nonorthogonality analysis. Although the integerization of the transform coding process ensures no mismatch between encoder and decoder, it also introduces a by-product, which we define as the "dyadic approximation error" and which can largely affect the visual quality of a reconstructed video sequence. We derive the analytical forms of the dyadic approximation error and compare the relative significance of the possible error terms (i.e. the quantization error, nonorthogonality error, and dyadic approximation error) using various transform kernels. We conclude that the dyadic approximation error is much larger than the nonorthogonality error and that it is comparable to the quantization error for fine quantization. We point out that the existence of this error is equivalent to scaling each frequency component by a position-dependent scalar that is slightly larger or smaller than unity, and also quantizing the components with different step sizes. Hence, many distorted frequency components are used in the reconstruction process, and a reconstructed frame with frequency artifacts is eventually generated.
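The idea that a dyadic approximation acts like a gain slightly different from unity can be sketched generically. The abstract does not give the thesis's analytical forms, so this is only an illustration under assumptions: a real scaling factor is replaced by the nearest fraction of the form m / 2^n, and the ratio of the approximated factor to the exact one is the effective gain applied to that frequency component. The names `dyadic_approx` and `effective_gain`, the example factor 0.1, and the shift of 15 bits are all hypothetical.

```python
from fractions import Fraction


def dyadic_approx(value, shift):
    """Nearest dyadic fraction m / 2**shift to a real scaling factor."""
    m = round(value * (1 << shift))
    return Fraction(m, 1 << shift)


def effective_gain(value, shift):
    """Ratio of the dyadic approximation to the exact factor.

    A result of exactly 1.0 means the factor is representable as a
    dyadic fraction; otherwise the component is scaled slightly up or down.
    """
    return float(dyadic_approx(value, shift)) / value


if __name__ == "__main__":
    # Hypothetical scaling factor and precision, for illustration only.
    print(dyadic_approx(0.1, 15), effective_gain(0.1, 15))
    # A factor that is itself dyadic incurs no approximation error.
    print(dyadic_approx(0.25, 4), effective_gain(0.25, 4))
```

Because the exact factors generally differ from one coefficient position to another, each position ends up with its own small gain deviation, which is consistent with the position-dependent scaling described above.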
The conditions for eliminating the effect of the dyadic approximation error for 16×16 transform kernels are found experimentally. On the whole, inspired by the "integer framework" established with the emergence of H.264/AVC, we carry out a comprehensive investigation of old problems under the new constraint, from the optimization of coding performance to the analysis of errors.