Author: | Wei, Lai |
Title: | Towards efficient and robust volumetric video streaming |
Advisors: | Wang, Dan (COMP) |
Degree: | Ph.D. |
Year: | 2024 |
Subject: | Streaming video; Streaming technology (Telecommunications); 3-D video (Three-dimensional imaging); Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Computing |
Pages: | xvi, 122 pages : color illustrations |
Language: | English |
Abstract: | Immersive video streaming has drawn growing attention in recent years and is the foundation of a range of key VR applications, e.g., teleconferencing, remote teaching, and sports broadcasting. Among the many forms of immersive video, e.g., 360-degree video, Neural Radiance Fields (NeRF), and 3D Gaussian splatting (3DGS), point cloud-based volumetric video is of particular interest because it balances low device dependency, low computation cost, and high immersiveness. Many existing works focus on improving bandwidth efficiency and adaptiveness in local networks but do not discuss the challenges of streaming over the public Internet. The public Internet can introduce more stringent bandwidth constraints, vulnerability to security attacks, and more diverse viewing conditions. New schemes and methods are therefore needed to address these challenges. In this thesis, we conduct an in-depth study of these problems and make the following original contributions. Firstly, we propose VSAS, a bandwidth-efficient volumetric video streaming framework that, for the first time, allows DASH-based streaming of MPEG V-PCC formatted volumetric video. MPEG V-PCC is a new standard for volumetric video compression, featuring a high compression ratio and effective temporal prediction. However, it is not readily applicable to streaming in dynamic network environments. First, there is a need for a rate-distortion model for MPEG V-PCC, which is essential for achieving effective bitrate control. 
Therefore, we conducted one of the earliest rate-distortion studies on MPEG V-PCC and proposed a geometry-aware model that achieves high accuracy. Second, we designed a transformer-based offline reinforcement learning method to control the bitrate according to network dynamics and user movements. Third, as the coarse-grained DASH architecture causes frequent frame freezing, we propose a DAG-based frame-dropping mechanism that gives the existing system a frame rate scaling capability. Together, these components allow VSAS to deliver a smooth volumetric video streaming service over the Internet. Extensive experiments show that VSAS achieves less stalling, better bandwidth efficiency, and higher visual quality than existing systems. Secondly, we study the generalization problem in tile-based volumetric video streaming systems and propose FewVV. We first identify the limitation of existing systems when facing an out-of-distribution environment, which constrains the real-world deployment of their tile pruning-based optimization. To tackle this challenge, we draw on the few-shot and zero-shot adaptation ability of large language models: we first reformulate volumetric video streaming control as a multivariate sequence modeling problem, then train a causal transformer model with prompt tuning to solve it. Our evaluation demonstrates consistent improvement over several baselines in QoE and in the speed of adaptation to an unseen environment. Finally, we study the error concealment problem for volumetric video and build a novel dataset, VVCorupt. We first introduce the background of existing error concealment algorithms and related datasets, then identify a lack of existing datasets for training and benchmarking error concealment algorithms. 
We build a corruption model for volumetric video streaming based on network models, and then build a large-scale error concealment dataset with reference frames. We analyze the corruption patterns in the collected dataset and point out potential directions for building effective error concealment models for volumetric video. In summary, we conducted an in-depth study of three major challenges (efficiency, privacy, and generalization) toward a better volumetric video streaming system and proposed effective methods to tackle them. We evaluate our methods on prototype systems under various conditions to confirm their applicability. At the end of the thesis, we present several insights for future research. |
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13505