Tiled adaptive streaming of live 360 video
A flood of 360-degree cameras has entered the market, ranging from the Ricoh Theta through various contraptions built from GoPros to specialised surveillance cameras. Most of these cameras require a lot of offline post-processing before a 360 video that has been converted into the so-called equirectangular format can be downloaded, for example through YouTube.
But a few cameras provide live streams, and this provides a great opportunity for viewers. With just a single camera installed at an event, viewers can take control over what they see at any given time, using pan-tilt-zoom on their mobile phones while watching the live stream. Perhaps they can also jump a few seconds back and see another part of the view.
This thesis aims at investigating algorithms for reducing bandwidth waste while achieving short delays in streaming to Android and iOS mobile phones.
Adaptive video streaming over HTTP is a reality of our everyday lives. From NRK to Netflix, every commercial provider of video broadcast services uses it to deliver their video content. It is used by YouTube and Vimeo for delivering all content that was produced in recent years and, in fact, by vast numbers of other services.
Its big claim to fame is that you can get a simple service up and running by storing a video encoded in several qualities, along with a file describing them, on a normal web server, and letting a client download the quality that is best suited for its purposes.
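As a rough illustration of the client-side choice described above, the following Python sketch picks the highest encoded quality whose bitrate fits within a fraction of the measured throughput. The function name pick_quality, the safety factor of 0.8 and the example bitrates are assumptions made for illustration only; real DASH and HLS players use more elaborate heuristics.

```python
def pick_quality(bitrates_bps, throughput_bps, safety=0.8):
    """Pick the highest bitrate that fits within a safety margin of the throughput.

    bitrates_bps: bitrates (bits per second) of the qualities stored on the server.
    throughput_bps: the client's most recent throughput estimate.
    safety: hypothetical margin; only this fraction of the estimate is used.
    """
    budget = throughput_bps * safety
    fitting = [b for b in sorted(bitrates_bps) if b <= budget]
    # If nothing fits, fall back to the lowest quality rather than stalling.
    return fitting[-1] if fitting else min(bitrates_bps)


if __name__ == "__main__":
    ladder = [500_000, 1_500_000, 4_000_000]  # assumed example qualities
    print(pick_quality(ladder, 2_500_000))
```

With the assumed ladder and a measured throughput of 2.5 Mbit/s, the budget is 2 Mbit/s, so the 1.5 Mbit/s quality is chosen.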
With 360-degree video and VR video, there are more factors to consider. A viewer will not watch the full 360 degrees of the video at once, but only a part of the viewing area, perhaps up to 180 degrees. It therefore makes little sense to use precious bandwidth for downloading the complete video and then show only half or less of all those downloaded pixels.
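To make the potential saving concrete, here is a minimal Python sketch that assumes the equirectangular frame is split into equally wide vertical tile columns and that only the viewer's horizontal viewing direction (yaw) and field of view matter. The function name visible_tile_columns and the choice of 8 columns are hypothetical; an actual tiling scheme would also consider the vertical direction and may prefetch neighbouring tiles.

```python
import math


def visible_tile_columns(yaw_deg, fov_deg, num_columns):
    """Return the sorted indices of tile columns that overlap the viewport.

    Column i is assumed to cover yaw angles from i * (360 / num_columns)
    up to (i + 1) * (360 / num_columns) degrees. The viewport is centred
    on yaw_deg and spans fov_deg degrees horizontally.
    """
    if fov_deg >= 360.0:
        return list(range(num_columns))
    tile_width = 360.0 / num_columns
    # Left edge of the viewport, wrapped into the range [0, 360).
    left = (yaw_deg - fov_deg / 2.0) % 360.0
    right = left + fov_deg  # may exceed 360; wrapped per index below
    first = int(left // tile_width)
    last = math.ceil(right / tile_width) - 1
    return sorted({i % num_columns for i in range(first, last + 1)})


if __name__ == "__main__":
    tiles = visible_tile_columns(yaw_deg=180.0, fov_deg=180.0, num_columns=8)
    print(tiles, f"{len(tiles)}/8 columns needed")
```

Under these assumptions, a 180-degree viewport centred at yaw 180 touches 4 of the 8 columns, so roughly half of the tiles would not need to be fetched at full quality.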
In this thesis, you will explore server-side and client-side options for retrieving only required data and evaluate the resulting quality.
Our existing baseline for live video streaming is built from the following pieces:
- the nginx web server and proxy
- the ffmpeg video coding tool
- RATS - a live encoder using Nvidia's NVENC encoder (DOI 10.1145/3304109.3323837)
Please note that the well-known gpac software uses Kvazaar to create H.265 tiles. This gives great quality but needs cloud servers to approach real-time speed. We want to avoid a cloud solution in this topic because our final aim with this work is to minimize latency. You can find more about tiled video streaming with gpac here: https://github.com/gpac/gpac/wiki/Tiled-Streaming
What you learn
- the video coding standards H.264 and H.265
- the standards for adaptive video streaming over TCP: MPEG-DASH (an ISO standard) and HLS (Apple's version)
- deep insight into TCP congestion control
- knowledge about the way in which users experience 360 and VR videos