Thursday, May 27, 2021

IEEE Communications Magazine: From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom

From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom

Teaser: “Help me, Obi-Wan Kenobi. You’re my only hope,” said the hologram of Princess Leia in Star Wars: Episode IV – A New Hope (1977). This was the first time in cinematic history that the concept of holographic-type communication was illustrated. More than four decades later, technological advancements are quickly moving this type of communication from science fiction to reality.

IEEE Communications Magazine

[PDF]

Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Tim Wauters (Ghent University), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), and Raimund Schatz (AIT Austrian Institute of Technology)

Abstract: Technological improvements are rapidly advancing holographic-type content distribution. Significant research efforts have been made to meet the low-latency and high-bandwidth requirements set by interactive applications such as remote surgery and virtual reality. Recent research has made six degrees of freedom (6DoF) possible for immersive media, where users may both move their heads and change their position within a scene. In this article, we present the status and challenges of 6DoF applications based on volumetric media, focusing on the key aspects required to deliver such services. Furthermore, we present results from a subjective study to highlight relevant directions for future research.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Saturday, May 22, 2021

VCIP’20: FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Learning

FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Learning

IEEE International Conference on Visual Communications and Image Processing (VCIP)
1-4 December 2020, Macau

Ekrem Çetinkaya, Hadi Amirpour, Christian Timmerer, and Mohammad Ghanbari
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we propose to use convolutional neural networks (CNNs) to speed up the encoding of multiple representations, with a specific focus on parallel encoding. In parallel encoding, the overall time-complexity is limited to the maximum time-complexity of one of the representations that are encoded in parallel. Therefore, instead of reducing the time-complexity for all representations, the highest time-complexities are reduced. Experimental results show that FaME-ML achieves significant time-complexity savings in parallel encoding scenarios (41% on average) with a slight increase in bitrate and slight quality degradation compared to the HEVC reference software.
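As a side note on the parallel-encoding bound described above: in the following minimal Python sketch, a small bitrate ladder is encoded concurrently, and the wall-clock time converges to the time of the slowest representation. The ffmpeg invocation, file names, and CRF ladder are illustrative assumptions, not part of FaME-ML itself.

import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

SOURCE = "input.mp4"  # hypothetical source file
LADDER = [(2160, 22), (1080, 26), (720, 30), (360, 34)]  # (height, CRF), illustrative

def encode(height, crf):
    """Encode one representation and return its wall-clock encoding time."""
    start = time.time()
    subprocess.run(
        ["ffmpeg", "-y", "-i", SOURCE,
         "-vf", f"scale=-2:{height}",
         "-c:v", "libx265", "-crf", str(crf),
         f"out_{height}p.mp4"],
        check=True)
    return time.time() - start

t0 = time.time()
with ThreadPoolExecutor(max_workers=len(LADDER)) as pool:
    times = list(pool.map(lambda hc: encode(*hc), LADDER))
wall = time.time() - t0

# With enough cores, wall ~= max(times): only making the slowest (usually the
# highest-quality) representation faster shortens the whole parallel job.
print(f"per-representation times: {times}")
print(f"wall-clock time: {wall:.1f}s vs. max: {max(times):.1f}s")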

Keywords: Video Coding, Convolutional Neural Networks, HEVC, HTTP Adaptive Streaming (HAS)

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

ACM Multimedia’20: Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks

ACM MM’20: Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks

ACM International Conference on Multimedia 2020, Seattle, United States.
https://2020.acmmm.org

[PDF][Slides][Video]

Negin Ghamsarian (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Mario Taschwer (Alpen-Adria-Universität Klagenfurt), and Klaus Schöffmann (Alpen-Adria-Universität Klagenfurt)

Abstract: Recorded cataract surgery videos play a prominent role in training, in investigating the surgery, and in enhancing surgical outcomes. Due to storage limitations in hospitals, however, recorded cataract surgeries are deleted after a short time, and this precious source of information cannot be fully utilized. Lowering the quality to reduce the required storage space is not advisable, since the degraded visual quality results in the loss of relevant information, which limits the usefulness of these videos. To address this problem, we propose a relevance-based compression technique consisting of two modules: (i) relevance detection, which uses neural networks for semantic segmentation and classification of the videos to detect relevant spatio-temporal information, and (ii) content-adaptive compression, which restricts the amount of distortion applied to the relevant content while allocating less bitrate to irrelevant content. The proposed relevance-based compression framework is implemented considering five scenarios based on the definition of relevant information from the target audience’s perspective. Experimental results demonstrate the capability of the proposed approach in relevance detection. We further show that the proposed approach can achieve high compression efficiency by discarding substantial redundant information while retaining the high quality of the relevant content.
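To make the content-adaptive compression module more concrete, here is a minimal sketch of how a per-pixel relevance mask could be turned into per-block QP offsets, spending more bits on relevant blocks and fewer on irrelevant ones. The block size, offsets, and function names are illustrative assumptions, not the paper's exact pipeline.

import numpy as np

BLOCK = 64  # CTU-sized blocks

def qp_offset_map(relevance_mask, relevant_offset=-4, irrelevant_offset=+8):
    """Map a per-pixel relevance mask to per-block QP offsets: negative
    offsets (more bits) for mostly relevant blocks, positive offsets
    (fewer bits) for irrelevant ones."""
    h, w = relevance_mask.shape
    bh, bw = h // BLOCK, w // BLOCK
    offsets = np.full((bh, bw), irrelevant_offset, dtype=np.int32)
    for by in range(bh):
        for bx in range(bw):
            block = relevance_mask[by*BLOCK:(by+1)*BLOCK, bx*BLOCK:(bx+1)*BLOCK]
            if block.mean() > 0.5:  # block is mostly relevant pixels
                offsets[by, bx] = relevant_offset
    return offsets

# Toy mask: the surgical instrument occupies the image centre.
mask = np.zeros((1024, 1920), dtype=np.float32)
mask[256:768, 640:1280] = 1.0
print(qp_offset_map(mask))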

Keywords: Video Coding, Convolutional Neural Networks, HEVC, ROI Detection, Medical Multimedia.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Friday, May 21, 2021

EPIQ’20: Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC Using Retransmission

Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC Using Retransmission

ACM SIGCOMM 2020 Workshop on Evolution, Performance, and Interoperability of QUIC (EPIQ 2020)
August 10–14, 2020
https://conferences.sigcomm.org/sigcomm/2020/workshop-epiq.html

[PDF][Slides][Video]

Minh Nguyen, Hadi Amirpour, Christian Timmerer, Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: HTTP/2 has been explored widely for video streaming, but still suffers from Head-of-Line blocking and three-way handshake delay due to TCP. Meanwhile, QUIC, running on top of UDP, can tackle these issues. In addition, although many adaptive bitrate (ABR) algorithms have been proposed for scalable and non-scalable video streaming, the literature lacks an algorithm designed for both types of video streaming approaches. In this paper, we investigate the impact of QUIC and HTTP/2 on the performance of ABR algorithms in terms of different metrics. Moreover, we propose an efficient approach for utilizing scalable video coding formats for adaptive video streaming that combines a traditional video streaming approach (based on non-scalable video coding formats) and a retransmission technique. The experimental results show that QUIC benefits significantly from our proposed method in the context of packet loss and retransmission. Compared to HTTP/2, it improves the average video quality and also provides smoother adaptation behavior. Finally, we demonstrate that our proposed method, originally designed for non-scalable video codecs, also works efficiently for scalable videos such as Scalable High Efficiency Video Coding (SHVC).
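The retransmission idea can be sketched as a simple scheduling rule: upgrade buffered segments to a higher quality only if the re-download can complete before the segment's playout deadline. All names, sizes, and the throughput estimate below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class BufferedSegment:
    index: int
    quality: int        # quality level currently in the buffer
    playout_in: float   # seconds until this segment is played

UPGRADE_BITS = 4_000_000  # size of the higher-quality version (illustrative)

def upgradable(buffer, throughput_bps, top_quality):
    """Return indices of buffered segments worth retransmitting at a
    higher quality before their playout deadline."""
    chosen = []
    busy_until = 0.0
    for seg in sorted(buffer, key=lambda s: s.playout_in):
        if seg.quality >= top_quality:
            continue
        download_time = UPGRADE_BITS / throughput_bps
        if busy_until + download_time <= seg.playout_in:
            chosen.append(seg.index)
            busy_until += download_time
        # otherwise the buffered copy stays and is played as-is
    return chosen

buf = [BufferedSegment(10, 0, 2.0), BufferedSegment(11, 0, 4.0),
       BufferedSegment(12, 2, 6.0)]
print(upgradable(buf, throughput_bps=6_000_000, top_quality=2))  # [10, 11]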

Keywords: QUIC, H2BR, HTTP adaptive streaming, Retransmission, SHVC

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Tuesday, May 18, 2021

ICME’20: Towards View-aware Adaptive Streaming of Holographic content

Towards View-aware Adaptive Streaming of Holographic content

IEEE International Conference on Multimedia & Expo (ICME) 2020, London, UK.

Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience

[PDF][Slides][Video]

Hadi Amirpour, Christian Timmerer, and Mohammad Ghanbari (University of Essex)
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Holography can reconstruct the three-dimensional structure of an object by recording the full wave field of light emitted from it. Holographic content therefore requires a huge amount of data to be encoded, stored, transmitted, and decoded, which makes its practical usage challenging, especially for bandwidth-constrained networks and memory-limited devices. In the delivery of holographic content via the Internet, bandwidth wastage should be avoided to tackle the high bandwidth demands of holography streaming. For real-time applications, encoding time-complexity is also a major problem. In this paper, the concept of dynamic adaptive streaming over HTTP (DASH) is extended to holographic image streaming, and view-aware adaptation techniques are studied. As each area of a hologram contains the information of a specific view, instead of encoding and decoding the entire hologram, just the part required to render the selected view is encoded and transmitted via the network, based on the user’s interactivity. Four different strategies, namely monolithic, single-view, adaptive-view, and non-real-time streaming, are explained and compared in terms of bandwidth requirements, encoding time-complexity, and bitrate overhead. Experimental results show that the view-aware methods reduce the required bandwidth for holography streaming at the cost of a bitrate increase.
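The view-aware idea can be illustrated with a toy mapping from a horizontal viewing angle to the sub-hologram needed to render that view; only this window would then be encoded and transmitted. The linear angle-to-position mapping and all dimensions below are illustrative assumptions, not the paper's exact geometry.

import numpy as np

def view_window(hologram, view_angle, max_angle=15.0, window_width=512):
    """Select the sub-hologram needed to render one horizontal view,
    so only this region has to be encoded and transmitted."""
    h, w = hologram.shape
    # normalise the angle in [-max_angle, max_angle] to a horizontal centre
    centre = int((view_angle + max_angle) / (2 * max_angle) * (w - window_width)
                 + window_width // 2)
    left = max(0, min(w - window_width, centre - window_width // 2))
    return hologram[:, left:left + window_width]

holo = np.random.rand(2048, 4096)   # stand-in for a recorded wave field
tile = view_window(holo, view_angle=-5.0)
print(tile.shape)                   # (2048, 512): far less data to encode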

Keywords: Holography, compression, bitrate adaptation, dynamic adaptive streaming over HTTP, DASH.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Saturday, May 15, 2021

MPEG news: a report from the 134th meeting (virtual)

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will also be posted at ACM SIGMM Records.

MPEG News Archive

The 134th MPEG meeting was once again held as an online meeting. The official press release can be found here and comprises the following items:

  • First International Standard on Neural Network Compression for Multimedia Applications
  • Completion of the carriage of VVC and EVC
  • Completion of the carriage of V3C in ISOBMFF
  • Calls for Proposals: (a) New Advanced Genomics Features and Technologies, (b) MPEG-I Immersive Audio, and (c) Coded Representation of Haptics
  • MPEG evaluated responses on Incremental Compression of Neural Networks
  • Progression of MPEG 3D Audio standards
  • First milestone of the development of the Open Font Format (2nd amendment)
  • Verification tests: (a) Low Complexity Enhancement Video Coding (LCEVC) and (b) more application cases of Versatile Video Coding (VVC)
  • Standardization work on version 2 of VVC and VSEI started

In this column, the focus is on streaming-related aspects including a brief update about MPEG-DASH.

First International Standard on Neural Network Compression for Multimedia Applications

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, extraction of multimedia descriptors, or image and video coding. The trained neural networks for these applications contain many parameters (i.e., weights), resulting in a considerable size. Thus, transferring them to several clients (e.g., mobile phones, smart cameras) benefits from a compressed representation of neural networks.

At the 134th MPEG meeting, MPEG Video ratified the first international standard on Neural Network Compression for Multimedia Applications (ISO/IEC 15938-17), designed as a toolbox of compression technologies. The specification contains different methods for

  • parameter reduction (e.g., pruning, sparsification, matrix decomposition),
  • parameter transformation (e.g., quantization), and
  • entropy coding,

which can be assembled into encoding pipelines that combine one method from each group (or several, in the case of parameter reduction).

The results show that trained neural networks for many common multimedia problems, such as image or audio classification or image compression, can be compressed by a factor of 10 to 20 with no performance loss, and even by more than 30 with a performance trade-off. The specification is not limited to a particular neural network architecture and is independent of the choice of neural network exchange format. Interoperability with common neural network exchange formats is described in the annexes of the standard.
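As a toy illustration of the toolbox idea (explicitly not the standard's actual coding tools), the following sketch chains the three method groups: magnitude pruning (parameter reduction), 8-bit quantization (parameter transformation), and DEFLATE (entropy coding), then reports the achieved compression factor.

import zlib
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.05, size=(512, 512)).astype(np.float32)

# 1) parameter reduction: sparsification by magnitude pruning
pruned = np.where(np.abs(weights) < 0.05, 0.0, weights)

# 2) parameter transformation: uniform 8-bit quantization
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)

# 3) entropy coding: here simply DEFLATE over the quantized bytes
compressed = zlib.compress(quantized.tobytes(), level=9)

print(f"original  : {weights.nbytes} bytes")
print(f"compressed: {len(compressed)} bytes "
      f"(factor {weights.nbytes / len(compressed):.1f}x)")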

As neural networks are becoming increasingly important, communicating them over heterogeneous networks to a plethora of devices raises various challenges, including efficient compression, which is inevitable and addressed in this standard. ISO/IEC 15938 is commonly referred to as MPEG-7 (or the “multimedia content description interface”), and this specification now becomes part 17 of MPEG-7.

Research aspects: As with all compression-related standards, research aspects relate to compression efficiency (lossy/lossless), computational complexity (runtime, memory), and quality-related aspects. Furthermore, the compression of neural networks for multimedia applications will probably enable new types of applications and services to be deployed in the (near) future. Finally, simultaneous delivery and consumption (i.e., streaming) of neural networks, including incremental updates thereof, will become a requirement for networked media applications and services.

Carriage of Media Assets

At the 134th MPEG meeting, MPEG Systems completed the carriage of various media assets in MPEG-2 Systems (Transport Stream) and in the ISO Base Media File Format (ISOBMFF).

In particular, the standards for the carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC) over MPEG-2 Transport Stream (M2TS) and the ISO Base Media File Format (ISOBMFF) reached their final stage of standardization:

  • For M2TS, the standard defines constraints on VVC and EVC elementary streams to carry them in packetized elementary stream (PES) packets. Additionally, buffer management mechanisms and a transport system target decoder (T-STD) model extension are defined.
  • For ISOBMFF, the carriage of codec initialization information for VVC and EVC is defined in the standard. Additionally, it also defines samples and sub-samples reflecting the high-level bitstream structure and independently decodable units of both video codecs. For VVC, signaling and extraction of a certain operating point are also supported.

Finally, MPEG Systems completed the standard for the carriage of Visual Volumetric Video-based Coding (V3C) data using ISOBMFF. The standard supports media comprising multiple independent component bitstreams and considers that only some portions of immersive media assets need to be rendered according to the user's position and viewport. Thus, it defines metadata indicating the relationship between the regions of the 3D spatial data to be rendered and their locations in the bitstream. In addition, the delivery of ISOBMFF files containing V3C content over DASH and MMT is also specified in this standard.
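All of these carriage standards build on the ISOBMFF box structure: every box starts with a 32-bit size and a four-character type. The following minimal parser walks the top-level boxes of an MP4 file to show the containers into which codec-specific payloads are placed; it is a sketch that skips the edge cases noted in the comments.

import struct
import sys

def list_top_level_boxes(path):
    """Print the type and size of each top-level ISOBMFF box in a file."""
    with open(path, "rb") as f:
        while header := f.read(8):
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            if size == 1:  # 64-bit largesize follows the type field
                size = struct.unpack(">Q", f.read(8))[0]
                body = size - 16
            else:
                body = size - 8
            # note: size == 0 ("box extends to end of file") is not handled here
            print(f"{box_type.decode('ascii', 'replace')}: {size} bytes")
            f.seek(body, 1)  # skip the box body

list_top_level_boxes(sys.argv[1])  # e.g. python boxes.py video.mp4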

Research aspects: The carriage of VVC, EVC, and V3C using M2TS or ISOBMFF provides an essential building block within the so-called multimedia systems layer, resulting in a plethora of research challenges, as it typically offers an interoperable interface to the actual media assets. Thus, these standards enable efficient and flexible provisioning and/or use of these media assets, aspects that are deliberately not defined in the standards and are subject to competition.

Call for Proposals and Verification Tests

At the 134th MPEG meeting, MPEG issued three Call for Proposals (CfPs) that are briefly highlighted in the following:

  • Coded Representation of Haptics: Haptics provide an additional layer of entertainment and sensory immersion beyond audio and visual media. This CfP aims to specify a coded representation of haptics data, e.g., to be carried using ISO Base Media File Format (ISOBMFF) files in the context of MPEG-DASH or other MPEG-I standards.
  • MPEG-I Immersive Audio: Immersive Audio will complement other parts of MPEG-I (i.e., Part 3, “Immersive Video” and Part 2, “Systems Support”) in order to provide a suite of standards that will support a Virtual Reality (VR) or an Augmented Reality (AR) presentation in which the user can navigate and interact with the environment using 6 degrees of freedom (6 DoF), that being spatial navigation (x, y, z) and user head orientation (yaw, pitch, roll).
  • New Advanced Genomics Features and Technologies: This CfP aims to collect submissions of new technologies that can (i) provide improvements to the current compression, transport, and indexing capabilities of the ISO/IEC 23092 standards suite, particularly when applied to data consisting of very long reads generated by third-generation sequencing devices, (ii) provide support for the representation and usage of graph genome references, (iii) include coding modes relying on machine learning processes, satisfying the data access modalities required by machine learning and providing higher compression, and (iv) support interfaces with existing standards for the interchange of clinical data.

Detailed information, including instructions on how to respond to the calls for proposals, the requirements that must be considered, the test data to be used, and the submission and evaluation procedures for proponents, is available at www.mpeg.org.

Calls for proposals typically mark the beginning of formal standardization work, whereas verification tests are conducted once a standard has been completed. At the 134th MPEG meeting, and despite the difficulties caused by the pandemic situation, MPEG completed verification tests for Versatile Video Coding (VVC) and Low Complexity Enhancement Video Coding (LCEVC).

For LCEVC, verification tests measured the benefits of enhancing four existing codecs of different generations (i.e., AVC, HEVC, EVC, VVC) using tools as defined in LCEVC within two sets of tests:

  • The first set of tests compared LCEVC-enhanced encoding with full-resolution single-layer anchors. The average bit rate savings produced by LCEVC when enhancing AVC were determined to be approximately 46% for UHD and 28% for HD; when enhancing HEVC, approximately 31% for UHD and 24% for HD. Test results tend to indicate an overall benefit also when using LCEVC to enhance EVC and VVC.
  • The second set of tests confirmed that LCEVC provided a more efficient means of resolution enhancement of half-resolution anchors than unguided up-sampling. Comparing LCEVC full-resolution encoding with the up-sampled half-resolution anchors, the average bit-rate savings when using LCEVC with AVC, HEVC, EVC and VVC were calculated to be approximately 28%, 34%, 38%, and 32% for UHD and 27%, 26%, 21%, and 21% for HD, respectively.

For VVC, this was already the second round of verification testing, covering the following aspects:

  • 360-degree video in equirectangular and cubemap formats, where VVC shows on average more than 50% bit rate reduction compared to the previous major generation of MPEG video coding, High Efficiency Video Coding (HEVC), developed in 2013;
  • low-delay applications such as compression of conversational (teleconferencing) and gaming content, where the compression benefit is about 40% on average; and
  • HD video streaming, with an average bit rate reduction of close to 50%.

A previous set of tests for 4K UHD content completed in October 2020 had shown similar gains. These verification tests used formal subjective visual quality assessment testing with “naïve” human viewers. The tests were performed under a strict hygienic regime in two test laboratories to ensure safe conditions for the viewers and test managers.

Research aspects: CfPs offer a unique possibility for researchers to propose research results for adoption into future standards. Verification tests provide objective or/and subjective evaluations of standardized tools which typically conclude the life cycle of a standard. The results of the verification tests are usually publicly available and can be used as a baseline for future improvements of the respective standards including the evaluation thereof.

DASH Update!

Finally, I’d like to provide a brief update on MPEG-DASH! At the 134th MPEG meeting, MPEG Systems recommended the approval of ISO/IEC FDIS 23009-1 5th edition. That is, the 5th edition of the MPEG-DASH core specification will become available sometime this year. Additionally, MPEG requests that this specification become freely available, which also marks an important milestone in the development of the MPEG-DASH standard. Most importantly, the 5th edition incorporates CMAF support as well as other enhancements defined in the amendment of the previous edition. The MPEG-DASH subgroup of MPEG Systems is already working on the first amendment to the 5th edition, entitled “Preroll, nonlinear playback, and other extensions.” It is expected that the 5th edition will also impact related specifications within MPEG as well as in other Standards Developing Organizations (SDOs) such as DASH-IF, which defines interoperability points (IOPs) for various codecs, and CTA WAVE (Web Application Video Ecosystem), which defines device playback capabilities as well as the Common Media Client Data (CMCD). Both DASH-IF and CTA WAVE provide (conformance) test infrastructure for DASH and CMAF.
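Since CMCD is mentioned above, here is a sketch of how a player might attach Common Media Client Data to a segment request as a query argument. The key subset and values are illustrative; see CTA-5004 for the full key registry and the alternative header-based transmission modes.

from urllib.parse import quote

def with_cmcd(segment_url, *, bitrate_kbps, buffer_ms, throughput_kbps, session_id):
    """Append a CMCD query argument to a segment URL (illustrative key subset)."""
    cmcd = {
        "br": bitrate_kbps,      # encoded bitrate of the requested object
        "bl": buffer_ms,         # current buffer length
        "mtp": throughput_kbps,  # measured throughput
        "sid": f'"{session_id}"',
    }
    payload = ",".join(f"{k}={v}" for k, v in sorted(cmcd.items()))
    sep = "&" if "?" in segment_url else "?"
    return f"{segment_url}{sep}CMCD={quote(payload)}"

print(with_cmcd("https://cdn.example.com/seg42.m4s",
                bitrate_kbps=3000, buffer_ms=8000,
                throughput_kbps=12000, session_id="6e2fb550"))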

An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH status as of April 2021.

Research aspects: MPEG-DASH was ratified almost ten years ago and has since resulted in a plethora of research articles, mostly related to adaptive bitrate (ABR) algorithms and their impact on streaming performance, including the Quality of Experience (QoE). An overview of bitrate adaptation schemes, including a list of open challenges and issues, is provided here.
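For readers new to the topic, here is a minimal sketch of the kind of ABR logic this research body studies: a throughput-based rule with a safety margin. The bitrate ladder and margin are illustrative; real players combine such rules with buffer-based and other signals.

def select_representation(bitrates_bps, measured_throughput_bps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of the measured
    throughput; fall back to the lowest representation otherwise."""
    budget = measured_throughput_bps * safety
    feasible = [b for b in bitrates_bps if b <= budget]
    return max(feasible) if feasible else min(bitrates_bps)

ladder = [500_000, 1_500_000, 3_000_000, 6_000_000]
print(select_representation(ladder, measured_throughput_bps=4_200_000))
# -> 3_000_000: the 6 Mbit/s representation would exceed 80% of throughput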

The 135th MPEG meeting will be again an online meeting in July 2021. Click here for more information about MPEG meetings and their developments.

Wednesday, May 12, 2021

NetSoft2020: On Optimizing Resource Utilization in AVC-based Real-time Video Streaming

On Optimizing Resource Utilization in AVC-based Real-time Video Streaming

IEEE Conference on Network Softwarization

29 June-3 July 2020 // Ghent, Belgium

http://netsoft2020.netsoft-ieee.org

[PDF][Slides][Video]

Alireza Erfanian‡, Farzad Tashtarian†, Reza Farahani‡, Christian Timmerer‡, Hermann Hellwagner‡

‡ Christian Doppler Laboratory ATHENA, Institute of Information Technology (ITEC), Alpen-Adria-Universität Klagenfurt, Austria
† Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran

Abstract: Real-time video streaming traffic and related applications have witnessed significant growth in recent years. However, this has been accompanied by some challenging issues, predominantly resource utilization. IP multicasting, a potential solution to this problem, suffers from many practical problems. Scalable video coding could not gain wide adoption in the industry due to reduced compression efficiency and additional computational complexity. The emerging software-defined networking (SDN) and network function virtualization (NFV) paradigms enable researchers to cope with IP multicasting issues in novel ways. In this paper, by leveraging the SDN and NFV concepts, we introduce a cost-aware approach to provide advanced video coding (AVC)-based real-time video streaming services in the network. In this study, we use two types of virtualized network functions (VNFs): virtual reverse proxy (VRP) and virtual transcoder (VTF) functions. At the edge of the network, VRPs are responsible for collecting clients’ requests and sending them to an SDN controller. The controller then executes a mixed-integer linear program (MILP) to determine an optimal multicast tree from an appropriate set of video source servers to the optimal group of transcoders. The desired video is sent over the multicast tree. The VTFs transcode the received video segments and stream them to the requesting VRPs over unicast paths. To mitigate the time complexity of the proposed MILP model, we propose a heuristic algorithm that determines a near-optimal solution in a reasonable amount of time. Using the Mininet emulator, we evaluate the proposed approach and show that it achieves better performance in terms of cost and resource utilization in comparison with traditional multicast and unicast approaches.
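To give a flavor of this kind of optimization (not the paper's actual MILP), here is a toy placement-and-assignment program written with PuLP, a solver library the paper does not mention: open transcoders and assign clients to them at minimum total cost. All node names, costs, and constraints are illustrative assumptions.

import pulp

nodes = ["n1", "n2", "n3"]                # candidate transcoder (VTF) locations
clients = ["c1", "c2", "c3", "c4"]        # edge proxies (VRPs) with requests
open_cost = {"n1": 10, "n2": 6, "n3": 8}  # cost of running a transcoder
link_cost = {("c1", "n1"): 1, ("c1", "n2"): 4, ("c1", "n3"): 3,
             ("c2", "n1"): 2, ("c2", "n2"): 1, ("c2", "n3"): 4,
             ("c3", "n1"): 5, ("c3", "n2"): 2, ("c3", "n3"): 1,
             ("c4", "n1"): 4, ("c4", "n2"): 3, ("c4", "n3"): 1}

prob = pulp.LpProblem("transcoder_placement", pulp.LpMinimize)
use = pulp.LpVariable.dicts("use", nodes, cat="Binary")
assign = pulp.LpVariable.dicts(
    "assign", [(c, n) for c in clients for n in nodes], cat="Binary")

# objective: transcoder activation cost plus delivery (link) cost
prob += (pulp.lpSum(open_cost[n] * use[n] for n in nodes)
         + pulp.lpSum(link_cost[c, n] * assign[c, n]
                      for c in clients for n in nodes))

for c in clients:  # every client is served by exactly one transcoder
    prob += pulp.lpSum(assign[c, n] for n in nodes) == 1
for c in clients:  # clients may only be assigned to opened transcoders
    for n in nodes:
        prob += assign[c, n] <= use[n]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("opened:", [n for n in nodes if use[n].value() == 1])
print("total cost:", pulp.value(prob.objective))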

Keywords—Dynamic Adaptive Streaming over HTTP (DASH), Real-time Video Streaming, Software Defined Networking (SDN), Video Transcoding, Network Function Virtualization (NFV).

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Monday, May 10, 2021

DCC’20: Fast Multi-Rate Encoding for Adaptive HTTP Streaming

 Fast Multi-Rate Encoding for Adaptive HTTP Streaming

Data Compression Conference 2020, March 24 – 27, Cliff Lodge, Snowbird, UT

[PDF] [SigPort]

Hadi Amirpour, Ekrem Çetinkaya, Christian Timmerer, and Mohammad Ghanbari (University of Tehran, University of Essex)
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Adaptive HTTP streaming is the preferred method to deliver multimedia content on the Internet. It provides multiple representations of the same content in different qualities (i.e., bit-rates and resolutions) and allows the client to request segments from the available representations in a dynamic, adaptive way depending on its context. The growing number of representations in adaptive HTTP streaming makes the encoding of one video segment at different representations a challenging task in terms of encoding time-complexity. In this paper, information from both the highest and the lowest quality representations is used to limit Rate-Distortion Optimization (RDO) for each Coding Tree Unit (CTU) in High Efficiency Video Coding. Our proposed method first encodes the highest quality representation and subsequently uses it to encode the lowest quality representation. The block structure and the selected reference frame of both the highest and the lowest quality representations are then used to predict and shorten the RDO process of each CTU for the intermediate quality representations. Our proposed method introduces a delay of only two CTUs thanks to parallel processing techniques. Experimental results show that a significant reduction in time-complexity over the reference software (38%) and the state-of-the-art (10%) is achieved while quality degradation is negligible.
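The core idea can be sketched in a few lines: the CU depths chosen by the highest- and lowest-quality encodes bound the depth search for the intermediate representations. The depth values below are illustrative; real encoders apply this per CU inside the CTU quadtree.

def depth_candidates(depth_highest, depth_lowest):
    """Intermediate-quality representations usually pick a CU depth between
    the one used at the lowest quality (coarser blocks) and the one used at
    the highest quality (finer blocks), so RDO only has to test that range."""
    lo, hi = sorted((depth_lowest, depth_highest))
    return range(lo, hi + 1)

# Full RDO would evaluate depths 0..3 (64x64 down to 8x8 CUs); here only
# depths 1..2 remain to be tested for this CTU.
print(list(depth_candidates(depth_highest=2, depth_lowest=1)))  # [1, 2]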

Keywords:  HTTP adaptive streaming, Multi-rate encoding, HEVC, Fast block partitioning

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Friday, May 7, 2021

PV’20: H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive Video Streaming

H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive Video Streaming

Packet Video Workshop 2020 (PV)
June 10-11, 2020, Istanbul, Turkey (co-located with ACM MMSys’20)

https://2020.packet.video/

[PDF][Slides][Video]

Minh Nguyen, Christian Timmerer, Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: HTTP-based Adaptive Streaming (HAS) plays a key role in over-the-top video streaming. It contributes towards reducing the rebuffering duration of video playout by adapting the video quality to the current network conditions. However, it incurs variations of video quality within a streaming session because of throughput fluctuation, which impacts the user’s Quality of Experience (QoE). Besides, many adaptive bitrate (ABR) algorithms choose the lowest-quality segments at the beginning of the streaming session to ramp up the playout buffer as soon as possible. Although this strategy decreases the startup time, users can be annoyed as they have to watch a low-quality video initially. In this paper, we propose an efficient retransmission technique, namely H2BR, to replace low-quality segments stored in the playout buffer with higher-quality versions by using features of HTTP/2, including (i) stream priority, (ii) server push, and (iii) stream termination. The experimental results show that H2BR helps users avoid watching low video quality during playback and improves the user’s QoE. H2BR can reduce the time during which users suffer the lowest-quality video by more than 70% and can improve the QoE by up to 13%.
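The stream-termination feature can be illustrated with a simple abort rule: an in-flight higher-quality retransmission is cancelled (e.g., via HTTP/2 stream termination) once it can no longer finish before the segment's playout deadline, so the bandwidth is not wasted. The numbers below are illustrative assumptions.

def should_terminate(remaining_bytes, throughput_bps, time_to_playout_s):
    """Abort the upgrade stream if the remaining download cannot complete
    before the segment is due for playout."""
    expected_completion_s = remaining_bytes * 8 / throughput_bps
    return expected_completion_s > time_to_playout_s

# 1.5 MB still to download at 4 Mbit/s -> ~3 s needed, but the segment
# plays in 2 s: terminate and keep the buffered low-quality copy.
print(should_terminate(1_500_000, 4_000_000, 2.0))  # True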

Keywords: HTTP adaptive streaming, DASH, ABR algorithms, QoE, HTTP/2

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

MMSys’20: Cloud-based Adaptive Video Streaming Evaluation Framework for the Automated Testing of Media Players (CAdViSE)

CAdViSE: Cloud-based Adaptive Video Streaming Evaluation Framework for the Automated Testing of Media Players

ACM Multimedia Systems Conference 2020 (MMSys 2020)
https://2020.acmmmsys.org/
Babak Taraghi, Anatoliy Zabrovskiy, Christian Timmerer and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: To cope with fluctuations of network conditions in terms of available bandwidth, latency, and packet loss, and to deliver the highest quality of video (and audio) content to users, research on adaptive video streaming has attracted intense efforts from the research community and huge investments from technology giants. How successful these efforts and investments are is a question that requires precise measurement of the results of those technological advancements. HTTP-based Adaptive Streaming (HAS) algorithms, which seek to improve video streaming over the Internet, introduce video bitrate adaptivity in a way that is scalable and efficient. However, the wide spectrum of variables and configuration options that each HAS implementation must take into account makes measuring the results and visualizing the statistics of performance and quality of experience a highly complex task. In this paper, we introduce CAdViSE, our Cloud-based Adaptive Video Streaming Evaluation framework for the automated testing of adaptive media players. The paper demonstrates a test environment that can be instantiated in a cloud infrastructure, examines multiple media players with different network attributes at defined points of the experiment time, and concludes the evaluation with visualized statistics and insights into the results.
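The network-attribute shaping that CAdViSE automates can be sketched with Linux tc/netem: apply a time-varying profile of rate, delay, and loss during an experiment. The interface name and profile below are illustrative assumptions, the script requires root privileges, and CAdViSE's own implementation may differ.

import subprocess
import time

IFACE = "eth0"                         # illustrative interface name
PROFILE = [                            # (duration s, rate, delay, loss)
    (30, "5mbit", "50ms", "0%"),
    (30, "1mbit", "150ms", "1%"),
    (30, "3mbit", "80ms", "0.5%"),
]

def apply(rate, delay, loss):
    """Replace the root qdisc with a netem instance for the given shaping."""
    subprocess.run(["tc", "qdisc", "replace", "dev", IFACE, "root",
                    "netem", "rate", rate, "delay", delay, "loss", loss],
                   check=True)

for duration, rate, delay, loss in PROFILE:
    apply(rate, delay, loss)
    time.sleep(duration)               # hold this network condition

subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"], check=True)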

Keywords: HTTP Adaptive Streaming, Media Players, MPEG-DASH, Network Emulation, Automated Testing, Quality of Experience

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

ACM Reference Format:
Babak Taraghi, Anatoliy Zabrovskiy, Christian Timmerer, and Hermann Hellwagner. 2020. CAdViSE: Cloud-based Adaptive Video Streaming Evaluation Framework for the Automated Testing of Media Players. In 11th ACM Multimedia Systems Conference (MMSys’20), June 8–11, 2020, Istanbul, Turkey, 4 pages. https://doi.org/10.1145/3339825.3393581

Thursday, May 6, 2021

ACM TOMM: Performance Analysis of ACTE: a Bandwidth Prediction Method for Low-Latency Chunked Streaming

Performance Analysis of ACTE: a Bandwidth Prediction Method for Low-Latency Chunked Streaming

ACM Transactions on Multimedia Computing, Communications, and Applications

[PDF]

Abdelhak Bentaleb (National University of Singapore), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Ali C. Begen (Ozyegin University, Networked Media), Roger Zimmermann (National University of Singapore)

Abstract: HTTP adaptive streaming with chunked transfer encoding can offer low-latency streaming without sacrificing coding efficiency, as it allows media segments to be delivered while still being packaged. However, conventional schemes often make highly inaccurate bandwidth measurements due to the presence of idle periods between the chunks, causing sub-optimal adaptation decisions. To address this issue, we earlier proposed ACTE (ABR for Chunked Transfer Encoding), a bandwidth prediction scheme for low-latency chunked streaming. While ACTE was a significant step forward, in this study we focus on two remaining open areas, namely (i) quantifying the impact of encoding parameters, including chunk and segment durations, bitrate levels, minimum interval between IDR-frames, and frame rate on ACTE, and (ii) exploring the impact of video content complexity on ACTE. We thoroughly investigate these questions and report on our findings. We also discuss some additional issues that arise in the context of pursuing very low latency HTTP video streaming.
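ACTE's bandwidth prediction is based on recursive least squares (RLS), as the keywords below indicate; the following minimal sketch fits an autoregressive model over past chunk-level bandwidth samples and predicts the next one. The model order, forgetting factor, and sample values are illustrative assumptions, not ACTE's exact configuration.

import numpy as np

class RLSPredictor:
    def __init__(self, order=3, forgetting=0.95):
        self.order = order
        self.lam = forgetting
        self.w = np.zeros(order)      # AR coefficients
        self.P = np.eye(order) * 1e3  # inverse correlation matrix

    def update(self, history, target):
        """Standard RLS update with forgetting factor."""
        x = np.asarray(history[-self.order:])[::-1]  # newest sample first
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)                 # gain vector
        self.w += k * (target - self.w @ x)
        self.P = (self.P - np.outer(k, Px)) / self.lam

    def predict(self, history):
        x = np.asarray(history[-self.order:])[::-1]
        return float(self.w @ x)

# Feed chunk-level bandwidth samples (Mbit/s) and predict the next one.
samples = [4.0, 4.2, 4.1, 4.4, 4.3, 4.6, 4.5, 4.8]
rls = RLSPredictor()
for i in range(3, len(samples)):
    rls.update(samples[:i], samples[i])
print(f"next-chunk bandwidth estimate: {rls.predict(samples):.2f} Mbit/s")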

Keywords: HAS; ABR; DASH; CMAF; low-latency; HTTP chunked transfer encoding; bandwidth measurement and prediction; RLS; encoding parameters; FFmpeg

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.


QoMEX’20: Objective and Subjective QoE Evaluation for Adaptive Point Cloud Streaming

 Objective and Subjective QoE Evaluation for Adaptive Point Cloud Streaming

*** Best Paper Award ***

International Conference on Quality of Multimedia Experience (QoMEX)
May 26-28, 2020, Athlone, Ireland
http://qomex2020.ie/

[PDF][Slides][Video]

Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), Raimund Schatz (Alpen-Adria Universität Klagenfurt & AIT Austrian Institute of Technology, Austria)

Abstract: Volumetric media has the potential to provide the six degrees of freedom (6DoF) required by truly immersive media. However, achieving 6DoF requires ultra-high bandwidth transmissions, which real-world wide area networks cannot provide economically. Therefore, recent efforts have started to target efficient delivery of volumetric media, using a combination of compression and adaptive streaming techniques. It remains, however, unclear how the effects of such techniques on the user-perceived quality can be accurately evaluated. In this paper, we present the results of an extensive objective and subjective quality of experience (QoE) evaluation of volumetric 6DoF streaming. We use PCC-DASH, a standards-compliant means for HTTP adaptive streaming of scenes comprising multiple dynamic point cloud objects. By means of a thorough analysis, we investigate the perceived quality impact of the available bandwidth, rate adaptation algorithm, viewport prediction strategy, and the user’s motion within the scene. We determine which of these aspects has the most impact on the user’s QoE, and to what extent subjective and objective assessments are aligned.
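A viewport-aware rate-allocation heuristic in the spirit of PCC-DASH can be sketched as follows: start every point cloud object at the lowest quality, then spend the remaining budget on upgrading the most visible objects first. The visibility weights and quality ladder are illustrative assumptions, not the paper's exact algorithm.

def allocate_bitrate(objects, ladder_bps, budget_bps):
    """objects: name -> visibility weight; returns name -> chosen bitrate."""
    choice = {name: ladder_bps[0] for name in objects}  # everyone starts low
    spent = sum(choice.values())
    # upgrade the most visible objects first, step by step up the ladder
    for name in sorted(objects, key=objects.get, reverse=True):
        for rate in ladder_bps[1:]:
            extra = rate - choice[name]
            if spent + extra <= budget_bps:
                spent += extra
                choice[name] = rate
    return choice

scene = {"dancer": 1.0, "chair": 0.3, "background": 0.1}
print(allocate_bitrate(scene, [2_000_000, 5_000_000, 12_000_000],
                       budget_bps=20_000_000))
# -> the dancer gets the top quality; less visible objects stay lower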

Keywords: Volumetric Media; HTTP Adaptive Streaming; 6DoF; MPEG V-PCC; QoE Assessment; Objective Metrics

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Wednesday, May 5, 2021

ICME’20: Multi-Period Per-Scene Optimization for HTTP Adaptive Streaming

 Multi-Period Per-Scene Optimization for HTTP Adaptive Streaming

IEEE International Conference on Multimedia and Expo
July 06 – 10, London, United Kingdom
https://www.2020.ieeeicme.org/

[PDF][Slides][Video]

Venkata Phani Kumar M, Christian Timmerer and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria

Abstract: Video delivery over the Internet has become more and more established in recent years due to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The current DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of periods, adaptation sets, representations, and segments. Although multi-period MPDs are widely used in live streaming scenarios, they are not fully utilized in Video-on-Demand (VoD) HTTP adaptive streaming (HAS) scenarios. In this paper, we introduce MiPSO, a framework for Multi-Period Per-Scene Optimization, to examine multiple periods in VoD HAS scenarios. MiPSO provides different encoded representations of a video at either (i) the maximum possible quality or (ii) the minimum possible bitrate, beneficial to both service providers and subscribers. In each period, the proposed framework adjusts the video representations (resolution-bitrate pairs) by taking into account the complexity of the video content, with the aim of achieving streams at either higher qualities or lower bitrates. The experimental evaluation with a test video data set shows that MiPSO reduces the average bitrate of streams with the same visual quality by approximately 10% or increases the visual quality of streams by at least 1 dB in terms of Peak Signal-to-Noise Ratio (PSNR) at the same bitrate, compared to conventional approaches to video content delivery.
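The per-scene idea can be sketched as a simple selection over per-scene rate-quality measurements: for each period, pick the resolution-bitrate pair that maximizes quality under the target bitrate. The numbers below are illustrative; they merely show that the best resolution depends on the content complexity of the scene.

def pick_representation(rd_points, target_bps):
    """rd_points: (height, bitrate_bps, psnr_db) tuples measured for one
    scene; return the feasible point with the highest quality."""
    feasible = [p for p in rd_points if p[1] <= target_bps]
    return max(feasible, key=lambda p: p[2])

# A high-motion scene: at 4 Mbit/s, 1080p beats 2160p for this content,
# so the per-scene ladder picks 1080p for this period.
scene_rd = [(2160, 4_000_000, 38.1), (1080, 4_000_000, 39.4),
            (720, 4_000_000, 36.0)]
print(pick_representation(scene_rd, target_bps=4_000_000))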

Keywords: Adaptive Streaming, Video-on-Demand, Per-Scene Encoding, Media Presentation Description

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.