Thursday, June 10, 2021

IEEE VCIP 2021 Special Sessions

http://www.vcip2021.org/

Submission of Papers for Regular, Demo, and Special Sessions (extended): June 27, 2021
Paper Acceptance Notification: August 30, 2021

Title: Learning-based Image and Video Coding

Organizers: João Ascenso (Instituto Superior Técnico), Elena Alshina (Huawei)

Description: Image and video coding algorithms create compact representations of an image by exploiting its spatial redundancy and perceptual irrelevance, i.e., by taking advantage of the characteristics of the human visual system. Recently, data-driven algorithms such as neural networks have attracted considerable attention and become a popular area of research and development. This interest is driven by several factors: recent advances in processing power (cheap and powerful hardware), the availability of large data sets (big data), and several algorithmic and architectural advances (e.g., generative adversarial networks).

Nowadays, neural networks define the state of the art for several computer vision tasks, both those requiring a high-level understanding of image semantics (e.g., image classification, object segmentation, and saliency detection) and low-level image processing tasks such as image denoising, inpainting, and super-resolution. These advances have led to increased interest in applying deep neural networks to image and video coding, which is now the main focus of the JPEG AI and the JVET NN activities within the JPEG and MPEG standardization committees.

The aim of these novel image and video coding solutions is to design a compact representation model that is obtained (learned) from a large amount of visual data and can efficiently represent the wide variety of images and videos consumed today. Some of the available learning-based image coding solutions already show very promising experimental results in terms of rate-distortion (RD) performance, notably in comparison with conventional standard image codecs (especially HEVC Intra and VVC Intra), which code the image data with hand-crafted transform, quantization, and entropy coding schemes.

This special session on Learning-based Image and Video Coding gathers technical contributions that demonstrate efficient coding of image and video content with learning-based approaches. The topic has attracted many contributions in recent years and is considered critical for the future of both image and video coding, both for solutions adopting end-to-end training and for solutions where learning-based tools replace conventional ones.
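To make the contrast with hand-crafted coding tools concrete, below is a minimal sketch of the common structure shared by end-to-end learned image codecs: an analysis transform, a differentiable quantization proxy, and a synthesis transform trained under a rate-distortion objective. This is an illustration in PyTorch with our own layer sizes, noise-based quantization proxy, and loss weight; it is not any specific published model.

```python
# Minimal sketch of an end-to-end learned image codec (illustrative:
# layer sizes, the uniform-noise quantization proxy, and the loss
# weight are our own placeholder choices).
import torch
import torch.nn as nn

class TinyLearnedCodec(nn.Module):
    def __init__(self, ch=128):
        super().__init__()
        # Analysis transform: replaces the hand-crafted transform (e.g., DCT).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2),
        )
        # Synthesis transform: maps quantized latents back to pixels.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 5, stride=2, padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):
        y = self.encoder(x)
        # Quantization: additive uniform noise during training keeps the
        # rounding step differentiable; at inference, round to integers.
        y_hat = y + torch.rand_like(y) - 0.5 if self.training else torch.round(y)
        return self.decoder(y_hat), y_hat

model = TinyLearnedCodec()
x = torch.rand(1, 3, 64, 64)                     # dummy image batch
x_hat, y_hat = model(x)
# Rate-distortion objective: a crude rate proxy (latent magnitude; real
# codecs use a learned entropy model) balanced against distortion.
loss = 0.01 * y_hat.abs().mean() + nn.functional.mse_loss(x_hat, x)
loss.backward()
```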

Monday, June 7, 2021

IEEE TNSM: OSCAR: On Optimizing Resource Utilization in Live Video Streaming

OSCAR: On Optimizing Resource Utilization in Live Video Streaming

[PDF]
DOI: 10.1109/TNSM.2021.3051950

Alireza Erfanian, Farzad Tashtarian, Anatoliy Zabrovskiy, Christian Timmerer, Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Live video streaming traffic and related applications have experienced significant growth in recent years. However, this growth has been accompanied by challenging issues, especially in terms of resource utilization. Although IP multicasting is an efficient mechanism to cope with these challenges, it suffers from many practical problems. Software-defined networking (SDN) and network function virtualization (NFV) enable researchers to address IP multicasting issues in novel ways. In this paper, by leveraging the SDN concept, we introduce OSCAR (Optimizing reSourCe utilizAtion in live video stReaming) as a new cost-aware video streaming approach to provide advanced video coding (AVC)-based live streaming services in the network. OSCAR uses two types of virtualized network functions (VNFs): the virtual reverse proxy (VRP) and the virtual transcoder function (VTF). At the edge of the network, VRPs are responsible for collecting clients’ requests and sending them to an SDN controller. Then, by executing a mixed-integer linear program (MILP), the SDN controller determines a group of optimal multicast trees for streaming the requested videos from an appropriate origin server to the VRPs. Moreover, to increase the efficiency of resource allocation and meet a given end-to-end latency threshold, OSCAR delivers only the highest requested quality from the origin server to an optimal group of VTFs over a multicast tree. The selected VTFs then transcode the received video segments and transmit them to the requesting VRPs in a multicast fashion. To mitigate the time complexity of the proposed MILP model, we present a simple and efficient heuristic algorithm that determines a near-optimal solution in polynomial time. Using the MiniNet emulator, we evaluate the performance of OSCAR in various scenarios. The results show that OSCAR surpasses other SVC- and AVC-based multicast and unicast approaches in terms of cost and resource utilization.
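As an illustration of the kind of optimization the SDN controller solves, the toy program below uses PuLP to pick a cost-minimal set of VTF placements serving a set of VRPs. It is a deliberately simplified stand-in: the node set, cost values, and constraints are hypothetical, and the paper's actual MILP additionally models multicast trees, link capacities, and the end-to-end latency threshold.

```python
# Toy MILP in the spirit of OSCAR's resource-allocation step (hedged:
# nodes, costs, and constraints are hypothetical placeholders, not the
# formulation from the paper).
import pulp

nodes = ["n1", "n2", "n3"]          # candidate locations for transcoders (VTFs)
vrps = ["vrp1", "vrp2"]             # edge proxies that must be served
# Hypothetical costs: placing a VTF at a node / streaming node -> VRP.
place_cost = {"n1": 4, "n2": 3, "n3": 5}
link_cost = {("n1", "vrp1"): 1, ("n2", "vrp1"): 2, ("n3", "vrp1"): 3,
             ("n1", "vrp2"): 3, ("n2", "vrp2"): 1, ("n3", "vrp2"): 2}

prob = pulp.LpProblem("vtf_placement", pulp.LpMinimize)
place = pulp.LpVariable.dicts("place", nodes, cat="Binary")
serve = pulp.LpVariable.dicts("serve", link_cost.keys(), cat="Binary")

# Objective: total placement cost plus delivery cost.
prob += (pulp.lpSum(place_cost[n] * place[n] for n in nodes)
         + pulp.lpSum(link_cost[e] * serve[e] for e in link_cost))

for v in vrps:
    # Every VRP must be served by exactly one VTF ...
    prob += pulp.lpSum(serve[(n, v)] for n in nodes) == 1
    for n in nodes:
        # ... and only by a node where a VTF is actually placed.
        prob += serve[(n, v)] <= place[n]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("VTF nodes:", [n for n in nodes if place[n].value() == 1])
```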

Keywords: Dynamic Adaptive Streaming over HTTP (DASH), Live Video Streaming, Software Defined Networking (SDN), Video Transcoding, Network Function Virtualization (NFV).

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Wednesday, June 2, 2021

ISM’20: Dynamic Segment Repackaging at the Edge for HTTP Adaptive Streaming

Dynamic Segment Repackaging at the Edge for HTTP Adaptive Streaming

IEEE International Symposium on Multimedia (ISM)
2-4 December 2020, Naples, Italy

Jesús Aguilar Armijo, Babak Taraghi, Christian Timmerer, and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Adaptive video streaming systems typically support different media delivery formats, e.g., MPEG-DASH and HLS, replicating the same content multiple times into the network. Such a diversified system results in inefficient use of storage, caching, and bandwidth resources. The Common Media Application Format (CMAF) emerged to simplify HTTP Adaptive Streaming (HAS), providing a single encoding and packaging format for segmented media content and offering the opportunity of bandwidth savings, more cache hits, and lower storage requirements. However, CMAF is not yet supported by most devices. To solve this issue, we present a solution in which we maintain the main advantages of CMAF while supporting heterogeneous devices using different media delivery formats. For that purpose, we propose to dynamically convert the content from CMAF to the desired media delivery format at an edge node. We study the bandwidth savings of our proposed approach using an analytical model and simulation, resulting in bandwidth savings of up to 20% for different media delivery format distributions. We analyze the runtime impact of the required operations on the segmented content in two scenarios: the classic one, with four different media delivery formats, and the proposed scenario, using CMAF-only delivery through the network. We compare both scenarios under different edge compute power assumptions. Finally, we perform experiments in a real video streaming testbed delivering CMAF content over MPEG-DASH to serve a DASH and an HLS client, performing the media conversion for the latter.
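A back-of-envelope version of the storage/traffic argument fits in a few lines: if the edge caches one copy per delivery format, origin-to-edge traffic scales with the number of formats in use, whereas CMAF-only delivery needs a single copy per segment. The snippet below is our own illustrative calculation with made-up shares and segment sizes, not the analytical model from the paper, which accounts for realistic format distributions and yields the reported savings of up to 20%.

```python
# Back-of-envelope cache model for the CMAF-at-the-edge idea (hedged:
# an illustrative calculation only; shares and segment counts are
# made up and the paper's analytical model is more detailed).
def origin_traffic(format_shares, segments, seg_size_mb, cmaf=False):
    """Origin->edge traffic, assuming each distinct cached copy of a
    segment is fetched from the origin exactly once."""
    if cmaf:
        # One CMAF copy per segment serves every client after repackaging.
        copies = 1
    else:
        # One copy per delivery format that at least one client requests.
        copies = sum(1 for share in format_shares.values() if share > 0)
    return copies * segments * seg_size_mb

shares = {"DASH": 0.5, "HLS": 0.3, "Smooth": 0.1, "HDS": 0.1}
classic = origin_traffic(shares, segments=1000, seg_size_mb=2.0)
cmaf = origin_traffic(shares, segments=1000, seg_size_mb=2.0, cmaf=True)
print(f"classic: {classic:.0f} MB, CMAF-only: {cmaf:.0f} MB "
      f"({100 * (1 - cmaf / classic):.0f}% less origin traffic)")
```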

Keywords: CMAF, Edge Computing, HTTP Adaptive Streaming (HAS)

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Thursday, May 27, 2021

IEEE Communications Magazine: From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom

From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom

Teaser: “Help me, Obi-Wan Kenobi. You’re my only hope,” said the hologram of Princess Leia in Star Wars: Episode IV – A New Hope (1977). This was the first time in cinematic history that the concept of holographic-type communication was illustrated. Almost five decades later, technological advancements are quickly moving this type of communication from science fiction to reality.

IEEE Communications Magazine

[PDF]

Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Tim Wauters (Ghent University), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), and Raimund Schatz (AIT Austrian Institute of Technology)

Abstract: Technological improvements are rapidly advancing holographic-type content distribution. Significant research efforts have been made to meet the low-latency and high-bandwidth requirements of interactive applications such as remote surgery and virtual reality. Recent research has made six degrees of freedom (6DoF) possible for immersive media, allowing users to both move their heads and change their position within a scene. In this article, we present the status and challenges of 6DoF applications based on volumetric media, focusing on the key aspects required to deliver such services. Furthermore, we present results from a subjective study to highlight relevant directions for future research.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Saturday, May 22, 2021

VCIP’20: FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Learning

FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Learning

IEEE International Conference on Visual Communications and Image Processing (VCIP)
1-4 December 2020, Macau

Ekrem Çetinkaya, Hadi Amirpour, Christian Timmerer, and Mohammad Ghanbari
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we propose to use convolutional neural networks (CNNs) to speed up the encoding of multiple representations, with a specific focus on parallel encoding. In parallel encoding, the overall time complexity is bounded by the highest time complexity among the representations encoded in parallel. Therefore, instead of reducing the time complexity of all representations, the highest time complexities are reduced. Experimental results show that FaME-ML achieves significant time-complexity savings in parallel encoding scenarios (41% on average) with only a slight increase in bitrate and quality degradation compared to the HEVC reference software.
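The sketch below illustrates the parallel-encoding structure described in the abstract: a single CNN pass predicts per-CTU depth caps that all representations reuse, so the slowest (most complex) encode is the one that gets cheaper. Both predict_ctu_depths and encode_representation are hypothetical stubs standing in for the trained network and an HEVC encoder; only the control flow is the point here.

```python
# Sketch of the parallel multirate-encoding idea behind FaME-ML
# (hedged: predict_ctu_depths and encode_representation are
# hypothetical stand-ins for a trained CNN and an HEVC encoder).
from concurrent.futures import ThreadPoolExecutor

def predict_ctu_depths(frame):
    # Stand-in for the CNN: one max split depth (0-3) per 64x64 CTU.
    return [[2 for _ in range(len(frame[0]) // 64)]
            for _ in range(len(frame) // 64)]

def encode_representation(frame, bitrate, depth_cap):
    # Stand-in for an HEVC encode whose CU partition search is capped
    # per CTU; capping is what shortens the slowest parallel encode.
    return f"{bitrate} kbps encoded with capped depths {depth_cap[0][:3]}..."

frame = [[0] * 1920 for _ in range(1088)]          # dummy luma plane
depth_cap = predict_ctu_depths(frame)              # one CNN pass, reused by all
bitrates = [400, 1500, 4500, 8000]
with ThreadPoolExecutor() as pool:                 # representations in parallel
    results = list(pool.map(
        lambda b: encode_representation(frame, b, depth_cap), bitrates))
print(results)
```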

Keywords: Video Coding, Convolutional Neural Networks, HEVC, HTTP Adaptive Streaming (HAS)

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

ACM Multimedia’20: Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks

Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks

ACM International Conference on Multimedia 2020, Seattle, United States.
https://2020.acmmm.org

[PDF][Slides][Video]

Negin Ghamsarian (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Mario Taschwer (Alpen-Adria-Universität Klagenfurt), and Klaus Schöffmann (Alpen-Adria-Universität Klagenfurt)

Abstract: Recorded cataract surgery videos play a prominent role in training, investigating the surgery, and enhancing surgical outcomes. Due to storage limitations in hospitals, however, recorded cataract surgeries are deleted after a short time, and this precious source of information cannot be fully utilized. Lowering the quality to reduce the required storage space is not advisable, since the degraded visual quality results in the loss of relevant information and limits the usage of these videos. To address this problem, we propose a relevance-based compression technique consisting of two modules: (i) relevance detection, which uses neural networks for semantic segmentation and classification of the videos to detect relevant spatio-temporal information, and (ii) content-adaptive compression, which restricts the amount of distortion applied to the relevant content while allocating less bitrate to irrelevant content. The proposed relevance-based compression framework is implemented for five scenarios based on how the target audience defines relevant information. Experimental results demonstrate the capability of the proposed approach in relevance detection. We further show that the proposed approach achieves high compression efficiency by removing substantial redundant information while retaining the high quality of the relevant content.
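The content-adaptive compression module can be pictured as turning a relevance mask into per-block quantization offsets: relevant blocks get a negative QP delta (more bits), irrelevant ones a positive delta (fewer bits). The snippet below is an illustrative sketch of that mapping only; the block size and offset values are our own placeholders, and in the paper the mask comes from trained segmentation and classification networks.

```python
# Sketch of content-adaptive bit allocation (hedged: mask source,
# block size, and QP offsets are illustrative placeholders).
import numpy as np

def qp_offset_map(relevance_mask, block=64, relevant_off=-4, irrelevant_off=8):
    """Map a binary relevance mask (1 = relevant pixel) to per-block
    QP offsets: spend bits on relevant regions, save on the rest."""
    h, w = relevance_mask.shape
    rows, cols = h // block, w // block
    offsets = np.empty((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            tile = relevance_mask[r*block:(r+1)*block, c*block:(c+1)*block]
            # Majority-relevant blocks get more bits (lower QP).
            offsets[r, c] = relevant_off if tile.mean() > 0.5 else irrelevant_off
    return offsets

mask = np.zeros((1088, 1920), dtype=np.uint8)
mask[300:700, 600:1300] = 1                        # e.g., the surgical field
print(qp_offset_map(mask))                         # usable as per-CTU QP deltas
```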

Keywords: Video Coding, Convolutional Neural Networks, HEVC, ROI Detection, Medical Multimedia.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Friday, May 21, 2021

EPIQ’20: Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC Using Retransmission

Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC Using Retransmission

ACM SIGCOMM 2020 Workshop on Evolution, Performance, and Interoperability of QUIC (EPIQ 2020)
August 10–14, 2020
https://conferences.sigcomm.org/sigcomm/2020/workshop-epiq.html

[PDF][Slides][Video]

Minh Nguyen, Hadi Amirpour, Christian Timmerer, Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: HTTP/2 has been explored widely for video streaming but still suffers from Head-of-Line blocking and three-way handshake delay due to TCP. Meanwhile, QUIC, running on top of UDP, can tackle these issues. In addition, although many adaptive bitrate (ABR) algorithms have been proposed for scalable and non-scalable video streaming, the literature lacks an algorithm designed for both types of video streaming. In this paper, we investigate the impact of QUIC and HTTP/2 on the performance of ABR algorithms in terms of different metrics. Moreover, we propose an efficient approach for utilizing scalable video coding formats in adaptive video streaming that combines a traditional video streaming approach (based on non-scalable video coding formats) with a retransmission technique. The experimental results show that QUIC benefits significantly from our proposed method in the context of packet loss and retransmission. Compared to HTTP/2, it improves the average video quality and also provides smoother adaptation behavior. Finally, we demonstrate that our proposed method, originally designed for non-scalable video codecs, also works efficiently for scalable videos such as Scalable High Efficiency Video Coding (SHVC).
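The retransmission idea can be summarized in a few lines of control logic: download the segment at the ABR-selected quality first, and if the playout deadline leaves headroom, fetch an SHVC enhancement layer to upgrade the segment already in the buffer. The sketch below is a hypothetical illustration; fetch() and the timing threshold stand in for the actual HTTP/2- or QUIC-based transfer logic evaluated in the paper.

```python
# Sketch of the retransmission idea for SHVC streaming (hedged:
# fetch() and the 0.5 s headroom threshold are hypothetical stand-ins
# for the paper's transport and timing model).
import time

def download_segment(fetch, seg_idx, quality, deadline_s):
    """Fetch the segment at the ABR-selected quality; if time remains
    before the playout deadline, retransmit an SHVC enhancement layer
    to upgrade the already-buffered segment instead of leaving it as is."""
    start = time.monotonic()
    data = fetch(seg_idx, layer="base", quality=quality)
    remaining = deadline_s - (time.monotonic() - start)
    if remaining > 0.5:   # enough headroom: upgrade with an enhancement layer
        data += fetch(seg_idx, layer="enhancement", quality=quality + 1)
    return data

# Hypothetical fetch stub standing in for an HTTP/3 (QUIC) request.
fake = lambda idx, layer, quality: f"[seg{idx}/{layer}/q{quality}]"
print(download_segment(fake, seg_idx=7, quality=2, deadline_s=2.0))
```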

Keywords: QUIC, H2BR, HTTP adaptive streaming, Retransmission, SHVC

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.