Thursday, May 19, 2022

Live-PSTR: Live Per-title Encoding for Ultra HD Adaptive Streaming


2022 NAB Broadcast Engineering and Information Technology (BEIT) Conference

April 24-26, 2022 | Las Vegas, US

Conference Website


Vignesh V Menon (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt),  Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Christian Feldmann (Bitmovin, Klagenfurt)Adithyan Ilangovan (Bitmovin, Klagenfurt), Martin Smole (Bitmovin, Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: Current per-title encoding schemes encode the same video content at various bitrates and spatial resolutions to find optimal bitrate-resolution pairs (known as bitrate ladder) for each video content in Video on Demand (VoD) applications. But in live streaming applications, a fixed bitrate ladder is used for simplicity and efficiency to avoid the additional latency to find the optimized bitrate-resolution pairs for every video content. However, an optimized bitrate ladder may result in (i) decreased storage or network resources or/and (ii) increased Quality of Experience (QoE). In this paper, a fast and efficient per-title encoding scheme (Live-PSTR) is proposed tailor-made for live Ultra High Definition (UHD) High Framerate (HFR) streaming. It includes a pre-processing step in which Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features are used to determine the complexity of each video segment, based on which the optimized encoding resolution and framerate for streaming at every target bitrate is determined. Experimental results show that, on average, Live-PSTR yields bitrate savings of 9.46% and 11.99% to maintain the same PSNR and VMAF scores, respectively compared to the HTTP Live Streaming (HLS) bitrate ladder.

Tuesday, April 26, 2022

INCEPT: INTRA CU Depth Prediction for HEVC

 IEEE 23rd International Workshop on Multimedia Signal Processing

October 06–08, 2021, Tampere, Finland

Vignesh V Menon, Hadi Amirpour, Christian Timmerer, and Mohammad Ghanbari
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt


High Efficiency Video Coding (HEVC) improves the encoding efficiency by utilizing sophisticated tools such as flexible Coding Tree Unit (CTU) partitioning. The Coding Unit (CU) can be split recursively into four equally sized CUs ranging from 64x64 to 8x8 pixels. At each depth level (or CU size), intra prediction via exhaustive mode search was exploited in HEVC to improve the encoding efficiency and result in a very high encoding time complexity. This paper proposes an Intra CU Depth Prediction (INCEPT) algorithm, which limits Rate-Distortion Optimization (RDO) for each CTU in HEVC by utilizing the spatial correlation with the neighboring CTUs, which is computed using a DCT energy-based feature. Thus, INCEPT reduces the number of candidate CU sizes required to be considered for each CTU in HEVC intra coding. Experimental results show that the INCEPT algorithm achieves a better trade-off between the encoding efficiency and encoding time saving (i.e., BDR/∆T) than the benchmark algorithms. While BDR/∆T is 12.35% and 9.03% for the benchmark algorithms, it is 5.49% for the proposed algorithm. As a result, INCEPT achieves a 23.34% reduction in encoding time on average while incurring only a 1.67% increase in bit rate than the original coding in the x265 HEVC open-source encoder.

Keywords: HEVC, Intra coding, CTU, CU, depth decision


Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA:

Friday, April 22, 2022

Super-resolution Based Bitrate Adaptation for HTTP Adaptive Streaming for Mobile Devices

ACM Mile-High Video Conference 2022 (MHV)

March 01-03, 2022 | Denver, CO, USA

Minh NguyenEkrem Çetinkaya, Hermann Hellwagner, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: The advancement of mobile hardware in recent years made it possible to apply deep neural network (DNN) based approaches on mobile devices. This paper introduces a lightweight super-resolution (SR) network, namely SR-ABR Net, deployed at mobile devices to upgrade low-resolution/low-quality videos and a novel adaptive bitrate (ABR) algorithm, namely WISH-SR, that leverages SR networks at the client to improve the video quality depending on the client’s context. WISH-SR takes into account mobile device properties, video characteristics, and user preferences. Experimental results show that the proposed SR-ABR Net can improve the video quality compared to traditional SR approaches while running in real-time. Moreover, the proposed WISH-SR can significantly boost the visual quality of the delivered content while reducing both bandwidth consumption and the number of stalling events.

Keywords: Super-resolution, Deep Neural Networks, Mobile Devices, ABR

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA:

Tuesday, April 19, 2022

Take the Red Pill for H3 and See How Deep the Rabbit Hole Goes

ACM Mile-High video Conference 2022 (MHV)

March 01-03, 2022 | Denver, CO, USA

Conference Website


Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Stefan Pham (Fraunhofer FOKUS, Germany), Daniel Silhavy (Fraunhofer FOKUS, Germany), Ali C. Begen (Ozyegin University, Turkey)

Abstract: With the introduction of HTTP/3 (H3) and QUIC at its core, there is an expectation of significant improvements in Web-based secure object delivery. As HTTP is a central protocol to the current adaptive streaming methods in all major over-the-top (OTT) services, an important question is what H3 will bring to the table for such services. To answer this question, we present the new features of H3 and QUIC, and compare them to those of H/1.1/2 and TCP. We also share the latest research findings in this domain.

Keywords: HTTP adaptive streaming, QUIC, CDN, ABR, OTT, DASH, HLS.

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA:

Friday, April 15, 2022

QoCoVi: QoE- and Cost-Aware Adaptive Video Streaming for the Internet of Vehicles

Elsevier Computer Communications journal 


Alireza Erfanian, Farzad Tashtarian, Christian Timmerer, and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Recent advances in embedded systems and communication technologies enable novel, non-safety applications in Vehicular Ad Hoc Networks (VANETs). Video streaming has become a popular core service for such applications. In this paper, we present QoCoVi as a QoE- and cost-aware adaptive video streaming approach for the Internet of Vehicles (IoV) to deliver video segments requested by mobile users at specified qualities and deadlines. Considering a multitude of transmission data sources with different capacities and costs, the goal of QoCoVi is to serve the desired video qualities with minimum costs. By applying Dynamic Adaptive Streaming over HTTP (DASH) principles, QoCoVi considers cached video segments on vehicles equipped with storage capacity as the lowest-cost sources for serving requests.

We design QoCoVi in two SDN-based operational modes: (i) centralized and (ii) distributed. In centralized mode, we can obtain a suitable solution by introducing a mixed-integer linear programming (MILP) optimization model that can be executed on the SDN controller. However, to cope with the computational overhead of the centralized approach in real IoV scenarios, we propose a fully distributed version of QoCoVi based on the proximal Jacobi alternating direction method of multipliers (ProxJ-ADMM) technique. The effectiveness of the proposed approach is confirmed through emulation with Mininet-WiFi in different scenarios.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA:

Wednesday, March 9, 2022

Video Complexity Analyzer (VCA) v1.0

Release of Video Complexity Analyzer (VCA) version 1.0 open-source software.

The primary objective of VCA is to become the best spatial (E) and temporal (h) complexity predictor for every frame/video segment/video sequence, which aids in predicting encoding parameters for applications like scene-cut detection and online per-title encoding. VCA leverages x86 SIMD and multi-threading optimizations for effective performance. While VCA is primarily designed as a video complexity analyzer library, a command-line executable is provided to facilitate testing and development. We expect VCA to be utilized in many leading video encoding solutions in the coming years.

VCA is available as an open-source library, published under the GPLv3 license. For more details, please visit the online software documentation here. The source code can be found here.

A heatmap of the spatial (E) and temporal (h) complexity is shown below.

Heatmap of spatial complexity (E)

Heatmap of temporal complexity (h)










A performance comparison (fps) of VCA (with different levels of threading enabled) compared to Spatial Information/Temporal Information (SITI) is shown below.

Further information about possible VCA applications can be found at, e.g. (list to-be-continued;),

Sunday, February 27, 2022

MPEG news: a report from the 137th meeting

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will also be posted at ACM SIGMM Records.
MPEG News Archive

The 137th MPEG meeting was once again held as an online meeting, and the official press release can be found here and comprises the following items:
  • MPEG Systems Wins Two More Technology & Engineering Emmy® Awards
  • MPEG Audio Coding selects 6DoF Technology for MPEG-I Immersive Audio
  • MPEG Requirements issues Call for Proposals for Encoder and Packager Synchronization
  • MPEG Systems promotes MPEG-I Scene Description to the Final Stage
  • MPEG Systems promotes Smart Contracts for Media to the Final Stage
  • MPEG Systems further enhanced the ISOBMFF Standard
  • MPEG Video Coding completes Conformance and Reference Software for LCEVC
  • MPEG Video Coding issues Committee Draft of Conformance and Reference Software for MPEG Immersive Video
  • JVET produces Second Editions of VVC & VSEI and finalizes VVC Reference Software
  • JVET promotes Tenth Edition of AVC to Final Draft International Standard
  • JVET extends HEVC for High-Capability Applications up to 16K and Beyond
  • MPEG Genomic Coding evaluated Responses on New Advanced Genomics Features and Technologies
  • MPEG White Papers
    • Neural Network Coding (NNC)
    • Low Complexity Enhancement Video Coding (LCEVC)
    • MPEG Immersive video
In this column, I’d like to focus on the Emmy® Awards, video coding updates (AVC, HEVC, VVC, and beyond), and a brief update about DASH (as usual).

MPEG Systems Wins Two More Technology & Engineering Emmy® Awards

MPEG Systems is pleased to report that MPEG is being recognized this year by the National Academy for Television Arts and Sciences (NATAS) with two Technology & Engineering Emmy® Awards, for (i) “standardization of font technology for custom downloadable fonts and typography for Web and TV devices and for (ii) “standardization of HTTP encapsulated protocols”, respectively.

The first of these Emmys is related to MPEG’s Open Font Format (ISO/IEC 14496-22) and the second of these Emmys is related to MPEG Dynamic Adaptive Streaming over HTTP (i.e., MPEG DASH, ISO/IEC 23009). The MPEG DASH standard is the only commercially deployed international standard technology for media streaming over HTTP and it is widely used in many products. MPEG developed the first edition of the DASH standard in 2012 in collaboration with 3GPP and since then has produced four more editions amending the core specification by adding new features and extended functionality. Furthermore, MPEG has developed six other standards as additional “parts” of ISO/IEC 23009 enabling the effective use of the MPEG DASH standards with reference software and conformance testing tools, guidelines, and enhancements for additional deployment scenarios. MPEG DASH has dramatically changed the streaming industry by providing a standard that is widely adopted by various consortia such as 3GPP, ATSC, DVB, and HbbTV, and across different sectors. The success of this standard is due to its technical excellence, large participation of the industry in its development, addressing the market needs, and working with all sectors of industry all under ISO/IEC JTC 1/SC 29 MPEG Systems’ standard development practices and leadership.

These are MPEG’s fifth and sixth Technology & Engineering Emmy® Awards (after MPEG-1 and MPEG-2 together with JPEG in 1996, Advanced Video Coding (AVC) in 2008, MPEG-2 Transport Stream in 2013, and ISO Base Media File Format in 2021) and MPEG’s seventh and eighth overall Emmy® Awards (including the Primetime Engineering Emmy® Awards for Advanced Video Coding (AVC) High Profile in 2008 and High-Efficiency Video Coding (HEVC) in 2017).

I have been actively contributing to the MPEG DASH standard since its inception. My initial blog post dates back to 2010 and the first edition of MPEG DASH was published in 2012. A more detailed MPEG DASH timeline provides many pointers to the Institute of Information Technology (ITEC) at the Alpen-Adria-Universität Klagenfurt and its DASH activities that is now continued within the Christian Doppler Laboratory ATHENA. In the end, the MPEG DASH community of contributors to and users of the standards can be very proud of this achievement only after 10 years of the first edition being published. Thus, also happy 10th birthday MPEG DASH and what a nice birthday gift.

Video Coding Updates

In terms of video coding, there have been many updates across various standards’ projects at the 137th MPEG Meeting.

Advanced Video Coding

Starting with Advanced Video Coding (AVC), the 10th edition of Advanced Video Coding (AVC, ISO/IEC 14496-10 | ITU-T H.264) has been promoted to Final Draft International Standard (FDIS) which is the final stage of the standardization process. Beyond various text improvements, this specifies a new SEI message for describing the shutter interval applied during video capture. This can be variable in video cameras, and conveying this information can be valuable for analysis and post-processing of the decoded video.

High-Efficiency Video Coding

The High-Efficiency Video Coding (HEVC, ISO/IEC 23008-2 | ITU-T H.265) standard has been extended to support high-capability applications. It defines new levels and tiers providing support for very high bit rates and video resolutions up to 16K, as well as defining an unconstrained level. This will enable the usage of HEVC in new application domains, including professional, scientific, and medical video sectors.

Versatile Video Coding

The second editions of Versatile Video Coding (VVC, ISO/IEC 23090-3 | ITU-T H.266) and Versatile supplemental enhancement information messages for coded video bitstreams (VSEI, ISO/IEC 23002-7 | ITU-T H.274) have reached FDIS status. The new VVC version defines profiles and levels supporting larger bit depths (up to 16 bits), including some low-level coding tool modifications to obtain improved compression efficiency with high bit-depth video at high bit rates. VSEI version 2 adds SEI messages giving additional support for scalability, multi-view, display adaptation, improved stream access, and other use cases. Furthermore, a Committee Draft Amendment (CDAM) for the next amendment of VVC was issued to begin the formal approval process to enable linking VVC with the Green Metadata (ISO/IEC 23001-11) and Video Decoding Interface (ISO/IEC 23090-13) standards and add a new unconstrained level for exceptionally high capability applications such as certain uses in professional, scientific, and medical application scenarios. Finally, the reference software package for VVC (ISO/IEC 23090-16) was also completed with its achievement of FDIS status. Reference software is extremely helpful for developers of VVC devices, helping them in testing their implementations for conformance to the video coding specification.

Beyond VVC

The activities in terms of video coding beyond VVC capabilities, the Enhanced Compression Model (ECM 3.1) performance over VTM-11.0 + JVET-V0056 (i.e., VVC reference software) shows an improvement of close to 15% for Random Access Main 10. This is indeed encouraging and, in general, these activities are currently managed within two exploration experiments (EEs). The first is on neural network-based (NN) video coding technology (EE1) and the second is on enhanced compression beyond VVC capability (EE2). EE1 currently plans to further investigate (i) enhancement filters (loop and post) and (ii) super-resolution (JVET-Y2023). It will further investigate selected NN technologies on top of ECM 4 and the implementation of selected NN technologies in the software library, for platform-independent cross-checking and integerization. Enhanced Compression Model 4 (ECM 4) comprises new elements on MRL for intra, various GPM/affine/MV-coding improvements including TM, adaptive intra MTS, coefficient sign prediction, CCSAO improvements, bug fixes, and encoder improvements (JVET-Y2025). EE2 will investigate intra prediction improvements, inter prediction improvements, improved screen content tools, and improved entropy coding (JVET-Y2024).

Research aspects: video coding performance is usually assessed in terms of compression efficiency or/and encoding runtime (time complexity). Another aspect is related to visual quality, its assessment, and metrics, specifically for neural network-based video coding technologies.

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 137th MPEG meeting, MPEG Systems issued a draft amendment to the core MPEG-DASH specification (i.e., ISO/IEC 23009-1) about Extended Dependent Random Access Point (EDRAP) streaming and other extensions which it will be further discussed during the Ad-hoc Group (AhG) period (please join the dash email list for further details/announcements). Furthermore, Defects under Investigation (DuI) and Technologies under Consideration (TuC) are available here.

An updated overview of DASH standards/features can be found in the Figure below.

Research aspects: in the Christian Doppler Laboratory ATHENA we aim to research and develop novel paradigms, approaches, (prototype) tools and evaluation results for the phases (i) multimedia content provisioning (i.e., video coding), (ii) content delivery (i.e., video networking), and (iii) content consumption (i.e., video player incl. ABR and QoE) in the media delivery chain as well as for (iv) end-to-end aspects, with a focus on, but not being limited to, HTTP Adaptive Streaming (HAS).

The 138th MPEG meeting will be again an online meeting in July 2022. Click here for more information about MPEG meetings and their developments.