Sunday, December 23, 2018

What happened in multimedia communication in 2018?

In January 2018 I wrote a blog post entitled "What to care about in multimedia communication in 2018?" and I think it's worth looking back to see what actually happened with respect to next generation video coding formats and adaptive streaming techniques.

In April 2018, the responses to the call for proposals for the next standard in video compression were evaluated, and a first working draft and test model for the Versatile Video Coding (VVC) standard were approved. Already at this point, some proposals demonstrated compression efficiency gains of typically 40% or more compared to HEVC. Currently, working draft 3 and test model 3 of VVC (VTM 3) are available and we may certainly expect compression efficiency gains well beyond the targeted 50% for the final standard. An overview of VVC can be found here (by C. Feldmann) and here (by M. Wien). The licensing issues have been acknowledged and, thus, the Media Coding Industry Forum (MC-IF) has been established.

At the beginning of 2018, everyone was also very curious about AOMedia and AV1. Version 1 of the specification has finally become available and, in the meantime, it has been implemented/deployed on both the content provisioning/encoding side (e.g., Bitmovin) and the content consumption/decoding side (e.g., Chrome, Firefox). In this context, we also published a multi-codec DASH dataset comprising AVC, HEVC, VP9, and AV1 (VVC will be added at a later stage). In general, however, we are entering the era of multiple video codecs deployed in products and services, a trend also confirmed by Bitmovin's latest video developer survey.

MPEG-DASH 3rd edition has been approved and is awaiting publication, which I expect to happen in 2019. An overview of the MPEG-DASH status is shown in the figure below.
In this context, the DASH-IF produced various vital assets such as interoperability guidelines (latest v4.3, content protection, ATSC 3.0, SAND), test vectors, conformance tools, and a reference client. For informative aspects of MPEG-DASH, such as bitrate adaptation schemes, the interested reader is referred to our survey. This survey gives an overview of existing techniques (see figure below) and also outlines future research directions. It is available open access, free for everyone.
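The bitrate adaptation schemes surveyed there can be illustrated with a simple throughput-based heuristic; the bitrate ladder and the safety margin below are illustrative assumptions of mine, not values from any particular DASH player or from the survey.

```python
# Minimal throughput-based bitrate adaptation sketch (illustrative only).
# The representation ladder and the 0.8 safety margin are assumptions,
# not values from any particular DASH player.

LADDER_KBPS = [400, 1000, 2500, 5000, 8000]  # hypothetical bitrate ladder
SAFETY = 0.8  # spend only 80% of the measured throughput

def select_representation(throughput_kbps, ladder=LADDER_KBPS, safety=SAFETY):
    """Pick the highest bitrate not exceeding the discounted throughput."""
    budget = throughput_kbps * safety
    eligible = [r for r in ladder if r <= budget]
    return eligible[-1] if eligible else ladder[0]

# Example: a smoothed throughput estimate of 3.2 Mbit/s selects 2500 kbit/s.
print(select_representation(3200))  # → 2500
```

Real players refine this basic rule with buffer-level feedback, throughput smoothing, and stability constraints, which is exactly the design space the survey maps out.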

Finally, I mentioned a couple of scientific events in 2018 including QoMEX, MMSys (NOSSDAV, PV), ICME, ICIP, PCS, and MIPR. I attended all of them (except PCS), each showing advances in its respective field. These events are probably worth attending in 2019 as well, and I will certainly blog about this early next year. However, I'd like to hear your opinion on what happened in 2018 and what we may expect in 2019...

Friday, December 14, 2018

Christian Doppler Research Association approves ATHENA project proposal

ATHENA stands for Adaptive Streaming over HTTP and Emerging Networked Multimedia Services and has been jointly proposed by the Institute of Information Technology (ITEC) at Alpen-Adria-Universität Klagenfurt (AAU) and Bitmovin GmbH to address current and future research and deployment challenges of HTTP adaptive streaming (HAS) and emerging streaming methods.

AAU (ITEC) has been working on adaptive video streaming for more than a decade, has a proven record of successful research projects and publications in the field, and has been actively contributing to MPEG standardization for many years, including MPEG-DASH; Bitmovin is a video streaming software company founded by ITEC researchers in 2013 and has developed highly successful, global R&D and sales activities and a world-wide customer base since then. 

The aim of ATHENA is to research and develop novel paradigms, approaches, (prototype) tools and evaluation results for the areas (1) multimedia content provisioning (i.e., video coding), (2) content delivery (i.e., multimedia networking) and (3) content consumption (i.e., HAS player aspects) in the media delivery chain as well as for (4) end-to-end aspects, with a focus on, but not limited to, HTTP Adaptive Streaming (HAS). The new approaches and insights should enable Bitmovin to build innovative applications and services that account for the steadily increasing and changing multimedia traffic on the Internet.

The project has been approved by the Christian Doppler Research Association as a CD pilot laboratory -- the first such project at Alpen-Adria-Universität Klagenfurt -- with an initial duration of two years and a five-year extension upon successful review (i.e., seven years in total). Thus, stay tuned for details and, yes, I'm hiring PhD students for the areas above (a detailed job description will be published soon).

Tuesday, December 11, 2018

Future of Video Codec Licensing: Avoiding the Tragedy of the Commons

Media Coding Industry Forum (MC-IF) announces workshop on codec patent licensing:

Future of Video Codec Licensing:
Avoiding the tragedy of the commons

January 7th 2019, Sunnyvale, California
Admission is complimentary, but Registration is Required 

Are you worried about the future of media codec licensing? Would you like to find out more about ideas and initiatives to create an effective patent licensing landscape for media technologies? Are you interested in hearing about the business needs for more efficient data compression methods and how such methods can be brought to market?

Join the Media Coding Industry Forum (MC-IF) and its members at an open workshop: "Future of Video Codec Licensing: avoiding the tragedy of the commons" with a reference to the concept that individual actions in a group, while individually rational or even optimal, might result in an outcome that is far from optimal for any member, and generally undesirable for the group. How can such a situation be avoided? The workshop will engage attendees in a frank and open discussion of the needs, desires, and issues, and include speakers from major players in patent pool licensing, implementers, licensors, broadcasting, and delivery, covering the entire video compression ecosystem.

The workshop will follow an open meeting and consist of panel sessions and discussion as follows:

10am-1pm Open meeting, incl. Lunch

1-2pm Industry Needs and Opportunities for the Ecosystem
  • Ben Waggoner (Amazon)
  • Michael Robinson (AT&T)
  • Lynn Comp (Intel)
  • Jonatan Samuelsson (Divideon)
  • Mod: Jan Ozer (Streaming Learning Center) 
2:30-3:30 Roadblocks, Impediments - and Bulldozers
  • Stephan Wenger (Tencent)
  • Tom Vaughn (Beamr)
  • Stefan Lederer (Bitmovin)
  • Jeremy Rosenberg (Harmonic)
  • Mod: Shawn Ambwani (Unified Patents) 
4-5pm View of Licensors
  • Larry Horn (MPEG-LA)
  • Hasan Rashid (HEVC Advance)
  • Greg Weiss (Velos Media)
  • Robert Gray (Nokia)
  • Mod: Brian Love (Santa Clara Univ Law School)
5pm Wrap-up Discussion

5:30-6:30 Reception
The workshop is open to anyone, but we are asking the press not to attend the workshop. It will operate under the Chatham House Rule. Please note that this is primarily an ecosystem event, not a technology event. Attendance is particularly encouraged from those in the licensee-licensor relationship, and from those building businesses that use licensed media standards.

During the same day, January 7th, there will be an open meeting of MC-IF from 10AM to 1PM, including lunch, where information about the forum will be presented and participants will be able to interact with representatives from MC-IF, to ask questions and provide feedback. The workshop follows lunch (provided), from 1:00PM to 5:30PM PST. The day ends with a reception (starting at around 5:30PM).

Both events are open to everyone and free of charge; anti-trust counsel will be present at both. Seating is limited so please register as soon as possible! Pre-registration is required.

Location: CableLabs, 400 W California Ave, Sunnyvale, CA 94086

The Media Coding Industry Forum was formed in 2018 to specifically focus on non-technical aspects of media coding standard deployment with an initial focus on the Versatile Video Coding (VVC) standard that is under development in a joint effort of ISO/IEC and ITU-T.

MC-IF is pleased with its continued rapid membership growth. Apple, Ericsson, Intel, Nokia, Sony, Tencent, and many others have all joined the MC-IF to collectively search for improvements—for all parties—in the media codec licensing ecosystem. To become a part of this important effort and join MC-IF, go to the MC-IF website.

Tuesday, October 30, 2018

MPEG news: a report from the 124th meeting, Macau, China

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will be also posted at ACM SIGMM Records.

The MPEG press release comprises the following aspects:
  • Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage
  • Compressed Representation of Neural Networks - MPEG issues Call for Proposals
  • Low Complexity Video Coding Enhancements - MPEG issues Call for Proposals
  • New Video Coding Standard expected to have licensing terms timely available - MPEG issues Call for Proposals
  • Multi-Image Application Format (MIAF) promoted to Final Draft International Standard
  • 3DoF+ Draft Call for Proposal goes Public

Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage

At its 124th meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to Committee Draft (CD) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colour. By leveraging existing video codecs and video ecosystems in general (hardware acceleration, transmission services and infrastructure), as well as future video codecs, the V-PCC technology enables new applications. The current V-PCC encoder implementation provides a compression ratio of 125:1, which means that a dynamic point cloud of 1 million points could be encoded at 8 Mbit/s with good perceptual quality.
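The reported numbers can be sanity-checked with back-of-the-envelope arithmetic; the 25 fps frame rate assumed below is my own illustrative assumption, not a figure from the MPEG press release.

```python
# Back-of-the-envelope check of the reported V-PCC figures (illustrative).
# Assumption: the 8 Mbit/s figure is the compressed stream, so the implied
# raw rate is simply 8 Mbit/s * 125.

compressed_mbps = 8
ratio = 125
raw_mbps = compressed_mbps * ratio
print(raw_mbps)  # → 1000, i.e., ~1 Gbit/s of raw point cloud data

# With 1 million points per frame and an assumed 25 fps, the raw budget
# works out to 40 bits per point (e.g., geometry plus colour attributes),
# which is a plausible range for uncompressed dynamic point clouds.
points = 1_000_000
fps = 25
bits_per_point = raw_mbps * 1_000_000 / fps / points
print(bits_per_point)  # → 40.0
```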

A next step is the storage of V-PCC in ISOBMFF for which a working draft has been produced. It is expected that further details will be discussed in upcoming reports.
Research aspects: Video-based Point Cloud Compression (V-PCC) is at CD stage and a first working draft for the storage of V-PCC in ISOBMFF has been provided. Thus, a natural next step is the delivery of V-PCC content encapsulated in ISOBMFF over networks utilizing various approaches, protocols, and tools. Additionally, one may also consider different encapsulation formats if needed. I hope to see some of these aspects covered in future conferences, including -- but not limited to -- those listed at the very end of this blog post.

MPEG issues Call for Proposals on Compressed Representation of Neural Networks

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, media coding, data analytics, and many other fields. Their recent success is based on the feasibility of processing much larger and more complex neural networks (deep neural networks, DNNs) than in the past, and the availability of large-scale training data sets. Some applications require the deployment of a particular trained network instance to a potentially large number of devices and, thus, could benefit from a standard for the compressed representation of neural networks. Therefore, MPEG has issued a Call for Proposals (CfP) for compression technology for neural networks, focusing on the compression of parameters and weights for four use cases: (i) visual object classification, (ii) audio classification, (iii) visual feature extraction (as used in MPEG CDVA), and (iv) video coding.
Research aspects: As pointed out last time, research here will mainly focus on compression efficiency for both lossy and lossless scenarios. Additionally, communication aspects, such as the transmission of compressed artificial neural networks within lossy, large-scale environments including update mechanisms, may become relevant in the (near) future.
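A minimal example of the kind of lossy parameter compression the CfP targets is uniform 8-bit quantization of 32-bit float weights; the sketch below is a generic textbook technique, not any particular response to the CfP.

```python
# Uniform 8-bit quantization of neural network weights (generic sketch,
# not a technique from the MPEG CfP responses).

def quantize(weights, levels=256):
    """Map float weights to integer indices in [0, levels-1] (lossy)."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (levels - 1)
    return [round((w - lo) / step) for w in weights], lo, step

def dequantize(indices, lo, step):
    """Reconstruct approximate float weights from quantized indices."""
    return [lo + i * step for i in indices]

weights = [-0.51, -0.12, 0.0, 0.07, 0.48]  # toy weight vector
q, lo, step = quantize(weights)
rec = dequantize(q, lo, step)

# Storing 8-bit indices instead of 32-bit floats yields ~4:1 compression
# (ignoring the two floats of side information, lo and step); the price is
# a reconstruction error bounded by the quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, rec))
print(q)  # → [0, 100, 131, 149, 255]
```

Entropy coding of the indices, pruning, and lossless residual coding would push the ratio further, which is where the actual research on compression efficiency comes in.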

MPEG issues Call for Proposals on Low Complexity Video Coding Enhancements

Upon request from the industry, MPEG has identified an area of interest in which video technology deployed in the market (e.g., AVC, HEVC) can be enhanced in terms of video quality without the need to necessarily replace existing hardware. Therefore, MPEG has issued a Call for Proposals (CfP) on Low Complexity Video Coding Enhancements.

The objective is to develop video coding technology with a data stream structure defined by two component streams: a base stream decodable by a hardware decoder and an enhancement stream suitable for software processing implementation. The project is meant to be codec agnostic; in other words, the base encoder and base decoder can be AVC, HEVC, or any other codec in the market.
Research aspects: The interesting aspect here is that this use case assumes a legacy base decoder - most likely realized in hardware - which is enhanced with a software-based implementation to improve coding efficiency and/or quality without sacrificing the capabilities of the end-user device in terms of complexity and, thus, energy efficiency, thanks to the software-based solution.
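The two-stream structure can be illustrated with a toy reconstruction loop; the nearest-neighbour upsampling and additive residuals below are my own generic assumptions for illustration, not the actual technology sought by the CfP.

```python
# Toy illustration of a base + enhancement stream structure (generic sketch;
# not the actual CfP technology). The base layer is a low-resolution frame,
# as a legacy hardware decoder might produce it, and the enhancement stream
# carries full-resolution residual corrections applied in software.

def upsample_2x(frame):
    """Nearest-neighbour 2x upsampling of a 2D list of pixel values."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(2)]  # duplicate columns
        out.append(wide)
        out.append(list(wide))                     # duplicate rows (copy)
    return out

def reconstruct(base_frame, residuals):
    """Add enhancement-layer residuals to the upsampled base-layer frame."""
    up = upsample_2x(base_frame)
    return [[p + r for p, r in zip(ur, rr)] for ur, rr in zip(up, residuals)]

base = [[10, 20], [30, 40]]          # decoded base layer (low resolution)
residuals = [[0, 1, -1, 0],
             [2, 0, 0, -2],
             [0, 0, 1, 1],
             [-1, 0, 0, 0]]          # enhancement-layer corrections
print(reconstruct(base, residuals))
```

Because the base layer is just a regular decoded frame, the scheme is codec agnostic: the base could come from an AVC, HEVC, or any other decoder, exactly as the CfP requires.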

MPEG issues Call for Proposals for a New Video Coding Standard expected to have licensing terms timely available

At its 124th meeting, MPEG issued a Call for Proposals (CfP) for a new video coding standard to address combinations of both technical and application (i.e., business) requirements that may not be adequately met by existing standards. The aim is to provide a standardized video compression solution which combines coding efficiency similar to that of HEVC with a level of complexity suitable for real-time encoding/decoding and the timely availability of licensing terms.
Research aspects: This new work item is more related to business aspects (i.e., licensing terms) than technical aspects of video coding. As this blog is about technical aspects and I'm also not an expert in licensing terms, I do not comment on this any further.

Multi-Image Application Format (MIAF) promoted to Final Draft International Standard

The Multi-Image Application Format (MIAF) defines interoperability points for the creation, reading, parsing, and decoding of images embedded in the High Efficiency Image File (HEIF) format by (i) only defining additional constraints on the HEIF format, (ii) limiting the supported encoding types to a set of specific profiles and levels, (iii) requiring specific metadata formats, and (iv) defining a set of brands for signaling such constraints, including specific depth map and alpha plane formats. For instance, it addresses use cases where a capturing device uses one of the HEIF codecs with a specific HEVC profile and level in its created HEIF files, while a playback device is only capable of decoding AVC bitstreams.
Research aspects: MIAF is an application format which is defined as a combination of tools (incl. profiles and levels) of other standards (e.g., audio codecs, video codecs, systems) to address the needs of a specific application. Thus, the research is related to use cases enabled by this application format.

3DoF+ Draft Call for Proposal goes Public

Following investigations on the coding of “three Degrees of Freedom plus” (3DoF+) content in the context of MPEG-I, the MPEG video subgroup has provided evidence demonstrating the capability to encode a 3DoF+ content efficiently while maintaining compatibility with legacy HEVC hardware. As a result, MPEG decided to issue a draft Call for Proposal (CfP) to the public containing the information necessary to prepare for the final Call for Proposal expected to occur at the 125th MPEG meeting (January 2019) with responses due at the 126th MPEG meeting (March 2019).
Research aspects: This work item is about video (coding) and, thus, research is about compression efficiency.

What else happened at #MPEG124?

  • MPEG-DASH 3rd edition is still in the final editing phase and not yet available. Last time, I wrote that we expect final publication later this year or early next year and we hope this is still the case. At this meeting, Amendment 5 progressed to DAM and conformance/reference software for SRD, SAND, and Server Push was also promoted to DAM. In other words, DASH is pretty much in maintenance mode.
  • MPEG-I (systems part) is working on immersive media access and delivery and I guess more updates will come on this after the next meeting. OMAF is working on a 2nd edition for which a working draft exists and phase 2 use cases (public document) and draft requirements are discussed.
  • Versatile Video Coding (VVC): working draft 3 (WD3) and test model 3 (VTM3) have been issued at this meeting, including a large number of new tools. Both documents (and software) will be publicly available after the editing periods (Nov. 23 for WD3 and Dec. 14 for VTM3). JVET documents are publicly available here.

Last but not least, some ads...

Tuesday, October 9, 2018

AAU and Bitmovin presenting at IEEE ICIP 2018

The IEEE International Conference on Image Processing (ICIP) is, with more than 1,000 attendees, one of the biggest conferences of the Signal Processing Society. At ICIP'18, Anatoliy (AAU) and myself (AAU/Bitmovin) attended with the following presentations:

On Monday, October 8, I was on the panel of the Young Professional Networking Event (together with Amy Reibman and Sheila Hemami) sharing my experiences with all attendees. See one picture here.

On Tuesday, October 9, I presented at the Innovation Program talking about "Video Coding for Large-Scale HTTP Adaptive Streaming Deployments: State of the Art and Challenges Ahead".

On Wednesday, October 10, Anatoliy presented our joint AAU/Bitmovin paper about "A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services". Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audio-visual content accounts for the majority of today’s internet traffic. Although the internet bandwidth increases also constantly, audio-visual compression technology is inevitable and we are currently facing the challenge to be confronted with multiple video codecs. This paper provides a practical evaluation of state-of-the-art video codecs (i.e., AV1, AVC/libx264, HEVC/libx265, VP9/libvpx-vp9) for large-scale HTTP adaptive streaming services. In anticipation of the results, AV1 shows promising performance compared to established video codecs. Additionally, AV1 is intended to be royalty free making it worthwhile to be considered for large-scale HTTP adaptive streaming services.

A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services from Christian Timmerer

Acknowledgment: This work was supported in part by the Austrian Research Promotion Agency (FFG) under the Next Generation Video Streaming project “PROMETHEUS”.

Tuesday, October 2, 2018

Almost 58 percent of downstream traffic on the internet is video

The Global Internet Phenomena Report provided by Sandvine - together with Cisco's Visual Networking Index (VNI) - has always been a good source on how internet traffic evolves over time, specifically in the context of streaming audio and video content (note: Nielsen's Law of Internet Bandwidth is also worth noting here, as well as Bitmovin's Video Developer Survey). I used this report in many of my presentations to highlight the 'importance of multimedia delivery'. Thus, I'm happy to see that on October 2, 2018 Sandvine released a new version of its Global Internet Phenomena Report after a rather long break of two years.

The report is available here with some highlights reported below.

Almost 58% of downstream traffic on the internet is video, and Netflix alone accounts for 15% of the total downstream traffic across the entire internet.

The streaming video traffic share shows some regional differences (see figure below). Netflix dominates video streaming in the Americas (30.71%), whereas EMEA is led by YouTube (30.39%) and APAC is not dominated by any single streaming service but by generic "HTTP media stream" traffic (29.24%). Overall, Netflix and YouTube together are still responsible for approx. 50% of the global video streaming traffic share.
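The global figures are easy to cross-check with simple arithmetic: if video is ~58% of downstream traffic and Netflix alone is 15% of the total, Netflix's share of video traffic follows directly.

```python
# Cross-check of the Sandvine figures with simple arithmetic.
video_share_of_total = 0.58    # ~58% of downstream traffic is video
netflix_share_of_total = 0.15  # Netflix is 15% of total downstream traffic

netflix_share_of_video = netflix_share_of_total / video_share_of_total
print(round(netflix_share_of_video * 100, 1))  # → 25.9
```

So Netflix alone carries roughly a quarter of all video traffic globally, consistent with its regional dominance in the Americas and the combined ~50% Netflix+YouTube share of video streaming.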

More to come later...

Monday, September 24, 2018

2018 Video Developer Survey

Bitmovin's 2018 Video Developer Survey reveals interesting details about

  • streaming formats (MPEG-DASH, HLS, CMAF, etc.),
  • video codecs (AVC, HEVC, VP9, AV1),
  • audio codecs (AAC, MP3, Dolby, etc.),
  • encoding preferences (hardware, software on-premise/cloud),
  • players (open source, commercial, in-house solution),
  • DRM,
  • monetization model,
  • ad standard/technology, and
  • the biggest problems experienced today (DRM, ad-blockers, ads in general, server-side ad insertion, CDN issues, broadcast delay/latency, getting playback to run on all devices).
For example, the figure below illustrates the planned video codec usage in the next 12 months compared to the 2017 report.
Planned video codec usage in the next 12 months 2017 vs. 2018.
In total, 456 survey submissions from over 67 countries have been received and included in the report, which can be downloaded here for free.