Tuesday, October 30, 2018

MPEG news: a report from the 124th meeting, Macau, China

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will also be posted at ACM SIGMM Records.

The MPEG press release comprises the following aspects:
  • Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage
  • Compressed Representation of Neural Networks - MPEG issues Call for Proposals
  • Low Complexity Video Coding Enhancements - MPEG issues Call for Proposals
  • New Video Coding Standard expected to have licensing terms timely available - MPEG issues Call for Proposals
  • Multi-Image Application Format (MIAF) promoted to Final Draft International Standard
  • 3DoF+ Draft Call for Proposal goes Public

Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage

At its 124th meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to Committee Draft (CD) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colour. By leveraging existing video ecosystems in general (hardware acceleration, transmission services and infrastructure) as well as future video codecs, the V-PCC technology enables new applications. The current V-PCC encoder implementation provides a compression ratio of 125:1, which means that a dynamic point cloud of 1 million points could be encoded at 8 Mbit/s with good perceptual quality.
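The 125:1 figure can be sanity-checked with a back-of-envelope calculation; the per-point bit budget and frame rate below are illustrative assumptions of mine, not values taken from the MPEG documents:

```python
# Back-of-envelope check of the 125:1 compression claim.
# Assumptions (illustrative): 1 million points per frame, 30 frames/s,
# 30 bits of geometry per point (3 x 10-bit coordinates) and 24 bits of
# colour (3 x 8-bit RGB).
points_per_frame = 1_000_000
frames_per_second = 30
bits_per_point = 30 + 24  # geometry + colour

raw_bits_per_second = points_per_frame * frames_per_second * bits_per_point
compression_ratio = 125
compressed_mbps = raw_bits_per_second / compression_ratio / 1_000_000

print(f"raw: {raw_bits_per_second / 1e9:.2f} Gbit/s")          # 1.62 Gbit/s
print(f"compressed at 125:1: {compressed_mbps:.1f} Mbit/s")    # 13.0 Mbit/s
```

Under these assumptions the compressed rate lands in the same order of magnitude as the quoted 8 Mbit/s; the exact figure depends on frame rate, geometry precision, and which attributes are carried.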

A next step is the storage of V-PCC in ISOBMFF for which a working draft has been produced. It is expected that further details will be discussed in upcoming reports.
Research aspects: Video-based Point Cloud Compression (V-PCC) is at CD stage and a first working draft for the storage of V-PCC in ISOBMFF has been provided. Thus, a natural next step is the delivery of V-PCC encapsulated in ISOBMFF over networks utilizing various approaches, protocols, and tools. Additionally, different encapsulation formats could be considered if needed. I hope to see some of these aspects covered in future conferences including those -- but not limited to -- listed at the very end of this blog post.
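As a concrete reference point for such encapsulation work, the sketch below walks the top-level box structure that any ISOBMFF-based design (V-PCC storage included) builds on. It parses only the generic size/type box framing; it is not an implementation of the V-PCC working draft, and the hand-built boxes are purely illustrative:

```python
import io
import struct

def iter_boxes(stream):
    """Yield (type, size, payload) for top-level ISOBMFF boxes.

    Handles the standard 32-bit size field and the 64-bit 'largesize'
    escape (size == 1); size == 0 (box extends to end of file) is left
    out for brevity.
    """
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:  # 64-bit largesize follows the type field
            size = struct.unpack(">Q", stream.read(8))[0]
            payload = stream.read(size - 16)
        else:
            payload = stream.read(size - 8)
        yield box_type.decode("ascii"), size, payload

# Example: a minimal 'ftyp' box followed by an empty 'mdat'.
ftyp = struct.pack(">I4s4sI4s", 20, b"ftyp", b"isom", 0, b"isom")
mdat = struct.pack(">I4s", 8, b"mdat")
boxes = list(iter_boxes(io.BytesIO(ftyp + mdat)))
print([(t, s) for t, s, _ in boxes])  # [('ftyp', 20), ('mdat', 8)]
```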

MPEG issues Call for Proposals on Compressed Representation of Neural Networks

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, media coding, data analytics, and many other fields. Their recent success is based on the feasibility of processing much larger and more complex neural networks (deep neural networks, DNNs) than in the past, and on the availability of large-scale training data sets. Some applications require the deployment of a particular trained network instance to a potentially large number of devices and, thus, could benefit from a standard for the compressed representation of neural networks. Therefore, MPEG has issued a Call for Proposals (CfP) for compression technology for neural networks, targeting the compression of parameters and weights and focusing on four use cases: (i) visual object classification, (ii) audio classification, (iii) visual feature extraction (as used in MPEG CDVA), and (iv) video coding.
Research aspects: As pointed out last time, research here will mainly focus on compression efficiency for both lossy and lossless scenarios. Additionally, communication aspects such as the transmission of compressed artificial neural networks within lossy, large-scale environments (including update mechanisms) may become relevant in the (near) future.
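As a minimal illustration of the lossy scenario, the sketch below applies uniform scalar quantization to a weight vector, one of the simplest building blocks a CfP response might start from. The 8-bit width and the example weights are illustrative choices, not part of the call:

```python
# Lossy weight compression via uniform scalar quantization: map each
# float weight to an integer code, keeping only the range (lo) and
# step size (scale) as side information for reconstruction.

def quantize(weights, bits=8):
    """Map float weights to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (2**bits - 1) or 1.0  # avoid 0 for constant weights
    return [round((w - lo) / scale) for w in weights], lo, scale

def dequantize(codes, lo, scale):
    return [lo + c * scale for c in codes]

weights = [-0.52, -0.13, 0.0, 0.07, 0.48, 1.31]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"max reconstruction error: {max_err:.5f}")  # bounded by scale / 2
```

Real proposals would of course combine this with smarter techniques (pruning, entropy coding, retraining-aware quantization), but the error-vs-rate trade-off shown here is the core of the lossy case.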

MPEG issues Call for Proposals on Low Complexity Video Coding Enhancements

Upon request from the industry, MPEG has identified an area of interest in which video technology deployed in the market (e.g., AVC, HEVC) can be enhanced in terms of video quality without the need to necessarily replace existing hardware. Therefore, MPEG has issued a Call for Proposals (CfP) on Low Complexity Video Coding Enhancements.

The objective is to develop video coding technology with a data stream structure defined by two component streams: a base stream decodable by a hardware decoder and an enhancement stream suitable for software processing implementation. The project is meant to be codec agnostic; in other words, the base encoder and base decoder can be AVC, HEVC, or any other codec in the market.
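The two-stream idea above can be sketched in a few lines: whatever the legacy hardware decoder outputs is treated as the base reconstruction, and a software-decoded enhancement stream carries residuals added on top. The sample values and residual range below are illustrative; real proposals define their own coding of the enhancement stream:

```python
# Sketch of the base + enhancement structure: add enhancement-layer
# residuals to the base-layer reconstruction, clipping to the valid
# 8-bit sample range.

def apply_enhancement(base_frame, residuals):
    """Combine base reconstruction and enhancement residuals per sample."""
    return [max(0, min(255, b + r)) for b, r in zip(base_frame, residuals)]

base = [100, 128, 250, 3]     # samples from the hardware base decoder
residuals = [2, -1, 10, -5]   # decoded from the enhancement stream
enhanced = apply_enhancement(base, residuals)
print(enhanced)  # [102, 127, 255, 0]
```

Note how the base decoder never changes: the codec-agnostic property follows from the enhancement layer operating purely on decoded samples.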
Research aspects: The interesting aspect here is that this use case assumes a legacy base decoder, most likely realized in hardware, which is enhanced with a software-based implementation to improve coding efficiency and/or quality without sacrificing the capabilities of the end-user device in terms of complexity and, thus, energy efficiency.

MPEG issues Call for Proposals for a New Video Coding Standard expected to have licensing terms timely available

At its 124th meeting, MPEG issued a Call for Proposals (CfP) for a new video coding standard to address combinations of both technical and application (i.e., business) requirements that may not be adequately met by existing standards. The aim is to provide a standardized video compression solution which combines coding efficiency similar to that of HEVC with a level of complexity suitable for real-time encoding/decoding and the timely availability of licensing terms.
Research aspects: This new work item is more related to business aspects (i.e., licensing terms) than technical aspects of video coding. As this blog is about technical aspects and I'm also not an expert in licensing terms, I do not comment on this any further.

Multi-Image Application Format (MIAF) promoted to Final Draft International Standard

The Multi-Image Application Format (MIAF) defines interoperability points for creation, reading, parsing, and decoding of images embedded in the High Efficiency Image File (HEIF) format by (i) only defining additional constraints on the HEIF format, (ii) limiting the supported encoding types to a set of specific profiles and levels, (iii) requiring specific metadata formats, and (iv) defining a set of brands for signaling such constraints, including specific depth map and alpha plane formats. For instance, it addresses the use case where a capturing device uses one of the HEIF codecs with a specific HEVC profile and level in its created HEIF files, while a playback device is only capable of decoding AVC bitstreams.
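To make the brand-signaling idea concrete, the sketch below reads the brands from a file's leading 'ftyp' box, which is where such interoperability points are announced. The 'miaf' brand string is my assumption of how a MIAF constraint set would be signaled, and the hand-built box is purely illustrative:

```python
import struct

def ftyp_brands(data):
    """Return (major_brand, compatible_brands) from a leading 'ftyp' box."""
    size, box_type = struct.unpack(">I4s", data[:8])
    assert box_type == b"ftyp"
    major = data[8:12].decode("ascii")
    # bytes 12..16 are minor_version; the rest are 4-byte compatible brands
    compat = [data[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major, compat

# Hand-built ftyp: major brand 'mif1' (HEIF image), minor_version 0,
# compatible brands 'mif1' and (assumed) 'miaf'.
ftyp = struct.pack(">I4s4sI4s4s", 24, b"ftyp", b"mif1", 0, b"mif1", b"miaf")
major, compat = ftyp_brands(ftyp)
print(major, compat)  # mif1 ['mif1', 'miaf']
```

A reader (or conformance checker) only needs this brand list to decide whether it can process the file, which is exactly the interoperability point MIAF standardizes.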
Research aspects: MIAF is an application format which is defined as a combination of tools (incl. profiles and levels) of other standards (e.g., audio codecs, video codecs, systems) to address the needs of a specific application. Thus, the research is related to use cases enabled by this application format.

3DoF+ Draft Call for Proposal goes Public

Following investigations on the coding of “three Degrees of Freedom plus” (3DoF+) content in the context of MPEG-I, the MPEG video subgroup has provided evidence demonstrating the capability to encode 3DoF+ content efficiently while maintaining compatibility with legacy HEVC hardware. As a result, MPEG decided to issue a draft Call for Proposal (CfP) to the public containing the information necessary to prepare for the final Call for Proposal expected to occur at the 125th MPEG meeting (January 2019) with responses due at the 126th MPEG meeting (March 2019).
Research aspects: This work item is about video (coding) and, thus, research is about compression efficiency.

What else happened at #MPEG124?

  • MPEG-DASH 3rd edition is still in the final editing phase and not yet available. Last time, I wrote that we expect final publication later this year or early next year and we hope this is still the case. At this meeting, Amendment 5 progressed to DAM and the conformance/reference software for SRD, SAND, and Server Push was also promoted to DAM. In other words, DASH is pretty much in maintenance mode.
  • MPEG-I (systems part) is working on immersive media access and delivery and I guess more updates will come on this after the next meeting. OMAF is working on a 2nd edition for which a working draft exists and phase 2 use cases (public document) and draft requirements are discussed.
  • Versatile Video Coding (VVC): working draft 3 (WD3) and test model 3 (VTM3) have been issued at this meeting, including a large number of new tools. Both documents (and software) will be publicly available after their editing periods (Nov. 23 for WD3 and Dec. 14 for VTM3). JVET documents are publicly available at http://phenix.it-sudparis.eu/jvet/.

Last but not least, some ads...

Tuesday, October 9, 2018

AAU and Bitmovin presenting at IEEE ICIP 2018

The IEEE International Conference on Image Processing (ICIP), with more than 1,000 attendees, is one of the biggest conferences of the Signal Processing Society. At ICIP'18, Anatoliy (AAU) and myself (AAU/Bitmovin) attended with the following presentations:

On Monday, October 8, I was on the panel of the Young Professional Networking Event (together with Amy Reibman and Sheila Hemami) sharing my experiences with all attendees. See one picture here.

On Tuesday, October 9, I presented at the Innovation Program talking about "Video Coding for Large-Scale HTTP Adaptive Streaming Deployments: State of the Art and Challenges Ahead".



On Wednesday, October 10, Anatoliy presented our joint AAU/Bitmovin paper about "A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services". Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audio-visual content accounts for the majority of today’s internet traffic. Although the internet bandwidth increases also constantly, audio-visual compression technology is inevitable and we are currently facing the challenge to be confronted with multiple video codecs. This paper provides a practical evaluation of state of the art video codecs (i.e., AV1, AVC/libx264, HEVC/libx265, VP9/libvpx-vp9) for large-scale HTTP adaptive streaming services. In anticipation of the results, AV1 shows promising performance compared to established video codecs. Additionally, AV1 is intended to be royalty free making it worthwhile to be considered for large scale HTTP adaptive streaming services.
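A simplified version of the kind of comparison such an evaluation performs is estimating the bitrate savings of one codec over another at a matched quality level. The (kbps, quality) pairs below are made-up illustrative numbers, not results from the paper; a full evaluation would use Bjontegaard-delta metrics over several quality points:

```python
# Estimate bitrate savings at matched quality by linear interpolation
# between rate-quality measurement points.

def bitrate_at_quality(points, target):
    """Interpolate the bitrate needed to reach a target quality.
    `points` is a list of (bitrate_kbps, quality) sorted by quality."""
    for (b0, q0), (b1, q1) in zip(points, points[1:]):
        if q0 <= target <= q1:
            t = (target - q0) / (q1 - q0)
            return b0 + t * (b1 - b0)
    raise ValueError("target quality outside measured range")

codec_a = [(1000, 80.0), (2000, 90.0), (4000, 95.0)]  # e.g. an AVC encoder
codec_b = [(600, 80.0), (1200, 90.0), (2400, 95.0)]   # e.g. a newer codec

target = 90.0
ba = bitrate_at_quality(codec_a, target)
bb = bitrate_at_quality(codec_b, target)
print(f"savings at quality {target}: {100 * (1 - bb / ba):.1f}%")  # 40.0%
```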


A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services from Christian Timmerer

Acknowledgment: This work was supported in part by the Austrian Research Promotion Agency (FFG) under the Next Generation Video Streaming project “PROMETHEUS”.

Tuesday, October 2, 2018

Almost 58 percent of downstream traffic on the internet is video

The Global Internet Phenomena Report provided by Sandvine - together with Cisco's Visual Networking Index (VNI) - was always a good source on how internet traffic evolves over time, specifically in the context of streaming audio and video content (note: Nielsen's Law of Internet Bandwidth is also worth noting here, as well as Bitmovin's Video Developer Survey). I used this report in many of my presentations to highlight the 'importance of multimedia delivery'. Thus, I'm happy to see that on October 2, 2018, Sandvine released a new version of its Global Internet Phenomena Report after a rather long break of two years.

The report is available here with some highlights reported below.

Almost 58% of downstream traffic on the internet is video, and Netflix alone accounts for 15% of the total downstream traffic across the entire internet.

The streaming video traffic share shows some regional differences (see figure below). Netflix dominates video streaming in the Americas (30.71%), whereas EMEA is led by YouTube (30.39%) and APAC is not dominated by any single streaming service but by "HTTP media stream" (29.24%) in general. Overall, Netflix and YouTube together are still responsible for approx. 50% of the global video streaming traffic share.


More to come later...