Saturday, August 1, 2020

MPEG news: a report from the 131st meeting (virtual)

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will also be posted at ACM SIGMM Records.


The 131st MPEG meeting concluded on July 3, 2020, once again held online, with a press release comprising an impressive list of news items, led by


MPEG Announces VVC – the Versatile Video Coding Standard


In the middle of the restructuring process of SC29 (i.e., MPEG's parent body within ISO), MPEG successfully ratified -- jointly with ITU-T's VCEG within JVET -- its next-generation video codec, among other interesting results from the 131st MPEG meeting:


Standards progressing to final approval ballot (FDIS)

  • MPEG Announces VVC – the Versatile Video Coding Standard
  • Point Cloud Compression – MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage
  • MPEG-H 3D Audio – MPEG promotes Baseline Profile for 3D Audio to the final stage

Call for Proposals

  • Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
  • MPEG issues a Call for Proposals on extension and improvements to ISO/IEC 23092 standard series

Standards progressing to the first milestone of the ISO standard development process

  • Widening support for storage and delivery of MPEG-5 EVC
  • Multi-Image Application Format adds support of HDR
  • Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
  • MPEG Immersive Video (MIV) progresses to Committee Draft
  • Neural Network Compression for Multimedia Applications – MPEG progresses to Committee Draft
  • MPEG issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)

The corresponding press release of the 131st MPEG meeting can be found here: https://mpeg-standards.com/meetings/mpeg-131/. This report focuses on video coding, featuring VVC, as well as PCC and systems aspects (i.e., file format, DASH).


MPEG Announces VVC – the Versatile Video Coding Standard


MPEG is pleased to announce the completion of the new Versatile Video Coding (VVC) standard at its 131st meeting. The document has been progressed to its final approval ballot as ISO/IEC 23090-3 and will also be known as H.266 in the ITU-T.


VVC Architecture (from IEEE ICME 2020 tutorial of Mathias Wien and Benjamin Bross)


VVC is the latest in a series of very successful standards for video coding that have been jointly developed with ITU-T, and it is the direct successor to the well-known and widely used High Efficiency Video Coding (HEVC) and Advanced Video Coding (AVC) standards (see architecture in the figure above). VVC provides a major benefit in compression over HEVC. Plans are underway to conduct a verification test with formal subjective testing to confirm that VVC achieves an estimated 50% bit rate reduction versus HEVC for equal subjective video quality. Test results have already demonstrated that VVC typically provides about a 40% bit rate reduction for 4K/UHD video sequences in tests using objective metrics (i.e., PSNR, VMAF, MS-SSIM). Application areas especially targeted for the use of VVC include

  • ultra-high definition 4K and 8K video,
  • video with a high dynamic range and wide colour gamut, and
  • video for immersive media applications such as 360° omnidirectional video.
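The objective bit rate reductions quoted above are typically computed as a Bjøntegaard delta rate (BD-rate), i.e., the average bit rate difference between two codecs at equal objective quality. Below is a minimal sketch of that calculation; the rate/PSNR operating points are made up for illustration and are not actual JVET test data:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate: average bit rate difference (in %) between
    two codecs at equal quality, from per-codec rate/PSNR points."""
    # Fit cubic polynomials of log-rate as a function of PSNR.
    p_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    p_t = np.polyfit(psnr_test, np.log(rate_test), 3)
    # Integrate both curves over the overlapping PSNR interval.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    # Average log-rate difference, converted back to a percentage.
    return (np.exp((int_t - int_a) / (hi - lo)) - 1) * 100

# Illustrative numbers only: the test codec needs 60% of the anchor
# bit rate for the same PSNR at every operating point.
psnr = [34.0, 37.0, 40.0, 43.0]
anchor_kbps = [2000, 4000, 8000, 16000]
test_kbps = [r * 0.6 for r in anchor_kbps]
savings = bd_rate(anchor_kbps, psnr, test_kbps, psnr)  # -> -40.0
```

With the test codec reaching each PSNR point at 60% of the anchor's bit rate, the BD-rate is -40%, i.e., a 40% bit rate saving at equal quality.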

Furthermore, VVC is designed for a wide variety of types of video such as camera captured, computer-generated, and mixed content for screen sharing, adaptive streaming, game streaming, video with scrolling text, etc. Conventional standard-definition and high-definition video content are also supported with similar gains in compression. In addition to improving coding efficiency, VVC also provides highly flexible syntax supporting such use cases as (i) subpicture bitstream extraction, (ii) bitstream merging, (iii) temporal sublayering, and (iv) layered coding scalability.


The current performance of VVC compared to the HEVC reference software (HM) is shown in the figure below, which confirms the statement above but also highlights the increased complexity. Please note that VTM9 is optimized for functionality (i.e., compression efficiency) rather than for speed.


Performance of VVC, VTM9 vs. HM (taken from https://bit.ly/mpeg131).


MPEG also announces completion of ISO/IEC 23002-7 “Versatile supplemental enhancement information for coded video bitstreams” (VSEI), developed jointly with ITU-T as Rec. ITU-T H.274. The new VSEI standard specifies the syntax and semantics of video usability information (VUI) parameters and supplemental enhancement information (SEI) messages for use with coded video bitstreams. VSEI is especially intended for use with VVC, although it is drafted to be generic and flexible so that it may also be used with other types of coded video bitstreams. Once specified in VSEI, different video coding standards and systems-environment specifications can re-use the same SEI messages without the need for defining special-purpose data customized to the specific usage context.


At the same time, the Media Coding Industry Forum (MC-IF) announced the start of a VVC patent pool fostering process, with an initial meeting on September 1, 2020. The aim of this meeting is to identify tasks and to propose a schedule for VVC pool fostering, with the goal of selecting a pool facilitator/administrator by the end of 2020. Note that MC-IF itself is not facilitating or administering a patent pool.


At the time of writing this blog post, it is probably too early to make an assessment of whether VVC will share the fate of HEVC or AVC (w.r.t. patent pooling). AVC is still the most widely used video codec but with AVC, HEVC, EVC, VVC, LCEVC, AV1, (AV2), and probably also AVS3 -- did I miss anything? -- the competition and pressure are certainly increasing.


Research aspects: from a research perspective, the reduction of time complexity (for a variety of use cases) while maintaining quality and bit rate at acceptable levels is probably the most relevant aspect. Improvements of individual building blocks of VVC by using artificial neural networks (ANNs) are another area of interest, but end-to-end video coding using ANNs will probably pave the road towards the next generation of video codecs. Utilizing VVC and its features for HTTP adaptive streaming (HAS) is probably most interesting for me, but maybe also for others...

MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage

At its 131st meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to the Final Draft International Standard (FDIS) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colors and reflectance. Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass-market applications. However, the relative ease of capturing and rendering spatial information as point clouds, compared to other volumetric video representations, makes point clouds increasingly popular for presenting immersive volumetric data. With the current V-PCC encoder implementation providing compression in the range of 100:1 to 300:1, a dynamic point cloud of one million points could be encoded at 8 Mbit/s with good perceptual quality. Real-time decoding and rendering of V-PCC bitstreams have also been demonstrated on current mobile hardware.
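The quoted 100:1 to 300:1 range can be sanity-checked with a back-of-envelope calculation. The raw-format assumptions below (10-bit geometry coordinates, 8-bit colour channels, 30 frames per second) are mine for illustration, not from the press release:

```python
# Assumed raw representation of a dynamic point cloud:
# 1 million points per frame, 10 bits per geometry coordinate (x, y, z),
# 8 bits per colour channel (R, G, B), at 30 frames per second.
points_per_frame = 1_000_000
bits_per_point = 3 * 10 + 3 * 8          # 54 bits: geometry + colour
fps = 30

raw_bps = points_per_frame * bits_per_point * fps  # uncompressed bit rate
encoded_bps = 8_000_000                  # 8 Mbit/s, as quoted above
ratio = raw_bps / encoded_bps            # compression ratio, roughly 200:1
```

Under these assumptions the uncompressed stream is about 1.6 Gbit/s, and encoding it at 8 Mbit/s corresponds to a compression ratio of roughly 200:1, squarely within the quoted range.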
The V-PCC standard leverages video compression technologies and the video ecosystem in general (hardware acceleration, transmission services, and infrastructure) while enabling new kinds of applications. The V-PCC standard contains several profiles that leverage existing AVC and HEVC implementations, which may make them suitable to run on existing and emerging platforms. The standard is also extensible to upcoming video specifications such as Versatile Video Coding (VVC) and Essential Video Coding (EVC).

The V-PCC standard is based on Visual Volumetric Video-based Coding (V3C), which is expected to be re-used by other MPEG-I volumetric codecs under development. MPEG is also developing a standard for the carriage of V-PCC and V3C data (ISO/IEC 23090-10), which was promoted to DIS status at the 130th MPEG meeting.

By providing high-level immersiveness at currently available bandwidths, the V-PCC standard is expected to enable several types of applications and services such as six Degrees of Freedom (6 DoF) immersive media, virtual reality (VR) / augmented reality (AR), immersive real-time communication and cultural heritage.

Research aspects: as V-PCC is video-based, we can probably state similar research aspects as for video codecs, such as improving efficiency for both encoding and rendering as well as reducing time complexity. During the development of V-PCC, mainly HEVC (and AVC) has been used, but it would definitely be interesting to use VVC for PCC as well. Finally, the dynamic adaptive streaming of V-PCC data is still in its infancy, despite some articles published here and there.

MPEG Systems related News

Finally, I’d like to share news related to MPEG systems and the carriage of video data as depicted in the figure below. In particular, the carriage of VVC (and also EVC) has now been enabled in MPEG-2 Systems (specifically within the transport stream) and in the various file formats (specifically within the NAL file format). The latter is also used in CMAF and DASH, which makes VVC (and also EVC) ready for HTTP adaptive streaming (HAS).

Carriage of Video in MPEG Systems Standards (taken from https://bit.ly/mpeg131).

What about DASH and CMAF?

CMAF maintains a so-called "technologies under consideration" document which contains -- among other things -- a proposed VVC CMAF profile. Additionally, there are two exploration activities related to CMAF, i.e., (i) multi-stream support and (ii) storage, archiving, and content management for CMAF files.

DASH is working on potential improvements for the first amendment to ISO/IEC 23009-1 4th edition, related to CMAF support, the event processing model, and other extensions. Additionally, there’s a working draft for a second amendment to ISO/IEC 23009-1 4th edition, enabling a bandwidth change signaling track and other enhancements. Furthermore, ISO/IEC 23009-8 (Session-based DASH operations) has been advanced to Draft International Standard (see also my last report).

An overview of the current status of MPEG-DASH can be found in the figure below.


The next meeting will be again an online meeting in October 2020.

Finally, MPEG organized a Webinar presenting results from the 131st MPEG meeting. The slides and video recordings are available here: https://bit.ly/mpeg131.

Click here for more information about MPEG meetings and their developments.

Thursday, July 16, 2020

MPEG131 Press Release (Index): WG11 (MPEG) Announces VVC – the Versatile Video Coding Standard



The 131st WG 11 (MPEG) meeting was held online, 29 June – 3 July 2020

Table of Contents

Standards progressing to final approval ballot (FDIS)
Call for Proposals
Standards progressing to the first milestone of the ISO standard development process
Webinar: What’s new in MPEG?

MPEG cordially invites you to its first webinar: What's new in MPEG? A brief update about the results of its 131st MPEG meeting featuring:
  • Welcome and Introduction: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
  • Versatile Video Coding (VVC): Jens-Rainer Ohm and Gary Sullivan, JVET Chairs
  • MPEG 3D Audio: Schuyler Quackenbush, MPEG Audio Chair
  • Video-based Point Cloud Compression (V-PCC): Marius Preda, MPEG 3DG Chair
  • MPEG Immersive Video (MIV): Bart Kroon, MPEG Video BoG Chair
  • Carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC): Young-Kwon Lim, MPEG Systems Chair
  • MPEG Roadmap: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
When: Tuesday, July 21, 2020, 10:00 UTC and 21:00 UTC (to accommodate different time zones)
How: Please register here https://bit.ly/mpeg131. Q&A via sli.do (https://app.sli.do/event/xpzpkhlm; event # 54597) starting from July 21, 2020.

How to contact WG 11 (MPEG) and Further Information

Journalists who wish to receive WG 11 (MPEG) Press Releases by email should contact Dr. Christian Timmerer at christian.timmerer@itec.uni-klu.ac.at or christian.timmerer@bitmovin.com or subscribe via https://lists.aau.at/mailman/listinfo/mpeg-pr. For timely updates, follow us on Twitter (https://twitter.com/mpeggroup).

Future WG 11 (MPEG) meetings are planned as follows: 
  • No. 132, Online, 12 – 16 October 2020
  • No. 133, Cape Town, ZA, 11 – 15 January 2021
  • No. 134, Geneva, CH, 26 – 30 April 2021
  • No. 135, Prague, CZ, 12 – 16 July 2021
For further information about WG 11 (MPEG), please contact:

Prof. Dr.-Ing. Jörn Ostermann (Convenor of WG 11 (MPEG), Germany)
Leibniz Universität Hannover
Appelstr. 9A
30167 Hannover, Germany
Tel: +49 511 762 5316
Fax: +49 511 762 5333

or

Priv.-Doz. Dr. Christian Timmerer
Alpen-Adria-Universität Klagenfurt | Bitmovin Inc.
9020 Klagenfurt am Wörthersee, Austria, Europe
Tel: +43 463 2700 3621

MPEG131 Press Release: WG11 (MPEG) issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)


At its 131st meeting, WG11 (MPEG) promoted the specification of the Conformance and Reference Software for Essential Video Coding (ISO/IEC 23094-4) to Committee Draft (CD) level. The Essential Video Coding (EVC) standard (ISO/IEC 23094-1) provides improved compression capability over existing video coding standards together with the timely publication of licensing terms. The issued specification includes conformance bitstreams as well as reference software for the generation of those conformance bitstreams. This important standard will greatly help the industry achieve effective interoperability between products using EVC and provide valuable information to ease the development of such products. The final specification is expected to be available in early 2021.

MPEG131 Press Release: Neural Network Compression for Multimedia Applications – WG11 (MPEG) progresses to Committee Draft


Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, the extraction of multimedia descriptors, or image and video coding. The trained neural networks for these applications contain a large number of parameters (i.e., weights), resulting in a considerable size. Thus, transferring them to the many clients that use them in applications (e.g., mobile phones, smart cameras) requires a compressed representation of neural networks.

WG11 (MPEG) completed the CD of the specification at its 131st meeting. Considering that the compression of neural networks is likely to have hardware-dependent and hardware-independent components, the standard is designed as a toolbox of compression technologies. The specification contains different parameter sparsification, parameter reduction (e.g., matrix decomposition), parameter quantization, and entropy coding methods that can be assembled into encoding pipelines combining one method from each group (or, in the case of sparsification/reduction, more than one). The results show that trained neural networks for many common multimedia problems, such as image or audio classification or image compression, can be compressed to 10% of their original size with no or very small performance loss, and even significantly further at a small performance loss. The specification is independent of any particular neural network exchange format, and interoperability with common formats is described in the annexes.
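The toolbox idea can be illustrated with a toy sketch: magnitude-based sparsification, followed by uniform scalar quantization, with a Shannon entropy lower bound standing in for a real entropy coder. The weight distribution and thresholds below are made up for illustration and are not part of the standard:

```python
import math
import random

def sparsify(weights, threshold):
    """Magnitude pruning: zero out weights whose magnitude is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize(weights, bits=8):
    """Uniform scalar quantization of the weight range to 2**bits levels."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** bits - 1) or 1.0  # guard against a constant input
    return [round((w - lo) / step) for w in weights]

def entropy_bits(symbols):
    """Shannon lower bound (in bits) for entropy-coding the symbol stream."""
    n = len(symbols)
    counts = {}
    for s in symbols:
        counts[s] = counts.get(s, 0) + 1
    return n * sum(-c / n * math.log2(c / n) for c in counts.values())

# Toy weight set: 10,000 Gaussian "32-bit float" weights.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]
symbols = quantize(sparsify(weights, threshold=0.02), bits=8)
ratio = entropy_bits(symbols) / (32 * len(weights))  # compressed / original size
```

Even this naive pipeline shrinks the toy weight set well below the original 32 bits per weight, since pruning concentrates probability mass on the zero symbol; the standardized tools (e.g., matrix decomposition, trained quantizers, DeepCABAC-style entropy coding) go considerably further.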

MPEG131 Press Release: MPEG Immersive Video (MIV) progresses to Committee Draft


At the 131st MPEG meeting, it was decided to output the committee draft of ISO/IEC 23090-12 MPEG Immersive Video. The name was changed from “Immersive Video” to “MPEG Immersive Video” (MIV), to clearly differentiate from other uses of the term “Immersive Video” in general parlance. MIV supports compression of immersive video content, in which a real or virtual 3D scene is captured by multiple real or virtual cameras. The use of this standard enables the storage and distribution of immersive video content over existing and future networks, for playback with 6 degrees of freedom of view position and orientation.

MPEG131 Press Release: Carriage of Geometry-based Point Cloud Data progresses to Committee Draft


At its 131st meeting, WG11 (MPEG) promoted the carriage of geometry-based point cloud data (ISO/IEC 23090-18) to the Committee Draft stage, the first milestone of the ISO standard development process. This is the second standard introducing support for volumetric media into the well-known ISO base media file format (ISOBMFF) family of standards, after the standard on the carriage of video-based point cloud data (ISO/IEC 23090-10). ISO/IEC 23090-18 supports the carriage of point cloud data within multiple file format tracks in order to support individual access to each of the attributes comprising a single point cloud. Additionally, it allows the carriage of point cloud data in a single file format track for simple applications. Since point cloud data may cover a large geographical area and can be massive in some applications, the standard supports 3D region-based partial access to the data stored in the file, so that applications can efficiently access only the portion of the data they need to process. The standard is currently expected to reach its final milestone by mid-2021.

MPEG131 Press Release: Multi-Image Application Format adds support of HDR


Less than two years after it reached the last milestone of its standard development, the Multi-Image Application Format (MIAF; ISO/IEC 23000-22) has become the default format for storing still pictures on smartphones. However, it lacked support for one of the killer features for image quality enhancement: High Dynamic Range (HDR). To quickly answer such market needs, WG11 (MPEG) has promoted the 2nd Amendment to the Multi-Image Application Format (MIAF HEVC Advanced HDR profile and other clarifications) to the first milestone of the ISO standard development process. This amendment adds support for the use of the PQ (Perceptual Quantizer) and HLG (Hybrid Log-Gamma) transfer characteristics and the P3 mastering display colour volume with D65 white point for HEVC-encoded still pictures, covering widely used HDR technologies. The standard is currently expected to reach its final milestone by mid-2021.
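For reference, the PQ transfer characteristic mentioned above maps absolute luminance (up to 10,000 cd/m²) to a normalized signal value. Below is a sketch of the PQ inverse EOTF; the formula and constants come from SMPTE ST 2084 / ITU-R BT.2100, not from the MIAF amendment itself:

```python
def pq_inverse_eotf(luminance_nits):
    """SMPTE ST 2084 (PQ) inverse EOTF: maps absolute display luminance
    in cd/m^2 (nits) to a normalized signal value in [0, 1]."""
    m1 = 2610 / 16384          # 0.1593017578125
    m2 = 2523 / 4096 * 128     # 78.84375
    c1 = 3424 / 4096           # 0.8359375
    c2 = 2413 / 4096 * 32      # 18.8515625
    c3 = 2392 / 4096 * 32      # 18.6875
    y = (luminance_nits / 10000.0) ** m1
    return ((c1 + c2 * y) / (1 + c3 * y)) ** m2

# The peak of 10,000 cd/m^2 maps to exactly 1.0; an SDR reference white
# of 100 cd/m^2 lands at roughly half of the PQ signal range.
peak = pq_inverse_eotf(10000.0)
sdr_white = pq_inverse_eotf(100.0)
```

This illustrates why PQ is attractive for HDR still pictures: the curve allocates roughly half of the code range to luminances up to 100 cd/m², matching human contrast sensitivity across the full HDR range.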