Wednesday, July 8, 2020

What's new in MPEG?

What: A webinar about what's new in MPEG (after the 131st MPEG meeting)
When: Tuesday, July 21, 2020, 10:00 UTC and 21:00 UTC, to accommodate different time zones (What is UTC?)
How: Please register here for 10:00 UTC and 21:00 UTC. Q&A via sli.do (event # 54597) starting from July 21.

Topics and Presenters:
  • Welcome and Introduction: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
  • Versatile Video Coding (VVC): Jens-Rainer Ohm and Gary Sullivan, JVET Chairs
  • MPEG 3D Audio: Schuyler Quackenbush, MPEG Audio Chair
  • Video-based Point Cloud Compression (V-PCC): Marius Preda, MPEG 3DG Chair
  • MPEG Immersive Video (MIV): Bart Kroon, MPEG Video BoG Chair
  • Carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC): Young-Kwon Lim, MPEG Systems Chair
  • MPEG Roadmap: Jörn Ostermann, Acting Convenor of WG11 (MPEG)


Thursday, June 11, 2020

DASH-IF awarded Excellence in DASH awards at ACM MMSys 2020

The DASH Industry Forum Excellence in DASH Awards at ACM MMSys 2020 acknowledge papers that substantially address MPEG-DASH as the presentation format and are selected for presentation at ACM MMSys 2020. Preference is given to practical enhancements and developments which can sustain future commercial usefulness of DASH. The DASH format used should conform to the DASH-IF Interoperability Points as defined by http://dashif.org/guidelines/. The award is a financial prize as follows: first place – €1000; second place – €500; and third place – €250. The winners are chosen by a DASH Industry Forum appointed committee and results are final.

This year's awards go to the following papers (two first places and one third place):

1. Susanna Schwarzmann, Nick Hainke, Thomas Zinner, Christian Sieber, Werner Robitza, Alexander Raake. Comparing Fixed and Variable Segment Durations for Adaptive Video Streaming – A Holistic Analysis

1. Tomasz Lyko, Matthew Broadbent, Nicholas Race, Mike Nilsson, Paul Farrow, Steve Appleby. Evaluation of CMAF in Live Streaming Scenarios

3. Nan Jiang, Yao Liu, Tian Guo, Wenyao Xu, Viswanathan Swaminathan, Lisong Xu, Sheng Wei. 

The DASH-IF congratulates all winners and hopes to see you next year at ACM MMSys 2021.

Wednesday, May 27, 2020

MPEG news: a report from the 130th meeting, Alpbach, Austria (virtual)

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will be also posted at ACM SIGMM Records.

The 130th MPEG meeting concluded on April 24, 2020, in Alpbach, Austria ... well, not exactly, unfortunately. The meeting did conclude on April 24, 2020, but not in Alpbach, Austria.

I attended the 130th MPEG meeting remotely.
Because of the COVID-19 pandemic, the 130th MPEG meeting was converted from a physical meeting to a fully online meeting, the first in MPEG's 30+ years of history. Approximately 600 experts attending from 19 time zones worked in tens of Zoom meeting sessions supported by an online calendar and by collaborative tools that involved MPEG experts in both online and offline sessions. For example, input contributions had to be registered and uploaded ahead of the meeting to allow for efficient scheduling of two-hour meeting slots, which were distributed from early morning to late night in order to accommodate experts working in different time zones as mentioned earlier. These input contributions were then mapped to GitLab issues for offline discussions, and the actual meeting slots were primarily used for organizing the meeting, resolving conflicts, and making decisions including approving output documents. Although the productivity of the online meeting could not reach the level of regular face-to-face meetings, the results posted in the press release show that MPEG experts managed the challenge quite well, specifically:
  • MPEG ratifies MPEG-5 Essential Video Coding (EVC) standard;
  • MPEG issues the Final Draft International Standards for parts 1, 2, 4, and 5 of MPEG-G 2nd edition;
  • MPEG expands the coverage of ISO Base Media File Format (ISOBMFF) family of standards;
  • A new standard for large scale client-specific streaming with MPEG-DASH;
Other important activities at the 130th MPEG meeting included (i) the carriage of visual volumetric video-based coding data, (ii) Network-Based Media Processing (NBMP) function templates, (iii) the conversion from MPEG-21 contracts to smart contracts, (iv) deep neural network based video coding, (v) Low Complexity Enhancement Video Coding (LCEVC) reaching DIS stage, and (vi) a new level of the MPEG-4 Audio ALS Simple Profile for high-resolution audio, among others.

The corresponding press release of the 130th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/130. This report focused on video coding (EVC) and systems aspects (file format, DASH).

MPEG ratifies MPEG-5 Essential Video Coding Standard

At its 130th meeting, MPEG announced the completion of the new ISO/IEC 23094-1 standard, which is referred to as MPEG-5 Essential Video Coding (EVC) and has been promoted to Final Draft International Standard (FDIS) status. There is a constant demand for more efficient video coding technologies (e.g., due to the increased usage of video on the internet), but coding efficiency is not the only factor determining the industry's choice of video coding technology for products and services. The EVC standard offers improved compression efficiency compared to existing video coding standards and is based on the statements of all contributors to the standard, who have committed to announcing their license terms for the MPEG-5 EVC standard no later than two years after the FDIS publication date.

MPEG-5 EVC defines two profiles: a "Baseline profile" and a "Main profile". The "Baseline profile" contains only technologies that are older than 20 years or otherwise freely available for use in the standard. The "Main profile" adds a small number of additional tools, each of which can be either cleanly disabled or switched to the corresponding baseline tool on an individual basis.
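The per-tool switching idea can be sketched in a few lines. This is a minimal, hypothetical illustration of the concept; the tool names below are invented for illustration and are not taken from the EVC specification:

```python
# Hypothetical sketch of EVC's per-tool switching: every Main-profile tool can
# be individually disabled, falling back to its Baseline-profile counterpart.
# Tool and implementation names are illustrative, not from the EVC spec.

BASELINE_TOOLS = {"intra_prediction": "baseline_intra", "transform": "baseline_dct"}
MAIN_TOOLS = {"intra_prediction": "main_intra", "transform": "main_adaptive_transform"}

def select_toolset(main_tool_flags):
    """Pick, per coding tool, the Main-profile variant if its flag is set,
    otherwise fall back to the Baseline-profile variant."""
    toolset = {}
    for tool, baseline_impl in BASELINE_TOOLS.items():
        if main_tool_flags.get(tool, False):
            toolset[tool] = MAIN_TOOLS[tool]
        else:
            toolset[tool] = baseline_impl
    return toolset

# Disabling all Main-profile tools yields a Baseline-only configuration.
print(select_toolset({}))
print(select_toolset({"transform": True}))
```

The point of this design, as the paragraph above notes, is that any individual Main-profile tool can be cleanly switched off without affecting the others.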

It will be interesting to see how the EVC profiles (baseline and main) will find their path into products and services given the number of codecs already in use (e.g., AVC, HEVC, VP9, AV1) and those still under development but close to ratification (e.g., VVC, LCEVC). That is, in total we may end up with about seven video coding formats that will probably need to be considered for future video products and services. In other words, the multi-codec scenario I envisioned some time ago is becoming reality, raising some interesting challenges to be addressed in the future.

Research aspects: as for all video coding standards, the most important research aspect is certainly coding efficiency. For EVC, it might also be interesting to investigate the usability of the built-in tool-switching mechanism in a practical setup. Furthermore, regarding the multi-codec issue, the ratification of EVC adds another facet to the video coding standards already in use and/or under development.

MPEG expands the Coverage of ISO Base Media File Format (ISOBMFF) Family of Standards

At the 130th WG11 (MPEG) meeting, the ISOBMFF family of standards has been significantly amended with new tools and functionalities. The standards in question are as follows:
  • ISO/IEC 14496-12: ISO Base Media File Format;
  • ISO/IEC 14496-15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format;
  • ISO/IEC 23008-12: Image File Format; and
  • ISO/IEC 23001-16: Derived visual tracks in the ISO base media file format.
In particular, three new amendments to the ISOBMFF family have reached their final milestone, i.e., Final Draft Amendment (FDAM):
  1. Amendment 4 to ISO/IEC 14496-12 (ISO Base Media File Format) allows the use of a more compact version of metadata for movie fragments;
  2. Amendment 1 to ISO/IEC 14496-15 (Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format) adds support for HEVC slice segment data tracks and additional extractor types for HEVC, such as track references and track groups; and
  3. Amendment 2 to ISO/IEC 23008-12 (Image File Format) adds support for more advanced features related to the storage of short image sequences such as burst and bracketing shots.
At the same time, new amendments have reached their first milestone, i.e., Committee Draft Amendment (CDAM):
  1. Amendment 2 to ISO/IEC 14496-15 (Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format) extends its scope to newly developed video coding standards such as Essential Video Coding (EVC) and Versatile Video Coding (VVC); and
  2. the first edition of ISO/IEC 23001-16 (Derived visual tracks in the ISO base media file format) allows a new type of visual track whose content can be dynamically generated at the time of presentation by applying some operations to the content in other tracks, such as crossfading over two tracks.
Both are expected to reach their final milestone in mid-2021.

Finally, the final text of the ISO/IEC 14496-12 6th edition Final Draft International Standard (FDIS) is now ready for ballot after converting MP4RA to a Maintenance Agency. WG11 (MPEG) notes that Apple Inc. has been appointed as the Maintenance Agency, and MPEG appreciates its valuable efforts over the many years it has already acted as the official registration authority for the ISOBMFF family of standards, i.e., MP4RA (https://mp4ra.org/). The 6th edition of ISO/IEC 14496-12 is expected to be published by ISO by the end of this year.

Research aspects: the ISOBMFF family of standards basically offers certain tools and functionalities to satisfy the given use case requirements. The task of the multimedia systems research community could be to scientifically validate these tools and functionalities with respect to the use cases and maybe even beyond, e.g., try to adopt these tools and functionalities for novel applications and services.

A New Standard for Large Scale Client-specific Streaming with DASH

Historically, in ISO/IEC 23009 (Dynamic Adaptive Streaming over HTTP; DASH), every client has used the same Media Presentation Description (MPD), as this best serves the scalability of the service (e.g., efficient caching in content delivery networks). However, there have been increasing requests from the industry to enable customized manifests for more personalized services. Consequently, MPEG has studied a solution to this problem without sacrificing scalability, and it reached the first milestone of its standardization at the 130th MPEG meeting.

ISO/IEC 23009-8 adds a mechanism to the Media Presentation Description (MPD) to refer to another document, called the Session-based Description (SBD), which carries per-session information. The DASH client can use this information (i.e., variables and their values) provided in the SBD to derive the URLs for HTTP GET requests. This standard is expected to reach its final milestone in mid-2021.
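The mechanism described above can be sketched as a simple template substitution. Note that this is an illustrative sketch only: the element and variable names below are invented for illustration, and ISO/IEC 23009-8 defines the actual SBD syntax and URL derivation rules.

```python
import string

# Illustrative sketch of the SBD idea: the MPD carries a URL template and the
# session-based description (SBD) supplies per-session variable values, which
# the client substitutes before issuing HTTP GET requests. Names here are
# hypothetical, not taken from ISO/IEC 23009-8.

def resolve_segment_url(template, sbd_variables, segment_number):
    """Merge SBD-provided variables with the segment number and expand
    the URL template."""
    values = dict(sbd_variables, Number=segment_number)
    return string.Template(template).substitute(values)

# Per-session values delivered via the (hypothetical) SBD document:
sbd = {"sessionId": "abc123", "cdnHost": "edge7.example.com"}
template = "https://${cdnHost}/video/${sessionId}/seg-${Number}.m4s"

print(resolve_segment_url(template, sbd, 42))
# https://edge7.example.com/video/abc123/seg-42.m4s
```

The key property, as described above, is that the MPD (and thus the template) stays identical for all clients and remains cacheable, while only the small SBD differs per session.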

Research aspects: SBD's goal is to enable personalization while maintaining scalability, which calls for a tradeoff, i.e., which kind of information to put into the MPD and what should be conveyed within the SBD. This tradeoff per se could already be considered a research question that will hopefully be addressed in the near future.

An overview of the current status of MPEG-DASH can be found in the figure below.
The next MPEG meeting will take place from June 29th to July 3rd and will again be an online meeting. I am looking forward to a productive AhG period and an online meeting later this year. I am sure that MPEG will further improve its online meeting capabilities and can certainly become a role model for other groups within ISO/IEC and probably also beyond.

Friday, April 10, 2020

Streaming video consumption is increasing dramatically

Streaming video consumption is increasing dramatically which calls for optimization! See the infographic below and PDF here.


Thursday, March 26, 2020

DVB World Online: Joint DVB/DASH-IF Webinar

March 31, 2020

DASH-IF is proud and delighted to be part of a series of public and free webinars organized by the Digital Video Broadcasting (DVB) project. The webinars replace DVB's annual flagship conference, DVB World, which was scheduled for March 2020 but cancelled due to the COVID-19 situation.
On March 31, 2020 at 4 pm CEST, the webinar DASH: from on-demand to large scale live for premium services will take place. DASH-IF has organized a 90-minute session on the latest technology and deployment advances of Dynamic Adaptive Streaming over HTTP (DASH).
Registration for the webinar is public and free of charge. Please register here.
The following program is planned:
  • 16:00-16:10 (10 mins) The DASH Industry Forum, who we are and what we do – Thomas Stockhammer (DASH-IF Interop WG Chair, Qualcomm)
  • 16:10-16:25 (15 mins) Meeting live broadcast requirements – the latest on DASH low latency! – Will Law (DASH-IF Leadership Award Winner, Akamai)
  • 16:25-16:40 (15 mins) Ad insertion in live content – pre, mid and post rolling – Zachary Cava (Hulu)
  • 16:40-16:50 (10 mins) Bandwidth prediction for multi-bitrate streaming at low latency – Ali Begen (Comcast)
  • 16:50-17:05 (15 mins) Implementing DASH low latency in FFmpeg – Jean-Baptiste Kempf (VideoLAN)
  • 17:05-17:15 (10 mins) Managing multi-DRM with DASH – Laurent Piron (DASH-IF Content Protection and Security TF Chair, NAGRA)
  • 17:15-17:30 (15 mins) Q&A panel – Per Fröjdh (DASH-IF Promotion WG Chair, Ericsson) moderator

Saturday, February 22, 2020

MPEG news: a report from the 129th meeting, Brussels, Belgium

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects. Additionally, this version of the blog post will be also posted at ACM SIGMM Records.

The 129th MPEG meeting concluded on January 17, 2020 in Brussels, Belgium with the following topics:
  • Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage
  • Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media
  • Genomic information representation – WG11 receives answers to the joint call for proposals on genomic annotations in conjunction with ISO TC 276/WG 5
  • Open font format – WG11 promotes Amendment of Open Font Format to the final stage
  • High efficiency coding and media delivery in heterogeneous environments – WG11 progresses Baseline Profile for MPEG-H 3D Audio
  • Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage
Additional Important Activities at the 129th WG 11 (MPEG) meeting
The 129th WG 11 (MPEG) meeting was attended by more than 500 experts from 25 countries working on important activities including (i) a scene description for MPEG media, (ii) the integration of Video-based Point Cloud Compression (V-PCC) and Immersive Video (MIV), (iii) Video Coding for Machines (VCM), and (iv) a draft call for proposals for MPEG-I Audio among others.

The corresponding press release of the 129th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/129. This report focused on network-based media processing (NBMP), architectures of immersive media, compact descriptors for video analysis (CDVA), and an update about adaptive streaming formats (i.e., DASH and CMAF).

MPEG picture at Friday plenary; © Rob Koenen (Tiledmedia).


Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage

At its 129th meeting, MPEG promoted ISO/IEC 23090-8, Network-Based Media Processing (NBMP), to Final Draft International Standard (FDIS). The FDIS stage is the final vote before a document is officially adopted as an International Standard (IS). During the FDIS vote, national bodies are only allowed to cast a Yes/No vote and are no longer able to make any technical changes. However, project editors are able to fix typos and make other necessary editorial improvements.

What is NBMP? The NBMP standard defines a framework that allows content and service providers to describe, deploy, and control media processing for their content in the cloud by using libraries of pre-built third-party functions. The framework includes an abstraction layer to be deployed on top of existing commercial cloud platforms and is designed to be integrated with 5G core and edge computing. The NBMP workflow manager is another essential part of the framework, enabling the composition of multiple media processing tasks to process incoming media and metadata from a media source and to produce processed media streams and metadata that are ready for distribution to media sinks.
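The workflow-manager idea of composing media processing tasks can be sketched as follows. This is a toy illustration of the concept only: the task names and the `WorkflowManager` API below are hypothetical and do not correspond to the actual NBMP function templates or APIs defined in ISO/IEC 23090-8.

```python
# Minimal sketch of the NBMP concept: a workflow manager composes pre-built
# processing functions into a pipeline; media flows from a source through the
# tasks to a sink. All names here are hypothetical, not from ISO/IEC 23090-8.

def transcode_720p(media):
    return media + ["transcoded@720p"]

def overlay_logo(media):
    return media + ["logo-overlaid"]

def package_cmaf(media):
    return media + ["cmaf-packaged"]

class WorkflowManager:
    """Composes registered tasks and runs them in order on incoming media."""
    def __init__(self):
        self.tasks = []

    def add_task(self, task):
        self.tasks.append(task)
        return self  # allow chaining

    def run(self, source_media):
        media = source_media
        for task in self.tasks:
            media = task(media)
        return media  # processed media, ready for distribution to media sinks

wf = (WorkflowManager()
      .add_task(transcode_720p)
      .add_task(overlay_logo)
      .add_task(package_cmaf))
print(wf.run(["camera-feed"]))
# ['camera-feed', 'transcoded@720p', 'logo-overlaid', 'cmaf-packaged']
```

In the actual standard, such tasks are described via function templates and deployed on cloud infrastructure rather than composed in-process as above.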

Why NBMP? With the increasing complexity and sophistication of media services and the media processing they incur, offloading complex media processing operations to the cloud/network is becoming critically important in order to keep receiver hardware simple and power consumption low.

Research aspects: NBMP reminds me a bit about what has been done in the past in MPEG-21, specifically Digital Item Adaptation (DIA) and Digital Item Processing (DIP). The main difference is that MPEG now targets APIs rather than pure metadata formats, which is a step forward in the right direction as APIs can be implemented and used right away. NBMP will be particularly interesting in the context of new networking approaches including, but not limited to, software-defined networking (SDN), information-centric networking (ICN), mobile edge computing (MEC), fog computing, and related aspects in the context of 5G.

Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media

At its 129th meeting, WG11 (MPEG) published an updated version of its technical report on architectures for immersive media. This technical report, which is the first part of the ISO/IEC 23090 (MPEG-I) suite of standards, introduces the different phases of MPEG-I standardization and gives an overview of the parts of the MPEG-I suite. It also documents use cases and defines architectural views on the compression and coded representation of elements of immersive experiences. Furthermore, it describes the coded representation of immersive media and the delivery of a full, individualized immersive media experience. MPEG-I enables scalable and efficient individual delivery as well as mass distribution while adjusting to the rendering capabilities of consumption devices. Finally, this technical report breaks down the elements that contribute to a fully immersive media experience and assigns quality requirements as well as quality and design objectives for those elements.

Research aspects: This technical report provides a kind of reference architecture for immersive media, which may help identify research areas and research questions to be addressed in this context.

Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage

Managing and organizing the quickly increasing volume of video content is a challenge for many industry sectors, such as media and entertainment or surveillance. One example task is scalable instance search, i.e., finding content containing a specific object instance or location in a very large video database. This requires video descriptors that can be efficiently extracted, stored, and matched. Standardization enables extracting interoperable descriptors on different devices and using software from different providers so that only the compact descriptors instead of the much larger source videos can be exchanged for matching or querying. ISO/IEC 15938-15:2019 – the MPEG Compact Descriptors for Video Analysis (CDVA) standard – defines such descriptors. CDVA includes highly efficient descriptor components using features resulting from a Deep Neural Network (DNN) and uses predictive coding over video segments. The standard is being adopted by the industry. At its 129th meeting, WG11 (MPEG) has finalized the conformance guidelines and reference software. The software provides the functionality to extract, match, and index CDVA descriptors. For easy deployment, the reference software is also provided as Docker containers.
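The instance-search use case above boils down to comparing compact descriptors instead of source videos. The sketch below illustrates that idea with a toy similarity search; the descriptor vectors are stand-ins, not CDVA's actual DNN-based features, and the matching function is a generic cosine similarity rather than the matching procedure defined in ISO/IEC 15938-15.

```python
import math

# Toy illustration of compact-descriptor matching for instance search: each
# video is represented by a small fixed-length descriptor, and only these
# descriptors (not the much larger source videos) are exchanged and compared.
# Descriptors and similarity measure are illustrative, not CDVA-specific.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_match(query_descriptor, database):
    """Return the (name, descriptor) entry most similar to the query."""
    return max(database, key=lambda item: cosine_similarity(query_descriptor, item[1]))

db = [("clip_a", [0.9, 0.1, 0.0]),
      ("clip_b", [0.1, 0.8, 0.2]),
      ("clip_c", [0.0, 0.2, 0.9])]

name, _ = best_match([0.85, 0.15, 0.05], db)
print(name)  # clip_a
```

Interoperable extraction, as standardized by CDVA, is what makes such cross-device, cross-vendor matching possible in practice.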

Research aspects: The availability of reference software helps to conduct reproducible research (i.e., reference software is typically publicly available for free) and the Docker container even further contributes to this aspect.

DASH and CMAF

The 4th edition of DASH has already been published and is available as ISO/IEC 23009-1:2019. Similar to previous iterations, MPEG's goal was to make the newest edition of DASH publicly available for free, with the goal of industry-wide adoption and adaptation. During the most recent MPEG meeting, we worked towards the first amendment, which will include (i) additional CMAF support and (ii) event processing models, with minor updates; this amendment is currently in draft and will be finalized at the 130th MPEG meeting in Alpbach, Austria. An overview of all DASH standards and updates is depicted in the figure below:

ISO/IEC 23009-8 or “session-based DASH operations” is the newest variation of MPEG-DASH. The goal of this part of DASH is to allow customization during certain times of a DASH session while maintaining the underlying media presentation description (MPD) for all other sessions. Thus, MPDs should be cacheable within content distribution networks (CDNs) while additional information should be customizable on a per session basis within a newly added session-based description (SBD). It is understood that the SBD should have an efficient representation to avoid file size issues and it should not duplicate information typically found in the MPD.

The 2nd edition of the CMAF standard (ISO/IEC 23000-19) will be available soon (currently under FDIS ballot), and MPEG is currently reviewing additional tools in the so-called 'technologies under consideration' document. Accordingly, amendments were drafted for additional HEVC media profiles, and exploration activities continue on the storage and archiving of CMAF content.

The next meeting will bring MPEG back to Austria (for the 4th time) and will be hosted in Alpbach, Tyrol. For more information about the upcoming 130th MPEG meeting click here.

Click here for more information about MPEG meetings and their developments

Friday, February 21, 2020

2020 Mobile Internet Phenomena Report: more than 65 percent is Video

Source: Sandvine
In September, I blogged about the 2019 Global Internet Phenomena Report, which revealed that video was 60% of the overall downstream traffic on the internet. The 2020 Mobile Internet Phenomena Report [Sandvine] shows that mobile video downstream traffic accounts for more than 65%, much more than any other application category as shown in the figure on the right, followed by social networking traffic and messaging. One year ago (see my blog post from Feb 2019), this was around 42%, which means an increase of 23 percentage points. Thus, video growth is much higher for mobile networks than in general.

From the global mobile application traffic share, YouTube still maintains its number one position with more than 27% downstream traffic (vs. 37% in 2019) followed by Facebook Video, Instagram, Facebook, and Netflix "only" in 5th position (but numbers almost doubled compared to 2019).

The report also includes a spotlight regarding 5G and fixed mobile replacement (i.e., home boxes with a connection to the mobile internet). Fixed-mobile network traffic share is led by Netflix with more than 25%, followed by YouTube, Hulu, Disney+, and Amazon Prime -- all HTTP adaptive streaming services, which together account for more than 58% of the traffic share. However, it seems that early 5G networks will mainly act as fixed-line replacements, similar to 4G networks in Europe due to unlimited data plans.

Another spotlight is related to Quality of Experience (QoE) and here I found something interesting, namely: "video is actually the simplest application to deliver good QoE to – give it a good downstream throughput, and it generally works fine. Keep the buffers on the device full, and the user won’t see any momentary network congestion or delay". I'd like to read your comments on this... 😎

Overall, video growth for mobile is much higher than for fixed networks, and mobile networks have now surpassed fixed networks; I wonder whether this growth will continue in the years to come. Remember, forecasts predict video will account for around 80% of traffic within the next few years...