Friday, March 14, 2025

MPEG news: a report from the 149th meeting

This blog post is based on the MPEG press release and has been modified/updated here to focus on and highlight research aspects. This version of the blog post will also be posted at ACM SIGMM Records.


The 149th MPEG meeting took place in Geneva, Switzerland, from January 20 to 24, 2025. The official press release can be found here. MPEG promoted three standards (among others) to Final Draft International Standard (FDIS), driving innovation in next-generation immersive audio and video coding and in adaptive streaming:

  • MPEG-I Immersive Audio enables realistic 3D audio with six degrees of freedom (6DoF).
  • MPEG Immersive Video (Second Edition) introduces advanced coding tools for volumetric video.
  • MPEG-DASH (Sixth Edition) enhances low-latency streaming, content steering, and interactive media.
This blog post focuses on these new standards and editions, based on the press release and amended with research aspects relevant to the ACM SIGMM community.

MPEG-I Immersive Audio

At the 149th MPEG meeting, MPEG Audio Coding (WG 6) promoted ISO/IEC 23090-4 MPEG-I immersive audio to Final Draft International Standard (FDIS), marking a major milestone in the development of next-generation audio technology.

MPEG-I immersive audio is a groundbreaking standard designed for the compact and highly realistic representation of spatial sound. Tailored for Metaverse applications, including Virtual, Augmented, and Mixed Reality (VR/AR/MR), it enables seamless real-time rendering of interactive 3D audio with six degrees of freedom (6DoF). Users can not only turn their heads in any direction (pitch/yaw/roll) but also move freely through virtual environments (x/y/z), creating an unparalleled sense of immersion.
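
As a rough illustration of what six degrees of freedom means for a renderer, the listener state can be thought of as three rotation angles plus three position coordinates. The sketch below is not taken from the standard: the class name `ListenerPose`, the axis convention (x forward, y left, z up, yaw measured counter-clockwise around z), and the helper methods are assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class ListenerPose:
    """Hypothetical 6DoF listener state: three rotations plus three translations."""
    yaw: float = 0.0    # rotation around the vertical axis, in radians
    pitch: float = 0.0  # looking up/down, in radians
    roll: float = 0.0   # tilting the head sideways, in radians
    x: float = 0.0      # position in metres (assumed: x forward)
    y: float = 0.0      # position in metres (assumed: y left)
    z: float = 0.0      # position in metres (assumed: z up)

    def distance_to(self, sx: float, sy: float, sz: float) -> float:
        """Euclidean distance from the listener to a source at (sx, sy, sz)."""
        return math.sqrt((sx - self.x) ** 2 + (sy - self.y) ** 2 + (sz - self.z) ** 2)

    def azimuth_to(self, sx: float, sy: float) -> float:
        """Horizontal angle of a source relative to the facing direction.

        Only yaw is considered here; pitch and roll are ignored to keep
        the sketch short. The result is wrapped to the range [-pi, pi].
        """
        angle = math.atan2(sy - self.y, sx - self.x) - self.yaw
        return math.atan2(math.sin(angle), math.cos(angle))


# Translation (x/y/z) changes the distance to a source,
# rotation (here: yaw) changes the direction it is heard from.
pose = ListenerPose()
far = pose.distance_to(3.0, 4.0, 0.0)
pose.x, pose.y = 3.0, 3.0
near = pose.distance_to(3.0, 4.0, 0.0)
```

With only three degrees of freedom (head rotation), `distance_to` would never change for a static source; the positional part of the pose is what allows a renderer to update distance-dependent effects as the user walks through the scene.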

True to MPEG’s legacy, this standard is optimized for efficient distribution – even over networks with severe bitrate constraints. Unlike proprietary VR/AR audio solutions, MPEG-I Immersive Audio ensures broad interoperability, long-term stability, and suitability for both streaming and downloadable content. It also natively integrates MPEG-H 3D Audio for high-quality compression.

The standard models a wide range of real-world acoustic effects to enhance realism. It captures detailed sound source properties (e.g., level, point sources, extended sources, directivity characteristics, and Doppler effects) as well as complex environmental interactions (e.g., reflections, reverberation, diffraction, and both total and partial occlusion). Additionally, it supports diverse acoustic environments, including outdoor spaces, multiroom scenes with connecting portals, and areas with dynamic openings such as doors and windows. Its rendering engine balances computational efficiency with high-quality output, making it suitable for a variety of applications.
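
To make one of the listed effects concrete, the Doppler shift for a moving source and listener can be sketched with the textbook formula. This is a generic physics illustration, not the rendering model defined in the standard; the function name `doppler_frequency`, the sign convention, and the fixed speed of sound are assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def doppler_frequency(f_source: float,
                      v_source_toward: float,
                      v_listener_toward: float,
                      c: float = SPEED_OF_SOUND) -> float:
    """Observed frequency for motion along the source-listener line.

    v_source_toward:   source speed toward the listener (negative = moving away)
    v_listener_toward: listener speed toward the source (negative = moving away)
    """
    if abs(v_source_toward) >= c:
        raise ValueError("source speed must stay below the speed of sound")
    return f_source * (c + v_listener_toward) / (c - v_source_toward)


# A 440 Hz source approaching the listener is heard at a higher pitch,
# the same source moving away is heard at a lower pitch.
approaching = doppler_frequency(440.0, 34.3, 0.0)
receding = doppler_frequency(440.0, -34.3, 0.0)
```

In a 6DoF scene both velocities change continuously as the user and the sound sources move, so a renderer would have to re-evaluate this kind of relation per source at its update rate.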

Further reinforcing its impact, the upcoming ISO/IEC 23090-34 Immersive audio reference software will fully implement MPEG-I immersive audio in a real-time framework. This interactive 6DoF experience will facilitate industry adoption and accelerate innovation in immersive audio. The reference software is expected to reach FDIS status by April 2025.

With MPEG-I immersive audio, MPEG continues to set the standard for the future of interactive and spatial audio, paving the way for more immersive digital experiences.

Research aspects: Research can focus on optimizing the streaming and compression of MPEG-I immersive audio for constrained networks, ensuring efficient delivery without compromising spatial accuracy. Another key area is improving real-time 6DoF audio rendering by balancing computational efficiency and perceptual realism, particularly in modeling complex acoustic effects like occlusions, reflections, and Doppler shifts for interactive VR/AR/MR applications.

MPEG Immersive Video (Second Edition)

At the 149th MPEG meeting, MPEG Video Coding (WG 4) advanced the second edition of ISO/IEC 23090-12 MPEG immersive video (MIV) to Final Draft International Standard (FDIS), marking a significant step forward in immersive video technology.

MIV enables the efficient compression, storage, and distribution of immersive video content, where multiple real or virtual cameras capture a 3D scene. Designed for next-generation applications, the standard supports playback with six degrees of freedom (6DoF), allowing users to not only change their viewing orientation (pitch/yaw/roll) but also move freely within the scene (x/y/z). By leveraging strong hardware support for widely used video formats, MPEG immersive video provides a highly flexible framework for multi-view video plus depth (MVD) and multi-plane image (MPI) video coding, making volumetric video more accessible and efficient.

With the second edition, MPEG continues to expand the capabilities of MPEG immersive video, introducing a range of new technologies to enhance coding efficiency and support more advanced immersive experiences. Key additions include:
  • Geometry coding using luma and chroma planes, improving depth representation
  • Capture device information, enabling better reconstruction of the original scene
  • Patch margins and background views, optimizing scene composition
  • Static background atlases, reducing redundant data for stationary elements
  • Support for decoder-side depth estimation, enhancing depth accuracy
  • Chroma dynamic range modification, improving color fidelity
  • Piecewise linear normalized disparity quantization and linear depth quantization, refining depth precision
The second edition also introduces two new profiles: (1) MIV Simple MPI profile, allowing MPI content playback with a single 2D video decoder, and (2) MIV 2 profile, a superset of existing profiles that incorporates all newly added tools.
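
The difference between the two depth quantization approaches named in the list above can be sketched as follows. The code assumes the common definition of normalized disparity as the reciprocal of depth (1/z) and a simple uniform mapping onto integer sample levels; the function names, the 10-bit default, and the direction of the linear depth mapping are assumptions, and neither the piecewise linear variant nor the actual bitstream syntax of the standard is reproduced here.

```python
def quantize_norm_disparity(z: float, z_near: float, z_far: float, bits: int = 10) -> int:
    """Map depth z to an integer level, uniformly in normalized disparity (1/z).

    Assumes 0 < z_near <= z <= z_far with a finite z_far.
    """
    max_level = (1 << bits) - 1
    nd_low, nd_high = 1.0 / z_far, 1.0 / z_near
    level = round((1.0 / z - nd_low) / (nd_high - nd_low) * max_level)
    return min(max(level, 0), max_level)


def dequantize_norm_disparity(level: int, z_near: float, z_far: float, bits: int = 10) -> float:
    """Inverse mapping: integer level back to depth in scene units."""
    max_level = (1 << bits) - 1
    nd_low, nd_high = 1.0 / z_far, 1.0 / z_near
    nd = nd_low + (level / max_level) * (nd_high - nd_low)
    return 1.0 / nd


def quantize_linear_depth(z: float, z_near: float, z_far: float, bits: int = 10) -> int:
    """Map depth z to an integer level, uniformly in depth itself."""
    max_level = (1 << bits) - 1
    level = round((z - z_near) / (z_far - z_near) * max_level)
    return min(max(level, 0), max_level)


# Round trip for a point one scene unit away from the camera.
z_near, z_far = 0.5, 25.0
level = quantize_norm_disparity(1.0, z_near, z_far)
z_reconstructed = dequantize_norm_disparity(level, z_near, z_far)
```

Quantizing uniformly in 1/z spends more levels on geometry close to the camera, where depth errors are most visible after view synthesis, whereas linear depth quantization distributes the levels evenly over the whole depth range.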

With these advancements, MPEG immersive video continues to push the boundaries of immersive media, providing a robust and efficient solution for next-generation video applications.

Research aspects: Research may explore advancements in MPEG immersive video to improve compression efficiency and real-time streaming while preserving depth accuracy and spatial quality. Another key area is enhancing 6DoF video rendering by leveraging new coding tools like decoder-side depth estimation and geometry coding, enabling more precise scene reconstruction and seamless user interaction in volumetric video applications.

MPEG-DASH (Sixth Edition)

At the 149th MPEG meeting, MPEG Systems (WG 3) promoted the sixth edition of MPEG-DASH (ISO/IEC 23009-1 Media presentation description and segment formats) to Final Draft International Standard (FDIS), the final stage of standards development. This milestone underscores MPEG’s ongoing commitment to innovation and responsiveness to evolving market needs.

The sixth edition introduces several key enhancements to improve the flexibility and efficiency of MPEG-DASH:
  • Alternative media presentation support, enabling seamless switching between main and alternative streams
  • Content steering signaling across multiple CDNs, optimizing content delivery
  • Enhanced segment sequence addressing, improving low-latency streaming and enabling faster tune-in
  • Compact duration signaling using patterns, reducing MPD overhead
  • Support for Common Media Client Data (CMCD), enabling better client-side analytics
  • Nonlinear playback for interactive storylines, expanding support for next-generation media experiences
With these advancements, MPEG-DASH continues to evolve as a robust and scalable solution for adaptive streaming, ensuring greater efficiency, flexibility, and enhanced user experiences across a wide range of applications.
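
To give an idea of what the CMCD support listed above refers to: CMCD is specified by CTA (CTA-5004) as a set of key-value pairs that a client attaches to its media requests. The sketch below builds a CMCD query argument from a few keys defined there (`br`, `bl`, `bs`, `ot`, `sid`). It does not show how the sixth edition of MPEG-DASH signals CMCD in the MPD; the function name `build_cmcd_query`, the example values, and the inference of value types from Python types are assumptions for illustration.

```python
from urllib.parse import quote

# Keys whose values are tokens (sent unquoted) rather than quoted strings.
# Only a subset is listed; this set is an assumption for the sketch.
TOKEN_KEYS = {"ot", "sf", "st"}


def build_cmcd_query(data: dict) -> str:
    """Serialize a dict of CMCD keys into a `CMCD=...` query argument."""
    parts = []
    for key in sorted(data):                 # keys in alphabetical order
        value = data[key]
        if isinstance(value, bool):
            if value:                        # true booleans are sent as the bare key
                parts.append(key)            # false booleans are left out
        elif isinstance(value, str) and key not in TOKEN_KEYS:
            # String values are wrapped in double quotes (escaping is assumed).
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'{key}="{escaped}"')
        else:                                # integers and tokens are sent as-is
            parts.append(f"{key}={value}")
    # The comma-separated list is URL-encoded as the value of the CMCD argument.
    return "CMCD=" + quote(",".join(parts), safe="")


# Hypothetical segment request: 3200 kbps video, 4.1 s of buffer, after a stall.
query = build_cmcd_query({
    "br": 3200,       # encoded bitrate in kbps
    "bl": 4100,       # buffer length in ms
    "bs": True,       # buffer starvation occurred
    "ot": "v",        # object type: video
    "sid": "abc",     # session ID
})
segment_url = "https://example.com/video/seg_42.m4s?" + query
```

A CDN or analytics backend can parse this argument from its request logs, which is what makes the client-side state (buffer level, stalls, requested bitrate) observable on the delivery side.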

Research aspects: While making MPEG-DASH more efficient and flexible for adaptive streaming has been a subject of research for some time, optimizing content delivery across multiple CDNs while minimizing latency and maximizing QoE remains an open issue. Another key area is enhancing interactivity and user experiences by leveraging new features like nonlinear playback for interactive storylines and improved client-side analytics through Common Media Client Data (CMCD).

The 150th MPEG meeting will be held online from March 31 to April 04, 2025. Click here for more information about MPEG meetings and their developments.
