Monday, February 9, 2009

MPEG's Vision on 3D Video Coding

The following text is the 'Vision on 3D Video Coding' approved by MPEG's Video and Requirements Sub-groups (full reference: ISO/IEC JTC1/SC29/WG11/N10357, Lausanne, Switzerland, February 2009). It is also publicly available here.

MPEG has developed a suite of international standards to support 3D services and devices, and now initiates a new phase of standardization to be completed within the next two years.
  • One objective is to enable stereo devices to cope with varying display types and sizes, and different viewing preferences. This includes the ability to vary the baseline distance for stereo video to adjust the depth perception, which could help to avoid fatigue and other viewing discomforts.
  • MPEG also envisions that high-quality auto-stereoscopic displays will enter the consumer market in the next few years. Since it is difficult to directly provide all the necessary views due to production and transmission constraints, a new format is needed to enable the generation of many high-quality views from a limited amount of input data, e.g. stereo and depth.
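
To make the first objective concrete, the link between baseline distance and perceived depth can be sketched with the standard relation for a parallel stereo camera pair, disparity = focal length × baseline / depth. This is an illustration only, not part of the MPEG text; the function name and all numeric values (focal length, object distance, baselines) are hypothetical.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Horizontal disparity in pixels for a parallel (non-converged) stereo pair.

    Assumes an idealized pinhole model: disparity = f * B / Z.
    """
    return focal_px * baseline_m / depth_m


# Hypothetical values chosen only for illustration.
focal_px = 1000.0  # focal length expressed in pixels
depth_m = 2.0      # distance from the cameras to the object

for baseline_m in (0.065, 0.0325):
    d = disparity_px(focal_px, baseline_m, depth_m)
    print(f"baseline {baseline_m} m -> disparity {d:.2f} px")
```

Under this model, halving the baseline halves the disparity of every object, which reduces the perceived depth range; this is the kind of adjustment a display could make to suit its size or the viewer's preference.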
Our vision is a new 3D Video (3DV) format that goes beyond the capabilities of existing standards to enable both advanced stereoscopic display processing and improved support for auto-stereoscopic N-view displays, while enabling interoperable 3D services. This is illustrated in Figure 1 and further details are described below.
Figure 1. Target of 3D Video format illustrating limited camera inputs and constrained rate transmission according to a distribution environment. The 3DV data format aims to be capable of rendering a large number of output views for auto-stereoscopic N-view displays and support advanced stereoscopic processing.

Due to limitations in the production environment, the 3DV data format is assumed to be based on limited camera inputs; stereo content is most likely, but more views might also be available. In order to support a wide range of auto-stereoscopic displays, it should be possible for a large number of views to be generated from this data format. Additionally, the rate required for transmitting the 3DV format should be fixed to the distribution constraints, i.e., there should not be an increase in the rate simply because the display requires a higher number of views to cover a larger viewing angle. In this way, the transmission rate and the number of output views are decoupled. Advanced stereoscopic processing that requires view generation at the display would also be supported by this format.
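
The decoupling of transmission rate and output views can be illustrated with a toy view-generation routine. The sketch below forward-warps a single scanline using a per-pixel disparity map; it is not the method defined by any MPEG specification, and the function name, the shift direction, and the sample data are assumptions made for illustration.

```python
from typing import List, Optional


def render_scanline(colors: List[str], disparity: List[int],
                    alpha: float) -> List[Optional[str]]:
    """Forward-warp one scanline to a virtual view.

    alpha = 0 reproduces the input view; alpha = 1 shifts every pixel by its
    full disparity (shifting to the left is an arbitrary convention here).
    When two pixels land on the same target, the one with the larger
    disparity (the nearer one) wins. None marks a hole, i.e. a region that
    was occluded in the input view.
    """
    width = len(colors)
    out: List[Optional[str]] = [None] * width
    best = [-1.0] * width  # largest disparity written to each target so far
    for x, (color, d) in enumerate(zip(colors, disparity)):
        target = x - round(alpha * d)
        if 0 <= target < width and d > best[target]:
            out[target] = color
            best[target] = d
    return out


# Hypothetical input: a foreground object "F" in front of a background "B".
colors = list("BBFFBBBB")
disparity = [0, 0, 4, 4, 0, 0, 0, 0]

# The same transmitted data serves any number of output views.
for n_views in (2, 5):
    views = [render_scanline(colors, disparity, k / (n_views - 1))
             for k in range(n_views)]
    print(n_views, "views generated from one scanline plus disparity")
```

The input (one scanline plus its disparity) is identical whether two or five views are rendered, so the number of views is a display-side choice. The `None` entries that appear behind the shifted foreground are the disocclusion holes that a richer format would need to fill.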
Compared to the existing coding formats, the 3DV format has several advantages in terms of bit rate and 3D rendering capabilities, as illustrated in Figure 2.
  • 2D+Depth, as specified by ISO/IEC 23002-3 (and also referred to as MPEG-C Part 3), supports the inclusion of depth for generation of an increased number of views. While it has the advantage of being backward compatible with legacy devices and is agnostic of coding formats, it is only capable of rendering a limited depth range since it does not directly handle occlusions. The 3DV format expects to enhance the 3D rendering capabilities beyond this format.
  • Multiview Video Coding (MVC), as specified by ISO/IEC 14496-10 | ITU-T Recommendation H.264, supports the direct coding of multiple views and exploits inter-camera redundancy to reduce the bit rate. Although MVC is more efficient than simulcast, the rate of MVC encoded video is proportional to the number of views. The 3DV format expects to significantly reduce the bit rate needed to generate the required views at the receiver.
Figure 2. Illustration of 3D rendering capability versus bit rate for different formats, where 3D Video aims to improve rendering capability of 2D+Depth format while reducing bit rate requirements relative to simulcast and MVC.
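
The bit-rate argument can be made concrete with a toy model. Everything below is an assumption for illustration: the function names, the per-view rate, the relative cost of a dependent MVC view, and the 3DV payload rate are placeholders, not measurements or figures from the MPEG document. The only property taken from the text is the shape of each curve: simulcast and MVC grow with the number of views, while the 3DV rate is set by the distribution constraints.

```python
def simulcast_rate(views: int, per_view_mbps: float) -> float:
    """Each view is coded independently, so the rate scales with the view count."""
    return views * per_view_mbps


def mvc_rate(views: int, per_view_mbps: float,
             dependent_view_cost: float = 0.7) -> float:
    """One base view plus dependent views that cost a fraction of a full view.

    dependent_view_cost is a hypothetical placeholder, not a measured gain.
    """
    return per_view_mbps * (1 + (views - 1) * dependent_view_cost)


def threedv_rate(views: int, payload_mbps: float) -> float:
    """The transmitted payload does not depend on how many views are rendered."""
    return payload_mbps


# Hypothetical rates in Mbit/s.
for views in (2, 9, 28):
    print(views,
          simulcast_rate(views, 8.0),
          round(mvc_rate(views, 8.0), 1),
          threedv_rate(views, 12.0))
```

In this model the MVC curve stays below simulcast but still rises with every added view, whereas the 3DV value is flat, which is the relationship Figure 2 depicts qualitatively.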