Monday, February 9, 2009

Vision and Requirements for High-Performance Video Coding (HVC)

The following text is the 'Vision and Requirements for High-Performance Video Coding (HVC)' document approved by MPEG's Video and Requirements sub-groups. Full reference: ISO/IEC JTC1/SC29/WG11/N10361, Lausanne, Switzerland, February 2009. The document is also publicly available.

Introduction - Vision

This document presents the vision, goals and requirements for the High-Performance Video Coding (HVC) standard (note: this name is tentative). This document will evolve over the next few months as the project takes shape and as a result of input from potential users.

A large quantity of video material is already distributed in digital form over broadcast channels, digital networks and packaged media. More and more of this material will be distributed at higher resolution and with higher quality demands.

Technology evolution will soon make possible the capture and display of video material with a quantum leap in quality (temporal and spatial resolution, color fidelity, amplitude resolution). Networks are already finding it difficult to carry HDTV resolution and data rates economically to the end user. Further data rate increases will therefore put additional pressure on the networks. For example:
  • High-definition (HD) devices (displays and cameras) are affordable for consumer usage today, while the currently available internet and broadcast network capacity is not sufficient to transfer large amounts of HD content economically. While this situation may change slowly over time, the next generation of ultra-HD (UHD) content and devices, such as 4Kx2K displays for home cinema applications and digital cameras, is already appearing on the horizon.
  • For mobile terminals, video quality at resolutions such as QCIF, at low frame rates and low bit rates, is largely unacceptable today. While the overall data rate will increase with the evolution of 3G/LTE and 4G networks, the number of users and their quality demands will increase at the same time. Anticipating that lightweight HD resolutions such as 720p or even beyond will be introduced in the mobile sector to provide perceptual quality similar to that of home applications, the lack of sufficient data rates, as well as the price to be paid for transmission, will remain a problem in the long term.
MPEG has concluded that video bitrate (when current compression technology is used) will go up faster than the network infrastructure will be able to carry economically, both for wireless and wired networks. Therefore, a new generation of video compression technology is needed that has sufficiently higher compression capability than the existing AVC standard in its best configuration (the High Profile). A study has been started on the feasibility of HVC, which is mainly intended for high-quality applications. In particular, the following are expected:
  • Performance improvements in terms of coding efficiency at higher resolution,
  • Applicability to entertainment-quality services such as HD mobile, home cinema and Ultra High Definition (UHD) TV.
MPEG plans to develop HVC following its general philosophy and goals [1]:
  • A requirement to be supported by HVC should refer to at least one specific application
  • Minimum number of conformance points defined in terms of profiles and levels
  • Definition of a new HVC standard that is not backwards compatible with an existing MPEG standard should be justified by sufficient performance gain
  • Complexity scalability in encoder and decoder: (1) Possible asymmetry of encoder and decoder processing complexity; (2) Full specification of decoding, preferably with no mismatch; (3) Scalability between amount of encoder processing and achievable quality
Requirements

Compression Performance

A substantially greater bitrate reduction over MPEG-4 AVC High Profile is required for the target application(s); at no point in the entire bitrate range shall HVC be worse than existing standard(s).

Subjective visually lossless compression shall be supported.

Picture Formats

The HVC development effort will focus on a set of rectangular picture formats that will include all commonly used picture formats, ranging at least from VGA to 4Kx2K, and potentially extending to QVGA and 8Kx4K.

Picture formats of arbitrary size shall also be supported, within limits specific to each Level. The HVC codec shall support at least the same range of picture formats supported by the MPEG-4 AVC syntax.

Color Spaces and Color Sampling

a) The YCbCr color space, 4:2:0, 8 bits per component shall be supported.
b) YCbCr/RGB 4:4:4 should be supported.
c) Higher bit depth, up to 14 bits per component, should be supported.

Note: Further information is needed on whether support for 4:2:2 color sampling is needed for progressively scanned video, given the digital video standards used in studios. The same applies to wide-gamut color.

Frame Rates

In general, typical frame rates of 24 to 60 fps shall be supported.
The HVC codec shall be at least as flexible in terms of frame rate as the AVC syntax, which supports up to 172 frames per second (fps).
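
To give a sense of scale for the picture formats, sampling structures, bit depths and frame rates listed above, the uncompressed data rate can be estimated as width × height × samples per pixel × bits per sample × frame rate. The following is a minimal sketch only. The pixel dimensions assigned to each format name (for example 3840x2160 for 4Kx2K) and the function name `raw_bitrate_bps` are assumptions made for illustration; the requirements text names the formats without fixing their dimensions.

```python
# Average number of samples per pixel for common chroma sampling structures
# (one luma sample plus the averaged share of the two chroma samples).
SAMPLES_PER_PIXEL = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}

# ASSUMED pixel dimensions, chosen for illustration only.
ASSUMED_FORMATS = {
    "QVGA": (320, 240),
    "VGA": (640, 480),
    "720p": (1280, 720),
    "4Kx2K": (3840, 2160),
    "8Kx4K": (7680, 4320),
}


def raw_bitrate_bps(width, height, fps, chroma="4:2:0", bit_depth=8):
    """Return the uncompressed video data rate in bits per second."""
    return width * height * SAMPLES_PER_PIXEL[chroma] * bit_depth * fps


if __name__ == "__main__":
    # Print the uncompressed rate of each assumed format at 60 fps.
    for name, (w, h) in ASSUMED_FORMATS.items():
        mbps = raw_bitrate_bps(w, h, fps=60) / 1e6
        print(f"{name:>6}: {mbps:10.1f} Mbit/s (4:2:0, 8 bit, 60 fps)")
```

Under these assumptions, moving from 4:2:0 at 8 bits to 4:4:4 at 14 bits multiplies the raw rate by 3.5 at the same resolution and frame rate, which illustrates why the higher-fidelity formats increase the pressure on compression performance.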

Scanning Methods

Support for progressive scanning shall be required for all Profiles and Levels.

Complexity

HVC complexity shall allow for feasible implementation within the constraints of the available technology at the expected time of usage.

HVC should be capable of trading off complexity and compression efficiency, by having
  • an operating point with a significant decrease in complexity compared to AVC but with better compression efficiency than AVC,
  • an operating point with increased complexity and a commensurate increase in compression performance.

Parallel processing should be possible.

Note: Complexity includes power consumption, computational power, memory bandwidth, etc.

Low Delay

HVC should be capable of operating in a low delay mode.

Note: HVC should be capable of trading off complexity, compression efficiency and delay.

Random Access and “Trick Modes”

The standard shall support random access to certain positions of a stored video stream, and allow fast channel switching in the case of multi-channel services.

Pause, fast forward, normal speed reverse, and fast reverse access to a stored video bitstream should be supported.

Intra-only coding of video frames should be supported.
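
The link between random access and intra coding can be sketched as follows: a decoder asked to start at a given frame has to begin at the nearest preceding frame that can be decoded without reference to earlier frames. This is an illustration only, not part of the requirements. The function names and the list of random access points are hypothetical, and the sketch assumes that every frame between the access point and the target must be decoded.

```python
from bisect import bisect_right


def seek_start(random_access_points, target_frame):
    """Return the latest random access point at or before target_frame.

    random_access_points: sorted list of frame indices that can be decoded
    without reference to earlier frames (for example, intra-coded frames).
    """
    i = bisect_right(random_access_points, target_frame)
    if i == 0:
        raise ValueError("no random access point at or before target frame")
    return random_access_points[i - 1]


def frames_to_decode(random_access_points, target_frame):
    """Count the frames decoded before target_frame can be displayed.

    ASSUMPTION: every frame from the access point up to and including the
    target has to be decoded (no frames can be skipped).
    """
    return target_frame - seek_start(random_access_points, target_frame) + 1
```

With an access point every 30 frames, reaching frame 45 requires decoding 16 frames under this assumption. With intra-only coding every frame is an access point, so the count is always 1, at the cost of coding efficiency.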

Error Resilience

Video bitstream segmentation and packetization methods for the target networks shall be developed. The video layer should be designed in such a way that relevant error resilience measures can be applied effectively at the network layer for networks needing error recovery. A proper balance should be achieved between the increase in complexity, the loss in coding efficiency and the benefits of the error resilience measures at the coding layer for the various networks.

Buffer Models

Buffer models, including hypothetical reference decoders (HRDs), shall be specified for target applications.
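
The general idea behind such a buffer model can be illustrated with a strongly simplified leaky-bucket simulation. This is a sketch under stated assumptions, not the HRD of AVC or of any other standard: bits are assumed to arrive at a constant rate for as long as frames remain, the decoder is assumed to remove each coded frame instantaneously at regular intervals, and the function name and return values are hypothetical.

```python
def check_leaky_bucket(frame_bits, bitrate, buffer_size, fps, initial_delay):
    """Simulate a simplified constant-bitrate decoder buffer.

    frame_bits:    coded size of each frame in bits, in decoding order
    bitrate:       constant arrival rate in bits per second
    buffer_size:   decoder buffer capacity in bits
    fps:           frames removed from the buffer per second
    initial_delay: seconds of buffering before the first frame is removed

    Returns "ok", "underflow" or "overflow".
    """
    # Buffer fullness just before the first frame is removed.
    fullness = bitrate * initial_delay
    for bits in frame_bits:
        if fullness > buffer_size:
            return "overflow"   # more data arrived than the buffer can hold
        if bits > fullness:
            return "underflow"  # frame has not fully arrived at removal time
        # Remove the frame, then let one frame interval of data arrive.
        fullness = fullness - bits + bitrate / fps
    return "ok"
```

In this simplified model, a stream whose frames are persistently smaller than `bitrate / fps` eventually overflows the buffer, and a single frame larger than the buffered data causes underflow. A normative buffer model would define such conditions precisely for each target application.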

Interface to System Layers

The HVC codec shall be designed to permit efficient adaptation and integration with the target system and delivery layers.

Scalable Video Coding

The design of the initial phase shall be such that it will be possible to add scalable coding tools on top of the single-layer system under consideration initially.

Definitions

This section defines terms used within the context of this document.

Conformance Point: A conformance point is a specification of a particular Profile, at a certain Level, at which conformance can be tested. Conformance Points establish normative parts of the HVC codec standard. An HVC encoder operating at a given Profile and Level shall not generate an output which exceeds the decoding capabilities of a decoder compliant with the same Profile and Level. An HVC decoder operating at a given Profile and Level shall correctly decode the output of any encoder that is compliant with the same Profile and Level. For each Profile there shall be at least one Level.

Profile: A Profile is a set of algorithmic tools, representing a particular tradeoff of performance and resource consumption, supporting the requirements of a particular set of applications. Applications which require similar tradeoffs between these parameters should use the same Profile.

Level: A Level specifies performance parameters within each Profile that set lower limits on decoder capability. Performance parameters typically include maximum picture size, macroblocks per second, bitrate, frame rate, buffer sizes and similar parameters.
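
The relationship between a stream and a Level can be sketched as a simple limit check. Everything in this example is hypothetical: the class and function names, the choice of parameters (samples per second is used in place of macroblocks per second) and the limit values are illustrative only and are not taken from HVC, AVC or any other standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    max_picture_samples: int     # maximum luma samples per picture
    max_samples_per_second: int  # maximum luma sample throughput
    max_bitrate: int             # bits per second
    max_buffer_bits: int         # decoder buffer size in bits


@dataclass(frozen=True)
class StreamParams:
    width: int
    height: int
    fps: float
    bitrate: int
    buffer_bits: int


def conforms(stream, level):
    """Return True if the stream stays within every limit of the level."""
    samples = stream.width * stream.height
    return (
        samples <= level.max_picture_samples
        and samples * stream.fps <= level.max_samples_per_second
        and stream.bitrate <= level.max_bitrate
        and stream.buffer_bits <= level.max_buffer_bits
    )


def lowest_conforming_level(stream, levels):
    """Return the first level (ordered least to most capable) that fits."""
    for level in levels:
        if conforms(stream, level):
            return level
    return None


# HYPOTHETICAL levels with made-up limits, for illustration only.
EXAMPLE_LEVELS = [
    Level("small", 1280 * 720, 1280 * 720 * 30, 10_000_000, 10_000_000),
    Level("large", 3840 * 2160, 3840 * 2160 * 60, 100_000_000, 100_000_000),
]
```

A decoder that claims a given Level must handle any stream for which `conforms` returns True at that Level, which is why the Level parameters act as lower limits on decoder capability.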

References

[1] “The MPEG vision”, ISO/IEC JTC1/SC29/WG11/N10412, Lausanne, Switzerland, February 2009.

