
Friday, April 25, 2014

MPEG news: a report from the 108th meeting, Valencia, Spain

This blog post is also available at bitmovin tech blog and SIGMM records.

The 108th MPEG meeting was held at the Palacio de Congresos de Valencia in Spain featuring the following highlights (don't worry about the acronyms; they are used deliberately here and explained further below):
  • Requirements: PSAF, SCC, CDVA
  • Systems: M2TS, MPAF, Green Metadata
  • Video: CDVS, WVC, VCB
  • JCT-VC: SHVC, SCC
  • JCT-3D: MV/3D-HEVC, 3D-AVC
  • Audio: 3D audio 
Opening Plenary of the 108th MPEG meeting in Valencia, Spain.
The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

As indicated above, MPEG is full of (new) acronyms and, to help you become familiar with them, I've deliberately used them in the overview above; each is explained further below.

PSAF – Publish/Subscribe Application Format

Publish/subscribe corresponds to a new network paradigm related to content-centric networking (or information-centric networking) where the content is addressed by its name rather than location. An application format within MPEG typically defines a combination of existing MPEG tools jointly addressing the needs for a given application domain, in this case, the publish/subscribe paradigm. The current requirements and a preliminary working draft are publicly available.
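To make the paradigm a bit more concrete, here is a minimal, purely illustrative publish/subscribe sketch in Python in which consumers register interest in content names rather than in server locations. The class and method names are hypothetical and are not related to the actual PSAF working draft.

```python
# Minimal, purely illustrative publish/subscribe sketch: consumers subscribe to
# content *names*, not to server locations. Class and method names are
# hypothetical and unrelated to the actual PSAF working draft.
from collections import defaultdict
from typing import Callable


class NamedContentBroker:
    """Toy broker that routes published content to subscribers by name prefix."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str, bytes], None]]] = defaultdict(list)

    def subscribe(self, name_prefix: str, callback: Callable[[str, bytes], None]) -> None:
        # Interest is expressed on the content name, independent of where it is hosted.
        self._subscribers[name_prefix].append(callback)

    def publish(self, content_name: str, payload: bytes) -> None:
        # Deliver to every subscriber whose registered prefix matches the name.
        for prefix, callbacks in self._subscribers.items():
            if content_name.startswith(prefix):
                for cb in callbacks:
                    cb(content_name, payload)


if __name__ == "__main__":
    broker = NamedContentBroker()
    broker.subscribe("/movies/bbb/", lambda n, p: print(f"received {n}: {len(p)} bytes"))
    broker.publish("/movies/bbb/segment-001.m4s", b"\x00" * 1024)
```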

SCC – Screen Content Coding

I introduced this topic in my previous report and at this meeting the responses to the CfP were evaluated. In total, seven responses were received that meet all requirements and, thus, the actual standardization work is transferred to JCT-VC. Interestingly, the results of the CfP are publicly available. Within JCT-VC, a first test model has been defined and core experiments have been established. I will report more on this as an output of the next meetings…

CDVA – Compact Descriptors for Video Analysis

This project has been renamed from compact descriptors for video search to compact descriptors for video analysis and comes with a publicly available vision statement. Interested parties are welcome to join this new activity within MPEG.

M2TS – MPEG-2 Transport Stream

At this meeting, various extensions to M2TS have been defined, such as the transport of multi-view video coding depth information and extensions to HEVC, the delivery of timelines for external data, as well as the carriage of layered HEVC, green metadata, and 3D audio. Hence, M2TS is still very active and multiple amendments are being developed in parallel.

MPAF – Multimedia Preservation Application Format

The committee draft for MPAF has been approved and, in this context, MPEG-7 is extended with additional description schemes.

Green Metadata

Well, this standard does not have its own acronym; it's simply referred to as MPEG-GREEN. The draft international standard has been approved and national bodies will vote on it at the JTC 1 level. It basically defines metadata that allows clients to operate in an energy-efficient way. It comes along with amendments to M2TS and ISOBMFF that enable the carriage and storage of this metadata.
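To illustrate the basic idea only (not the actual ISO/IEC 23001-11 syntax), a client could use decoder-complexity hints carried alongside the media to pick an energy-saving operating point. The field names and thresholds in the sketch below are invented for illustration.

```python
# Hypothetical sketch of how a client might act on green metadata; the field
# names below are invented for illustration and do not reflect the actual
# ISO/IEC 23001-11 syntax.
from dataclasses import dataclass


@dataclass
class GreenHint:
    segment_id: int
    relative_decode_complexity: float  # e.g., 1.0 = nominal decoder load


def pick_power_mode(hint: GreenHint, battery_level: float) -> str:
    """Choose a (hypothetical) decoder power mode from metadata plus device state."""
    if battery_level < 0.2 and hint.relative_decode_complexity > 1.2:
        return "reduced"   # e.g., skip optional post-processing or drop non-reference frames
    return "full"


if __name__ == "__main__":
    print(pick_power_mode(GreenHint(segment_id=42, relative_decode_complexity=1.5), battery_level=0.15))
```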

CDVS – Compact Descriptors for Visual Search

CDVS is at DIS stage and provides improvements on global descriptors as well as non-normative improvements of key-point detection and matching in terms of speedup and memory consumption. As with all standards at DIS stage, national bodies will vote on it at the JTC 1 level.

What’s new in the video/audio-coding domain?
  • WVC – Web Video Coding: This project reached final draft international standard with the goal of providing a video-coding standard for Web applications. It basically defines a profile of the MPEG-AVC standard including those tools not encumbered by patents.
  • VCB – Video Coding for Browsers: The committee draft for part 31 of MPEG-4 defines video coding for browsers and basically defines VP8 as an international standard. This also explains the difference to WVC.
  • SHVC – Scalable HEVC extensions: As with SVC, SHVC will be defined as an amendment to HEVC and provides the same functionality as SVC, namely scalable video coding.
  • MV/3D-HEVC, 3D-AVC: These are multi-view and 3D extensions for the HEVC and AVC standards, respectively.
  • 3D Audio: Again, no acronym for this standard, although I would prefer 3DA. The CD has been approved at this meeting and the plan is to have the DIS at the next meeting. At the same time, the carriage and storage of 3DA are being defined in M2TS and ISOBMFF, respectively.
Finally, what’s new in the media transport area, specifically DASH and MMT?

As interested readers know from my previous reports, DASH 2nd edition was approved some time ago. In the meantime, a first amendment to the 2nd edition is at draft amendment stage, including additional profiles (mainly adding xlink support) and time synchronization. A second amendment goes to the first ballot stage, referred to as proposed draft amendment, and defines the spatial relationship description, generalized URL parameters, and other extensions. Eventually, these two amendments will be integrated into the 2nd edition, which will then become the MPEG-DASH 3rd edition. Also, a corrigendum on the 2nd edition is currently under ballot and new contributions are still coming in, i.e., there is still a lot of interest in DASH. For your information – there will be two DASH-related sessions at Streaming Forum 2014.
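As a quick reminder of what xlink support means at the MPD level, here is a small Python sketch that builds a minimal MPD skeleton whose Period is resolved remotely via xlink:href. The URL and durations are placeholders, and this is only an illustration of the mechanism, not a complete or validated MPD.

```python
# Minimal sketch of an MPD skeleton with a remote (xlink-referenced) Period,
# built with the standard library; the URL and durations are placeholders.
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", DASH_NS)
ET.register_namespace("xlink", XLINK_NS)

mpd = ET.Element(f"{{{DASH_NS}}}MPD", {
    "type": "static",
    "profiles": "urn:mpeg:dash:profile:isoff-on-demand:2011",
    "minBufferTime": "PT2S",
    "mediaPresentationDuration": "PT0H9M56S",
})

# The Period content itself lives on a remote server and is pulled in via xlink.
ET.SubElement(mpd, f"{{{DASH_NS}}}Period", {
    f"{{{XLINK_NS}}}href": "https://example.com/remote-period.xml",  # placeholder URL
    f"{{{XLINK_NS}}}actuate": "onLoad",
})

print(ET.tostring(mpd, encoding="unicode"))
```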

On the other hand, MMT's amendment 1 is currently under ballot and amendment 2 defines header compression and a cross-layer interface. The latter has been progressed to a study document which will be further discussed at the next meeting. Interestingly, there will be an MMT developers' day at the 109th MPEG meeting, as 4K/8K UHDTV services based on MMT specifications will be launched in Japan and implementations of MMT are under way in Korea and China. The developers' day will be on July 5th (Saturday), 2014, 10:00 – 17:00 at the Sapporo Convention Center. Therefore, if you don't know anything about MMT, the developers' day is certainly the place to be.

Contact:

Dr. Christian Timmerer
CIO bitmovin GmbH | christian.timmerer@bitmovin.net
Alpen-Adria-Universität Klagenfurt | christian.timmerer@aau.at

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):
  • Text of ISO/IEC 13818-1:2013 PDAM 7 Carriage of Layered HEVC (14/05/02) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of Green Metadata (14/04/04) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of 3D Audio (14/04/04) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of additional audio profiles & levels (14/04/04) 
  • Text of ISO/IEC 14496-12:2012 PDAM 4 Enhanced audio support (14/04/04) 
  • TuC on sample variants, signatures and other improvements for the ISOBMFF (14/04/04) 
  • Text of ISO/IEC CD 14496-22 3rd edition (14/04/04) 
  • Text of ISO/IEC CD 14496-31 Video Coding for Browsers (14/04/11) 
  • Text of ISO/IEC 15938-5:2005 PDAM 5 Multiple text encodings, extended classification metadata (14/04/04) 
  • WD 2 of ISO/IEC 15938-6:201X (2nd edition) (14/05/09) 
  • Text of ISO/IEC DIS 15938-13 Compact Descriptors for Visual Search (14/04/18) 
  • Test Model 10: Compact Descriptors for Visual Search (14/05/02) 
  • WD of ARAF 2nd Edition (14/04/18) 
  • Use cases for ARAF 2nd Edition (14/04/18) 
  • WD 5.0 MAR Reference Model (14/04/18) 
  • Logistic information for the 5th JAhG MAR meeting (14/04/04) 
  • Text of ISO/IEC CD 23000-15 Multimedia Preservation Application Format (14/04/18) 
  • WD of Implementation Guideline of MP-AF (14/04/04) 
  • Requirements for Publish/Subscribe Application Format (PSAF) (14/04/04) 
  • Preliminary WD of Publish/Subscribe Application Format (14/04/04) 
  • WD2 of ISO/IEC 23001-4:201X/Amd.1 Parser Instantiation from BSD (14/04/11) 
  • Text of ISO/IEC 23001-8:2013/DCOR1 (14/04/18) 
  • Text of ISO/IEC DIS 23001-11 Green Metadata (14/04/25) 
  • Study Text of ISO/IEC 23002-4:201x/DAM2 FU and FN descriptions for HEVC (14/04/04) 
  • Text of ISO/IEC 23003-4 CD, Dynamic Range Control (14/04/11) 
  • MMT Developers’ Day in 109th MPEG meeting (14/04/04) 
  • Results of CfP on Screen Content Coding Tools for HEVC (14/04/30) 
  • Study Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/06/06) 
  • HEVC RExt Test Model 7 (14/06/06) 
  • Scalable HEVC (SHVC) Test Model 6 (SHM 6) (14/06/06) 
  • Report on HEVC compression performance verification testing (14/04/25) 
  • HEVC Screen Content Coding Test Model 1 (SCM 1) (14/04/25) 
  • Study Text of ISO/IEC 23008-2:2013/PDAM4 3D Video Extensions (14/05/15) 
  • Test Model 8 of 3D-HEVC and MV-HEVC (14/05/15) 
  • Text of ISO/IEC 23008-3/CD, 3D audio (14/04/11) 
  • Listening Test Logistics for 3D Audio Phase 2 (14/04/04) 
  • Active Downmix Control (14/04/04) 
  • Text of ISO/IEC PDTR 23008-13 Implementation Guidelines for MPEG Media Transport (14/05/02) 
  • Text of ISO/IEC 23009-1 2nd edition DAM 1 Extended Profiles and availability time synchronization (14/04/18) 
  • Text of ISO/IEC 23009-1 2nd edition PDAM 2 Spatial Relationship Description, Generalized URL parameters and other extensions (14/04/18) 
  • Text of ISO/IEC PDTR 23009-3 2nd edition DASH Implementation Guidelines (14/04/18) 
  • MPEG vision for Compact Descriptors for Video Analysis (CDVA) (14/04/04) 
  • Plan of FTV Seminar at 109th MPEG Meeting (14/04/04) 
  • Draft Requirements and Explorations for HDR /WCG Content Distribution and Storage (14/04/04) 
  • Working Draft 2 of Internet Video Coding (IVC) (14/04/18) 
  • Internet Video Coding Test Model (ITM) v 9.0 (14/04/18) 
  • Uniform Timeline Alignment (14/04/18) 
  • Plan of Seminar on Hybrid Delivery at the 110th MPEG Meeting (14/04/04) 
  • WD 2 of MPEG User Description (14/04/04)

Thursday, August 2, 2012

MPEG news: a report from the 101st meeting, Stockholm, Sweden

The 101st MPEG meeting was held in Stockholm, Sweden, July 16-20, 2012. The official press release can be found here and I would like to highlight the following topics:
  • MPEG Media Transport (MMT) reaches Committee Draft (CD)
  • High-Efficiency Video Coding (HEVC) reaches Draft International Standard (DIS)
  • MPEG and ITU-T establish JCT-3V
  • Call for Proposals: HEVC scalability extensions
  • 3D audio workshop
  • Green MPEG
MMT goes CD

The Committee Draft (CD) of MPEG-H part 1 referred to as MPEG Media Transport (MMT) has been approved and will be publicly available after an editing period which will end Sep 17th. MMT comprises the following features:
  • Delivery of coded media by concurrently using more than one delivery medium (e.g., as it is the case of heterogeneous networks).
  • Logical packaging structure and composition information to support multimedia mash-ups (e.g., multiscreen presentation).
  • Seamless and easy conversion between storage and delivery formats.
  • Cross layer interface to facilitate communication between the application layers and underlying delivery layers.
  • Signaling of messages to manage the presentation and optimized delivery of media.
This list of 'features' may sound very high-level but, as the CD usually comprises stable technology and is publicly available, the research community is more than welcome to evaluate MPEG's new way of media transport. Having said this, I would like to refer to the Call for Papers of JSAC's special issue on adaptive media streaming, which mainly focuses on DASH but for which investigating its relationship to MMT is definitely within scope.
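For readers who want a mental model of the "logical packaging structure", here is a deliberately simplified sketch: a package groups assets, and each asset is a sequence of media processing units (MPUs) that can be delivered over different networks. The classes and fields are illustrative only and do not mirror the normative CD syntax.

```python
# Deliberately simplified mental model of MMT-style packaging; the classes and
# fields are illustrative only and do not mirror the normative CD syntax.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MPU:
    """A self-contained media processing unit (e.g., a fragment of encoded video)."""
    sequence_number: int
    payload: bytes


@dataclass
class Asset:
    """A logical media component (video, audio, ...) made up of MPUs."""
    asset_id: str
    mpus: List[MPU] = field(default_factory=list)


@dataclass
class Package:
    """Groups assets plus composition information describing how to present them."""
    package_id: str
    assets: List[Asset] = field(default_factory=list)
    composition_info: Dict[str, str] = field(default_factory=dict)


if __name__ == "__main__":
    video = Asset("video-main", [MPU(0, b"..."), MPU(1, b"...")])
    audio = Asset("audio-en", [MPU(0, b"...")])
    pkg = Package("demo", [video, audio], {"layout": "single-screen"})
    print(pkg.package_id, [a.asset_id for a in pkg.assets])
```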

HEVC's next step towards completion: DIS

The approval of the Draft International Standard (DIS) brought the HEVC standard one step closer to completion. As reported previously, HEVC shows superior performance gains compared to its predecessor, and real-time software decoding on the iPad 3 (720p, 30Hz, 1.5 Mbps) was demonstrated during the Friday plenary [1, 2]. It is expected that the Final Draft International Standard (FDIS) is going to be approved at the 103rd MPEG meeting, January 21-25, 2013. If the market need for HEVC is only similar to what it was when AVC was finally approved, I wonder whether one can expect first products by mid/end 2013. From a research point of view we know - and history is our witness - that improvements are still possible even after a standard has been approved. For example, the AVC standard is now available in its 7th edition as a consolidation of various amendments and corrigenda.

JCT-3V

After the Joint Video Team (JVT), which successfully developed standards such as AVC, SVC, and MVC, and the Joint Collaborative Team on Video Coding (JCT-VC), MPEG and ITU-T have established the Joint Collaborative Team on 3D Video coding extension development (JCT-3V). That is, from now on MPEG and ITU-T also join forces in developing 3D video coding extensions for existing codecs as well as for the ones under development (i.e., AVC, HEVC). The current standardization plan includes the development of AVC multi-view extensions with depth to be completed this year and I assume HEVC will be extended with 3D capabilities once the 2D version is available.

In this context it is interesting that a call for proposals for MPEG Frame Compatible (MFC) has been issued to address current deployment issues of stereoscopic videos. The requirements are available here.

Call for Proposals: SVC for HEVC

In order to address the need for higher resolutions - Ultra HDTV - and subsets thereof, JCT-VC issued a call for proposals for HEVC scalability extensions. Similar to AVC/SVC, the requirements include that the base layer should be compatible with HEVC and enhancement layers may include temporal, spatial, and fidelity scalability. The actual call, the use cases, and the requirements shall become available on the MPEG Web site.

MPEG hosts 3D Audio Workshop

Part 3 of MPEG-H will be dedicated to audio, specifically 3D audio. The call for proposals will be issued at the 102nd MPEG meeting in October 2012 and submissions will be due at the 104th meeting in April 2013. At this meeting, MPEG hosted a 2nd workshop on 3D audio with the following speakers:
  • Frank Melchior, BBC R&D: “3D Audio? - Be inspired by the Audience!”
  • Kaoru Watanabe, NHK and ITU: “Advanced multichannel audio activity and requirements”
  • Bert Van Daele, Auro Technologies: “3D audio content production, post production and distribution and release”
  • Michael Kelly, DTS: “3D audio, objects and interactivity in games”
The report of this workshop including the presentations will be publicly available by the end of August at the MPEG Web site.

What's new: Green MPEG

Finally, MPEG is starting to explore a new area, currently referred to as Green MPEG, addressing technologies that enable energy-efficient use of MPEG standards. To this end, an Ad-hoc Group (AhG) was established with the following mandates:

  1. Study the requirements and use-cases for energy efficient use of MPEG technology.
  2. Solicit further evidence for the energy savings.
  3. Develop reference software for Green MPEG experimentation and upload any such software to the SVN.
  4. Survey possible solutions for energy-efficient video processing and presentation.
  5. Explore the relationship between metadata types and coding technologies.
  6. Identify new metadata that will enable additional power savings.
  7. Study system-wide interactions and implications of energy-efficient processing on mobile devices.
AhGs are usually open to the public and all discussions take place via email. To subscribe please feel free to join the email reflector.

Monday, January 16, 2012

Top 10 Blog Posts


  1. HTTP Streaming of MPEG Media: My first article in this series, which I started after the MPEG CfP was issued that led to the standardization of DASH.
  2. MMSys'11 Special Session on MMT/DASH: the CfP for a special session I've organized.
  3. MPEG news: a report from the 93rd meeting in Geneva, right after the responses to the HTTP streaming CfP had been evaluated.
  4. MPEG advances DASH towards completion which is the MPEG press release after the 94th meeting in Guangzhou.
  5. Open Source Scalable Video Coding (SVC) Software where I have received quite a few comments ;-)
  6. MPEG Media Transport: Basically the same as #1 but with a different scope. However, it seems the readers are more interested in HTTP streaming than media transport in general.
  7. Vision and Requirements for High-Performance Video Coding, which has since been renamed to High-Efficiency Video Coding.
  8. DASH provides an overview of the Draft International Standard which is publicly available.
  9. MPEG DASH vs. W3C WebTV which is still a hot topic and worth following on both sides...
  10. Immersive Future Media Technologies: From 3D Video to Sensory Experience: I'm happy to have this one in my top ten. It's the summary of a tutorial I gave at ACM Multimedia 2010 together with Karsten Müller.
In general, most of the readers are very much interested in HTTP streaming / DASH / MMT followed by video coding (SVC/HEVC/3DVC) and the Sensory Experience stuff I've started some time ago.

Thanks again for visiting my blog and don't hesitate to leave a comment here and there. I'd love to read your thoughts and feedback.

Thursday, December 8, 2011

MPEG news: a report from the 98th meeting, Geneva, Switzerland

#MPEG98 report: DASH=IS ✔ CDVS=CfP eval ✔ {MMT, HEVC, 3DAudio}=MPEG-H ✔ IVC={IVC, WebVC} ✔ 3DVC=CfP eval ✔

... MPEG news from its 98th meeting in Geneva, Switzerland with less than 140 characters and a lot of acronyms. The official press release is, as usual, here. As you can see from the press release, MPEG produced significant results, namely:
  • MPEG Dynamic Adaptive Streaming over HTTP (DASH) ratified
  • 3D Video Coding: Evaluation of responses to Call for Proposals
  • MPEG royalty free video coding: Internet Video Coding (IVC) + Web Video Coding (WebVC)
  • High Efficiency Coding and Media Delivery in Heterogeneous Environments: MPEG-H comprising MMT, HEVC, 3DAC
  • Compact Descriptors for Visual Search (CDVS): Evaluation of responses to the Call for Proposals
  • Call for requirements: Multimedia Preservation Description Information (MPDI)
  • MPEG Augmented Reality (AR)
As you can see, a long list of achievements within a single meeting, so let's dig in. For each topic I've also tried to provide some research issues which I think are worth investigating both inside and outside MPEG.

MPEG Dynamic Adaptive Streaming over HTTP (DASH): DASH=IS ✔

As the official press release states, MPEG has ratified its draft standard for DASH and, even better, the standard should become publicly available, which I expect to happen early next year, approx. March 2012, or maybe earlier. I say "should" because there is no guarantee that this will actually happen, but the signs are good. In the meantime, feel free to use our software to play around; we expect to update it to the latest version of the standard as soon as possible. Finally, IEEE Computer Society Computing Now has put together a theme on Video for the Universal Web featuring DASH.

Research issues: performance, bandwidth estimation, request scheduling (aka adaptation logic), and Quality of Service/Experience.
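To give a flavour of what a request-scheduling/adaptation logic has to decide, here is a minimal throughput-based rate-selection sketch. The DASH standard deliberately leaves this logic to the client, so the smoothing factor and safety margin below are arbitrary illustrative choices, not values from the specification.

```python
# Minimal throughput-based adaptation sketch; DASH intentionally does not
# specify this logic, so all constants here are arbitrary examples.
def select_representation(bitrates_bps, throughput_history_bps,
                          smoothing=0.8, safety_margin=0.9):
    """Pick the highest bitrate below an EWMA throughput estimate times a margin."""
    estimate = throughput_history_bps[0]
    for sample in throughput_history_bps[1:]:
        estimate = smoothing * estimate + (1.0 - smoothing) * sample
    budget = estimate * safety_margin
    eligible = [b for b in sorted(bitrates_bps) if b <= budget]
    return eligible[-1] if eligible else min(bitrates_bps)


if __name__ == "__main__":
    ladder = [250_000, 500_000, 1_000_000, 2_500_000, 5_000_000]
    measured = [3_200_000, 2_900_000, 3_400_000, 2_700_000]  # bits per second
    print(select_representation(ladder, measured))
```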

3D Video Coding: 3DVC=CfP eval ✔

MPEG evaluated more than 20 proposals submitted as a response to the call issued back in April 2011. The evaluation of the proposals comprised subjective quality assessments conducted by 13 highly qualified test laboratories distributed around the world and coordinated by the COST Action IC1003 QUALINET. The report of the subjective test results from the call for proposals on 3D video coding will be available by the end of this week. MPEG documented the standardization tracks considered in 3DVC (i.e., compatible with MVC, AVC base-view, HEVC, ...) and agreed on a common software based on the best-performing proposals.

Research issues: encoding efficiency of 3D depth maps and compatibility for the various target formats (AVC, MVC, HEVC) as well as depth map estimation at the client side.

MPEG royalty free video coding: IVC vs. WebVC

In addition to the evaluation of the responses to the call for 3DVC, MPEG also evaluated the responses to the Internet Video Coding call. Based on the responses, MPEG decided to follow up with two approaches, namely Internet Video Coding (IVC) and Web Video Coding (WebVC). The former - IVC - is based on MPEG-1 technology, which is assumed to be royalty-free. However, it requires some performance boosts in order to make it ready for the Internet. MPEG's approach is a common platform called the Internet video coding Test Model (ITM) which serves as the basis for further improvements. The latter - WebVC - is based on the AVC constrained baseline profile, whose performance is well known and satisfactory but, unfortunately, it is not clear which patents of the AVC patent pool apply to this profile. Hence, a working draft (WD) of WebVC will be provided (also publicly available) in order to get patent statements from companies. The WD will be publicly available by December 19th.

Research issues: coding efficiency using only royalty-free coding tools, whereby the optimization is first towards (avoiding) royalties and then towards efficiency.

MPEG-H

A new star is born and it is called MPEG-H, referred to as "High Efficiency Coding and Media Delivery in Heterogeneous Environments", comprising three parts: Pt. 1 MMT, Pt. 2 HEVC, Pt. 3 3D Audio. There's a document called context and objectives of MPEG-H but I can't find out whether it's public (I'll come back to this later).

Part 1: MMT (MPEG Media Transport) is progressing (slowly) but a next step should be definitely to check the relationship of MMT and DASH for which an Ad-hoc Group has been established (N12395), subscribe here, if interested.
Research issues: very general at the moment, what is the best delivery method (incl. formats) for future multimedia applications? Answer: It depends, ... ;-)

Part 2: HEVC (High-Efficiency Video Coding) made significant progress at the last meeting, in particular: only one entropy coder (note: AVC has two, CABAC and CAVLC, which are supported in different profiles), 8-bit decoding (could also be 10 bit, probably in some profiles), a specific integer transform, and a stabilized and more complete high-level syntax and HRD description (i.e., reference picture buffering, tiles, slices, and wavefronts enabling a parallel decoding process). Finally, a prototype has been demonstrated decoding HEVC in software on an iPad 2 at WVGA resolution as well as the 10-min Big Buck Bunny sequence at SD resolution with avg. 800 kbit/s, which clearly outperformed the corresponding AVC versions.
Research issues: well, coding efficiency, what else? The ultimate goal is a performance gain of more than 50% compared to the predecessor, which is AVC.

Part 3: 3D Audio Coding (3DAC) is in its early stages but there will be an event during the San Jose meeting which will be announced here. As of now, use cases are provided (home theatre, personal TV, smartphone TV, multichannel TV) as well as candidate requirements and evaluation methods. One important aspect seems to be the user experience for highly immersive audio (i.e., 22.2, 10.2, 5.1), including bitstream adaptation for low-bandwidth and low-complexity scenarios.
Research issues: sorry, I'm not really an audio guy but I assume it's coding efficiency, specifically for 22.2 channels ;-)

Compact Descriptors for Visual Search (CDVS)

For CDVS, responses to the call for proposals (from 10 companies/institutions) have been evaluated and a test model has been established based on the best-performing proposals. The next steps include the improvement of the test model towards inclusion in the MPEG-7 standard.
Research issues: descriptor efficiency for the intended application as well as precision of the information retrieval results.

Multimedia Preservation Description Information (MPDI)

The aim of this new work item is to provide "standard technology helping users to preserve digital multimedia that is used in many different domains, including cultural heritage, scientific research, engineering, education and training, entertainment, and fine arts for long-term across system, organizational, administrative and generational boundaries". It comes along with two public documents, the current requirements and a call for requirements, responses to which are due at the 100th MPEG meeting in April 2012.
Research issues: What and how to preserve digital multimedia information?

Augmented Reality (AR)

MPEG's newest project is on Augmented Reality (AR), starting with an application format for which a working draft exists. Furthermore, draft requirements and use cases are available. These three documents will be available on Dec 31st.
Research issues: N/A

Finally, I hope now you can better understand what I've put at the beginning with all these acronyms ...

#MPEG98 report: DASH=IS ✔ CDVS=CfP eval ✔ {MMT, HEVC, 3DAudio}=MPEG-H ✔ IVC={IVC, WebVC} ✔ 3DVC=CfP eval ✔

Wednesday, April 20, 2011

MPEG issues Call for Proposals on 3D Video Coding Technology

The aim of this Call for Proposals (CfP) is to provide efficient compression and high quality view reconstruction of an arbitrary number of dense views. This CfP has been issued by ISO/IEC JTC1/SC29/WG11 (MPEG) and the evaluation of submissions will be carried out at the 98th MPEG meeting after formal subjective evaluation.

The CfP is publicly available on the MPEG Web site and also here.

The CfP provides all necessary information regarding purpose and procedure, timeline, test material, coding classes, anchors, test conditions and parameters, submission requirements, subjective viewing requirements, test sites and delivery of test material, testing fee, source code and IPR, and contact information.

In the meantime, all details can be discussed within the so-called Ad-hoc Group (AhG) on 3D Video Coding (3DVC) which is available at mpeg-ftv@lists.rwth-aachen.de (to subscribe or unsubscribe, go to http://mailman.rwth-aachen.de/mailman/listinfo/mpeg-ftv).

Saturday, April 16, 2011

MPEG issues call for proposals for visual search

Highlights of the 96th Meeting

MPEG finalizes CfP to standardize mobile visual search technologies

In its latest step toward creating a standard for efficient and interoperable designs of visual search applications, MPEG has issued a Call for Proposals at its 96th meeting. Like a barcode reader, but using regular images instead of barcodes, visual search enables the retrieval of related information from databases for tourists, simplified shopping, mobile augmented reality, and other applications by sending standardized descriptors.

Specifically, the call seeks technologies that deliver robust matching of images of objects, such as landmarks, artworks, and text-based documents, that may be partially occluded or captured from various vantage points, and with different camera parameters, or lighting conditions. The underlying component technologies that are expected to be addressed by the standard include the format of the visual descriptors, and parts of the descriptor extraction process needed to ensure interoperability. Other component technologies, such as indexing and matching algorithms, may also be incorporated into the new standard.
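To make "robust matching" a bit more tangible, here is a tiny nearest-neighbour matching sketch with a ratio test over generic descriptor vectors. It is illustrative only and says nothing about the descriptor format the standard will eventually define; the descriptor dimensionality and threshold are arbitrary.

```python
# Tiny illustration of descriptor matching with a ratio test; descriptor layout
# and thresholds are generic examples, not anything defined by the CfP.
import numpy as np


def match_descriptors(query: np.ndarray, reference: np.ndarray, ratio: float = 0.8):
    """Return (query_idx, reference_idx) pairs passing a nearest-neighbour ratio test."""
    matches = []
    for qi, q in enumerate(query):
        dists = np.linalg.norm(reference - q, axis=1)
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if best < ratio * second:          # keep only unambiguous nearest neighbours
            matches.append((qi, int(order[0])))
    return matches


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(100, 32)).astype(np.float32)   # reference descriptors
    qry = ref[:10] + rng.normal(scale=0.05, size=(10, 32)).astype(np.float32)  # noisy queries
    print(len(match_descriptors(qry, ref)), "matches found")
```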

The text of the Call for Proposals is available here. Responses are due shortly before the 98th MPEG meeting in Geneva, where they will be evaluated.

MPEG plans April 18 CfP for 3D video coding

A Call for Proposals on 3D Video Coding Technology is planned to be issued by MPEG. This call invites technology submissions providing efficient compression of 3D video and high quality view reconstruction that goes beyond the capabilities of existing standards. MPEG has already delivered 3D compression formats to the market, including MVC and frame-compatible stereoscopic formats, which are being deployed by industry for packaged media and broadcast services. However, the market needs are expected to evolve and new types of 3D displays and services will be offered. With this call, MPEG embarks on a new phase of 3D standardization that anticipates these future needs. The next-generation of 3D standards will define the 3D data format and associated compression technology to facilitate the generation of multiview output to enable both advanced stereoscopic display processing and improved support for auto-stereoscopic displays. Further details are outlined in MPEG's Vision on 3D Video, which is available online at http://mpeg.chiariglione.org/visions/3dv/index.htm.

The text of the Call for Proposals will be available here. Responses to this call are due in September 2011 and will be evaluated at the 98th MPEG meeting in Geneva.

MPEG continues with CfP for Type-1 Video Coding Standard

As announced in January, MPEG will develop a new video compression standard in line with the expected usage models of the Internet. The new standard is intended to achieve substantially better compression performance than that offered by MPEG-2 and possibly comparable to that offered by the AVC Baseline Profile. MPEG issued a Draft Call for Proposals (CfP) for Internet Video Coding Technologies that is expected to lead to a standard falling under ISO/IEC “Type-1 licensing”, i.e. intended to be “royalty free”. Proposals are due in October 2011 and will be evaluated at the 98th MPEG meeting in Geneva. It is expected that this standard will become the default video codec for internet applications.

The text of the Call for Proposals is available here. Responses to this call are due in October 2011 and will be evaluated at the 98th MPEG meeting in Geneva.

MPEG augments its reconfigurable framework with graphics

At its 96th meeting, MPEG has decided to extend the set of tools available in ISO/IEC 23001-4 and ISO/IEC 23002-4 used to describe a RVC (Reconfigurable Video Coding) framework, to now include graphics specific elements. Therefore, what was previously envisioned as RVC is now RMC (Reconfigurable Multimedia Coding), the latter containing both the already standardized VTL (Video Tool Library) and the library currently being developed for graphics, GTL (Graphics Tool Library). The purpose of GTL is to specify the Functional Units for the compression of static and animated 3D graphic objects and to allow their configuration to build reconfigurable decoders.

Digging Deeper – How to Contact MPEG

Communicating the large and sometimes complex array of technology that the MPEG Committee has developed is not a simple task. Experts, past and present, have contributed a series of tutorials and vision documents that explain each of these standards individually. The repository is growing with each meeting, so if something you are interested in is not there yet, it may appear there shortly – but you should also not hesitate to request it. You can start your MPEG adventure at: http://mpeg.chiariglione.org/technologies.htm.

Further Information

Future MPEG meetings are planned as follows:

  • No. 97, Torino, IT, 18-22 July, 2011
  • No. 98, Geneva, CH, 28 November – 2 December, 2011
  • No. 99, San Jose, USA, 6-10 February, 2012
For further information about MPEG, please contact:

Dr. Leonardo Chiariglione (Convener of MPEG, Italy)
Via Borgionera, 103
10040 Villar Dora (TO), Italy
Tel: +39 011 935 04 61
leonardo@chiariglione.org
This press release and other MPEG-related information can be found on the MPEG homepage:

http://mpeg.chiariglione.org/

The text and details related to current Calls are in the Hot News section, http://mpeg.chiariglione.org/hot_news.htm. These documents include information on how to respond to Calls.

The MPEG homepage also has links to other MPEG pages which are maintained by the MPEG subgroups. It also contains links to public documents that are freely available for download by those who are not MPEG members. Journalists that wish to receive MPEG Press Releases by email should contact Dr. Arianne T. Hinds at arianne.hinds@infoprint.com.

Friday, February 11, 2011

MPEG envisages royalty-free MPEG video coding standard

Daegu, KR – The 95th MPEG meeting was held in Daegu, Korea from the 24th to the 28th of January 2011.
--MPEG press release also available here.

Highlights of the 95th Meeting

MPEG anticipates March 2011 CfP for Type-1 Video Coding Standard
MPEG has been producing standards that provide industry with the best video compression technologies. In recognition of the growing importance that the Internet plays in the generation and consumption of video content, MPEG intends to develop a new video compression standard in line with the expected usage models of the Internet. The new standard is intended to achieve substantially better compression performance than that offered by MPEG-2 and possibly comparable to that offered by the AVC Baseline Profile. MPEG will issue a call for proposals on video compression technology at the end of its upcoming meeting in March 2011 that is expected to lead to a standard falling under ISO/IEC “Type-1 licensing”, i.e. intended to be “royalty free”.

MPEG moves toward a visual search standard by issuing Draft Call for Proposals
In its latest step toward creating a standard for efficient and interoperable designs of visual search applications, MPEG has issued a draft Call for Proposals at its 95th meeting. Like a barcode reader, but using regular images instead of barcodes, visual search enables the retrieval of related information from databases for tourists, simplified shopping, mobile augmented reality, and other applications.

Specifically, the call seeks technologies that deliver robust matching of images of objects, such as landmarks and text-based documents, that may be partially occluded or captured from various vantage points, and with different camera parameters, or lighting conditions. The underlying component technologies that are expected to be addressed by the standard include the format of the visual descriptors, and parts of the descriptor extraction process needed to ensure interoperability. Other component technologies, such as indexing and matching algorithms, may also be incorporated into the new standard.

Further details are outlined in the text of the call available at http://mpeg.chiariglione.org/hot_news.htm. The Final Call for Proposals will be issued at the 96th MPEG meeting in March 2011 with responses due in October 2011.

MPEG targets a new phase of 3D video coding standards
A Draft Call for Proposals on 3D Video Coding Technology has also been issued by MPEG at its 95th meeting. This call invites technology submissions providing efficient compression of 3D video and high quality view reconstruction that goes beyond the capabilities of existing standards. MPEG has already delivered 3D compression formats to the market, including MVC and frame-compatible stereoscopic formats, which are being deployed by industry for packaged media and broadcast services. However, the market needs are expected to evolve and new types of 3D displays and services will be offered. With this call, MPEG embarks on a new phase of 3D standardization that anticipates these future needs. The next-generation of 3D standards will define the 3D data format and associated compression technology to facilitate the generation of multiview output to enable both advanced stereoscopic display processing and improved support for auto-stereoscopic displays. Further details are outlined in MPEG's Vision on 3D Video (http://mpeg.chiariglione.org/visions/3dv/index.htm). The Final Call for Proposals will be issued at the 96th MPEG meeting in March 2011 with responses due in September 2011.

Amendment to MPEG-2 systems is finalized at 95th meeting
MPEG is continuously improving the popular MPEG-2 Transport Stream (TS) standard (ISO/IEC 13818-1), one of its most widely accepted standards for broadcast industries. At its 95th meeting, MPEG has finalized a new amendment to support recently developed video coding standards, Advanced Video Coding (AVC) and Multiview Video Coding (MVC), in MPEG-2 TS. This amendment extends the AVC video descriptor to signal the presence of a frame packing arrangement in an associated supplemental enhancement information message for the underlying AVC video stream component. The new amendment also adds signaling of an operating point descriptor of MVC which enables transmission systems to convey the relevant operating points that can be used by receiving devices.

In a related project, MPEG has also started a new amendment to signal stereoscopic video services carried in MPEG-2 TS. This amendment will support not only frame compatible video services but also service compatible video services which will allow implementation of backward compatible stereoscopic video services in HDTV systems.

MPEG hosts MPEG-V awareness event
At its 95th meeting, MPEG hosted the MPEG-V Awareness Event 2011, at which the full range of MPEG-V technologies, including several products and applications employing the standard, were showcased. These technologies cover applications for multi-sensorial user experience in the home environment, control of virtual worlds by real signals, motion capture systems and real-time avatar animation, multi-platform streaming for virtual worlds and mixed reality games. The workshop presentations are available at http://wg11.sc29.org/mpeg-v.


A hot standard moves fast
MPEG has approved the promotion of Dynamic Adaptive Streaming over HTTP (DASH) to Draft International Standard (DIS) status. The draft is available from the Hot News page of http://mpeg.chiariglione.org.

Responding to a Call – How to Contact MPEG
The text and details related to the Calls mentioned above (together with other current Calls) are in the Hot News section, http://www.chiariglione.org/mpeg/hot_news.htm. These documents include information on how to respond to the Calls.

Communicating the large and sometimes complex array of technologies that the MPEG Committee has developed is not a simple task. Experts, past and present, have contributed a series of tutorials and vision documents that explain each of these standards individually. The repository is growing with each meeting, but if something of interest cannot be found, do not hesitate to request it. You can start your MPEG adventure at: http://mpeg.chiariglione.org/technologies.htm.

Further Information

Future MPEG meetings are planned as follows:

  • No. 96, Geneva, CH, 21-25 March, 2011
  • No. 97, Torino, IT, 18-22 July, 2011
  • No. 98, Geneva, CH 28 November – 2 December, 2011
  • No. 99, San Jose, USA 6-10 February, 2012

For further information about MPEG, please contact:

Dr. Leonardo Chiariglione (Convener of MPEG, Italy)
Via Borgionera, 103
10040 Villar Dora (TO), Italy
Tel: +39 011 935 04 61
leonardo@chiariglione.org

This press release and other MPEG-related information can be found on the MPEG homepage:
http://mpeg.chiariglione.org/

The MPEG homepage also has links to other MPEG pages which are maintained by the MPEG subgroups. It also contains links to public documents that are freely available for download by those who are not MPEG members. Journalists that wish to receive MPEG Press Releases by email should contact Dr. Arianne T. Hinds at arianne.hinds@infoprint.com.

Monday, August 30, 2010

Immersive Future Media Technologies: From 3D Video to Sensory Experience

--tutorial to be held at ACM Multimedia 2010, Oct 25th, Morning, Florence, Italy
--download as PDF

For registration details, please consult ACM Multimedia 2010 Web site!

Instructors: Karsten Müller (Fraunhofer/Heinrich-Hertz-Institut, Berlin) and Christian Timmerer (Klagenfurt University, Austria)

Abstract: The past decade has witnessed a significant increase in research efforts around Quality of Experience (QoE), which generally refers to a human-centric paradigm for the Quality of a Service (QoS) as perceived by the (end) user. As it puts the end user on the center stage, it may have various dimensions, and one dimension that recently gained momentum is 3D video. Another dimension aims at going beyond 3D and promises an advanced user experience through sensory effects; both are introduced briefly in the following.

3D Video: Stereo and Multi-View Video Technology
3D-related media technologies have recently developed from purely research-oriented work towards applications and products. 3D content is now being produced on a wider scale and the first 3D applications have been standardized, such as multi-view video coding for 3D Blu-ray discs. This part of the tutorial starts with an overview of 3D in the form of stereo-video-based systems, which are currently being commercialized. Here, stereo formats and associated coding are introduced. This technology is used for 3D cinema applications and mobile 3DTV environments. For the latter, user requirements and profiling will be introduced as a means to assess user quality of experience. For 3D home entertainment, glasses-free multi-view displays are required, as more than one user will watch 3D content. For such displays, the current stereo solutions need to be extended. Therefore, new activities in 3D video are introduced. These 3D solutions will develop a generic 3D video format with color and supplementary geometry data, e.g. depth maps, and associated coding and rendering technology for any multi-view display, independent of the number of views. As such technology is also developed in international consortia, the most prominent of these, like the 3D@HOME consortium, the EU 3D, Immersive, Interactive Media Cluster, and the 3D video activities in ISO-MPEG, are introduced.

Advanced User Experience through Sensory Effects
This part of the tutorial addresses a novel approach for increasing the user experience – beyond 3D – through sensory effects. The motivation behind this work is that the consumption of multimedia assets may also stimulate senses other than vision or audition, e.g., olfaction, mechanoreception, equilibrioception, or thermoception, leading to an enhanced, unique user experience. This can be achieved by annotating the media resources with metadata (currently defined by ISO/MPEG as part of the MPEG-V standard) providing so-called sensory effects that steer appropriate devices capable of rendering these effects (e.g., fans, vibration chairs, ambient lights, perfumers, water sprayers, fog machines, etc.). In particular, we will review the concepts and details of the forthcoming MPEG-V standard and present our prototype architecture for the generation, transport, decoding, and use of sensory effects. Furthermore, we will present details and results of a series of formal subjective quality assessments which confirm that the concept of sensory effects is a vital tool for enhancing the user experience.
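As a purely conceptual illustration of sensory effect metadata steering rendering devices (the actual MPEG-V Part 3 syntax is XML-based and considerably richer), the sketch below pairs time-stamped effects with devices capable of rendering them; all names and fields are invented for illustration.

```python
# Conceptual sketch only: time-stamped sensory effects dispatched to devices
# that can render them. The real MPEG-V Part 3 metadata is XML-based and far
# richer; names and fields here are invented for illustration.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SensoryEffect:
    start_ms: int
    effect_type: str          # e.g. "light", "wind", "vibration"
    intensity: float          # normalized 0.0 .. 1.0


def dispatch(effects: List[SensoryEffect],
             devices: Dict[str, Callable[[SensoryEffect], None]]) -> None:
    """Send each effect, in timeline order, to the registered device for its type."""
    for eff in sorted(effects, key=lambda e: e.start_ms):
        renderer = devices.get(eff.effect_type)
        if renderer is not None:
            renderer(eff)


if __name__ == "__main__":
    timeline = [SensoryEffect(0, "light", 0.6), SensoryEffect(1200, "wind", 0.3)]
    dispatch(timeline, {
        "light": lambda e: print(f"ambient light -> {e.intensity:.1f} at {e.start_ms} ms"),
        "wind":  lambda e: print(f"fan speed     -> {e.intensity:.1f} at {e.start_ms} ms"),
    })
```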

Course Outline
Introduction and Overview (~10min.)
3D Video: Stereo and Multi-View Video Technology (~90min)
  • User Requirements for 3D video technologies
  • Stereo Video solutions for 3D cinema and mobile applications
  • User experience and profiling for mobile 3DTV
  • Format description and coding for stereo and multi-view video technology
  • Depth-enhanced 3D video solutions for 3D home entertainment
Advanced User Experience through Sensory Effects (~90min)
  • MPEG-V: context and objectives including an overview of all parts.
  • In-depth review of Part 3 of MPEG-V entitled Sensory Information.
  • A test-bed for the quality of multimedia experience evaluation of sensory effects and demonstration.
  • How to improve the Quality of Experience through sensory effects? Results from first subjective experiments.
  • Conclusions and future work.
Intended Audience
Intermediate, specifically, graduate students and researchers interested in 3D video and sensory experiences.

Biography of Presenters
Karsten Müller received the Dipl.-Ing. and Dr.-Ing. degrees from the Technical University of Berlin, Germany, in 1997 and 2006, respectively. In 1993 he spent one year studying Electronics and Communication Engineering at Napier University of Edinburgh/Scotland, including a half-year working period at Integrated Communication Systems Inc. in Westwick near Cambridge/England. In this working period he developed software for voice mail systems and statistical analysis of caller data. In 1996 he joined the Heinrich-Hertz-Institute (HHI) Berlin, Image Processing Department, where he is a project coordinator for European projects in the field of 3D media technology. He also co-chairs the European 3D Media Cluster, which serves as a contact gateway for information exchange between the associated European projects and international 3D media activities.
His research interests include motion and disparity estimation, 3D media representation and coding, 3D graphics-based scene reconstruction with multi-texture surfaces, and 3D metadata and content description. He has been actively involved in MPEG activities, standardizing the multi-view description for MPEG-7, the view-dependent multi-texturing methods for the 3D scene representation in MPEG4-AFX and contributing to the multi-view video coding process in MPEG4-MVC and 3D Video. In recent projects he was involved in research and development of traffic surveillance systems and visualization of multiple-view video, 3D scene reconstruction, object segmentation, tracking and 3D reconstruction, 3D scene and object representation and interactive user navigation in 3D environments. Currently, he is a Project Manager for European projects in the field of 3D video technology and multimedia content description. He is senior member of the IEEE.

Christian Timmerer received his M.Sc. (Dipl.-Ing.) in January 2003 and his Ph.D. (Dr.techn.) in June 2006 (for research on the adaptation of scalable multimedia content in streaming and constrained environments), both from Klagenfurt University. He joined Klagenfurt University in 1999 and is currently an Assistant Professor (Ass.-Prof.) at the Department of Information Technology (ITEC) – Multimedia Communication Group. His research interests include the transport of multimedia content, multimedia adaptation in constrained and streaming environments, distributed multimedia adaptation, and QoS/QoE. He has published more than 50 papers (incl. book chapters and tutorials) in these areas and was the general chair of WIAMIS 2008. Finally, he is an editorial board member of the Encyclopedia of Multimedia and the ACM/Springer International Journal on Multimedia Tools and Applications (MTAP), and an associate editor for IEEE Computer Society Computing Now.
He has been actively participating in several EC-funded projects, notably the FP6-IST-DANAE (2004-2006), FP6-IST-ENTHRONE (2006-2008), FP7-ICT-P2P-Next (2008-2012), and FP7-ICT-ALICANTE (2010-2013) projects. Dr. Timmerer participated in the work of ISO/MPEG for several years, notably as the head of the Austrian delegation, coordinator of several core experiments, co-chair of several ad-hoc groups, and as an editor for Parts 7 and 8 of MPEG-21, Digital Item Adaptation and Reference Software for which he received ISO/IEC certificates. His current contributions are in the area of the MPEG Extensible Middleware (MXM), MPEG-V (Media Context and Control), Advanced IPTV Terminal (AIT), and MPEG Media Transport (MMT) for which (i.e., MXM and MPEG-V) he also serves as an editor. Publications and MPEG contributions can be found under http://research.timmerer.com.

Monday, February 8, 2010

References about the Multi-view Video Coding (MVC) standard and related technology principles

G. J. Sullivan, "Standards-based approaches to 3D and multiview video coding", SPIE Applications of Digital Image Processing XXXII, Aug. 2009.

A. Vetro, S. Yea, M. Zwicker, W. Matusik, H. Pfister, "Overview of multiview video coding and anti-aliasing for 3D displays", IEEE Int'l Conf on Image Proc., Sept. 2007.
(esp. section 2 - http://www.merl.com/reports/docs/TR2007-027.pdf)

A. Vetro, S. Yea, and A. Smolic, "Towards a 3D video format for auto-stereoscopic displays", SPIE Conference on Applications of Digital Image Processing XXXI, Aug. 2008.
(esp. section 2.4 - http://www.merl.com/papers/docs/TR2008-057.pdf)

Philipp Merkle, Aljoscha Smolic, Karsten Müller, and Thomas Wiegand: Efficient Prediction Structures for Multiview Video Coding, IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on Multi-view Video Coding and 3DTV, vol. 17, no. 11, pp. 1461-1473, November 2007
(http://ip.hhi.de/imagecom_G1/assets/pdfs/ieee07_Prediction_MVC.pdf)

Philipp Merkle, Karsten Müller, Aljoscha Smolic, and Thomas Wiegand: Efficient Compression of Multi-View Video Exploiting Inter-View Dependencies Based on H.264/MPEG4-AVC, IEEE International Conference on Multimedia and Expo (ICME'06), Toronto, Ontario, Canada, July 2006.
(http://ip.hhi.de/imagecom_G1/assets/pdfs/h264_multi_view.pdf)

Aljoscha Smolic, Karsten Müller, Philipp Merkle, Christoph Fehn, Peter Kauff, Peter Eisert, and Thomas Wiegand: 3D Video and Free Viewpoint Video - Technologies, Applications and MPEG Standards, Proceedings of International Conference on Multimedia and Expo (ICME 2006), Toronto, Canada, pp. 2161-2164, July 2006.

Karsten Müller, Philipp Merkle, Heiko Schwarz, Tobias Hinz, Aljoscha Smolic, and Thomas Wiegand: Multi-view Video Coding Based on H.264/AVC Using Hierarchical B-Frames, Picture Coding Symposium (PCS'06), Beijing, China, April 2006.

Ying Chen, Ye-Kui Wang, Kemal Ugur, Miska M. Hannuksela, Jani Lainema, and Moncef Gabbouj, “3D video services with the emerging MVC standard”, EURASIP Journal on Advances in Signal Processing, Volume 2009 (2009), Article ID 786015, 13 pages, doi:10.1155/2009/786015.

Thursday, January 28, 2010

Overview of Selected Current MPEG Activities

--this covers a report from the 91st MPEG meeting in Kyoto, Japan

Previously, I've always provided a written report, but this time it comes in the form of a presentation (slideshow) - enjoy!

Saturday, October 31, 2009

MPEG news: a report from the 90th meeting in Xi'an, China

The 90th MPEG meeting in Xi'an, China came up with some very interesting news, which is briefly highlighted here. First and, I think, most importantly, the timeline for the new MPEG/ITU-T video coding format has been discussed and it seems the final Call for Proposals (CfP) will be ready in January 2010. A draft CfP is available now and hopefully will also be publicly available if they solve all the editing issues by early November. This means that the proposals will be evaluated in April 2010 (note: this will be a busy meeting as a couple of other calls need to be evaluated too; see later). The CfP defines five classes of test sequences with the following characteristics (number of sequences available in brackets):
  • Class A with 2560x1600 cropped from 4Kx2K (2);
  • Class B with 1920x1080p at 24/50-60 fps (5);
  • Class C with 832x480 WVGA (4);
  • Class D with 416x240 WQVGA (4); and
  • Class E with 1280x720p at 50-60fps (3).
For classes B, C, and E, subjective tests will be performed, whereas classes A and D will only be evaluated objectively using PSNR. The reason for evaluating A and D using objective measurements is their insignificant subjective differences with respect to classes B and C, respectively. Finally, they're still discussing the actual common nickname of the standard, as it seems some are not happy with high-performance video coding, but that's yet another story…
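Since classes A and D are evaluated with PSNR only, here is the standard PSNR computation for 8-bit frames as a quick reference (a NumPy sketch; the frame sizes in the example are just placeholders).

```python
# Standard PSNR computation for 8-bit frames, as used for objective evaluation.
import numpy as np


def psnr(reference: np.ndarray, reconstructed: np.ndarray, max_value: float = 255.0) -> float:
    """PSNR in dB between two frames of identical shape (e.g., luma planes)."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((max_value ** 2) / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    original = rng.integers(0, 256, size=(240, 416), dtype=np.uint8)   # a WQVGA-sized luma plane
    noisy = np.clip(original + rng.normal(scale=2.0, size=original.shape), 0, 255).astype(np.uint8)
    print(f"{psnr(original, noisy):.2f} dB")
```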

Second, 3D video coding is still a major topic in MPEG, but you probably need to wait yet another year until a Call for Proposals is issued. That is, a 3DV standard will probably be available around the beginning of 2013 at the earliest. The major issue right now is the availability of content – as usual – and differing device manufacturer standards with respect to 3D video.

The third major topic at this meeting was around AIT and MMT, two acronyms you shall become more familiar with in the future. The former is referred to as the Advanced IPTV Terminal (AIT) and aims to develop an ecosystem for media value chains and networks. Therefore, basic (atomic) services will be defined, including protocols (payload formats) that enable users to call these services, Application Programming Interfaces to access services, and bindings to specific programming languages. Currently, 30 of these basic services are foreseen, which can be clustered into services pertaining to identification, authentication, description, storage, adaptation, posting, packaging, delivery, presentation, interaction, aggregation, management, search, negotiation, and transaction. The timeline is similar to that of HVC, which means that proposals will be evaluated in April 2010. The latter is referred to as MPEG Media Transport (MMT) and basically aims to become the successor of the well-known MPEG-2 Transport Stream. Currently, two topics are explored for which requirements have also been formulated. The first topic covers adaptive, progressive transport and the second topic is in the area of cross-layer design. Further topics this activity might look into are hybrid delivery and conversational services. As for HVC and AIT, the proposals are going to be evaluated in April 2010. However, in order to further refine this possible new work item, MPEG will hold a workshop on the Wednesday of the Kyoto meeting in January 2010, focusing on "adaptive progressive transport" and "cross-layer design".

In any case, MPEG is looking forward to a very busy meeting in April 2010, which, by the way, will be held in Dresden, Germany.

Another issue that was discussed in Xi'an was (again) the development of a royalty-free codec within MPEG. While some might say that trying to establish a royalty-free codec within MPEG is a first step towards failure, others argue that MPEG-1 is already royalty free, that for MPEG-2 most patents expire in 2011, that the Internet community is requesting this (e.g., the IETF established a codec group and Google has chosen On2 for a royalty-free codec), and, finally, that the MPEG-4 Part 10 royalty-free baseline basically failed. Thus, maybe (or hopefully) this is the right time for a royalty-free codec within MPEG, and who can predict the future? Anyway, there's some activity going on in this area and if you're interested, stay tuned…

Finally, I'd like to note that MPEG-V (Media Context & Control) and MPEG-U (Rich Media User Interface) are progressing smoothly, both going hand in hand towards their finalization. At this meeting, the FCDs were approved, which marks a major milestone, as this was the last chance for substantial new contributions. One such input was related to advanced user interaction devices like the Wiimote, etc., which will become part of MPEG-V but also be used by MPEG-U. Hence, one might argue for merging these two standards into one single standard called MPEG-W (i.e., U+V=W), and a wedding ceremony could be performed at the next meeting in Kyoto with Geishas as witnesses … why not? Please raise your voice now or be silent forever!

Wednesday, February 25, 2009

Call for 3D Test Material: Depth Maps & Supplementary Information

-- complete document can be found here

MPEG has been working towards a new framework that aims to specify a coded representation for 3D scenes (including multiview video, depth, and supplementary information). This representation specifically targets the generation of high-quality intermediate views at the receiver for auto-stereoscopic displays or stereoscopic display processing. Please refer to MPEG's vision document on 3D Video [1].

In the process of developing a reference representation and corresponding set of view generation techniques, multiview video test data has been collected and is available at the links as listed in Appendix A. However, there is currently a lack of high-quality depth map data for these test sequences. Automatic depth estimation techniques have not yet been able to provide sufficient accuracy and robustness for high quality view synthesis. We therefore call for high quality depth map data for the multiview test sequences in Appendix A. Further requirements are specified in the next section.

In addition to depth maps, supplementary information such as background data, occlusion, transparency, and segmentation masks, are also being called for, when available.

New stereo and multiview test sequences that fulfil the requirements for test material described in [2] are also welcome together with appropriate depth maps.

Data Requirements
This section outlines the requirements for depth map and supplementary data. This data may be created by any means, including semi-automatic and manual generation methods. Most popular uncompressed image/video formats including RGB, TIF, PNG, RAW and YUV data formats would be acceptable. Documentation for any non-standard formats should be provided.

Depth Data
Data for multiview depth videos (i.e., monochrome depth sequences) of a scene from different viewpoints and view directions are sought. It is desirable to receive one corresponding depth video per video view, but depth maps for a subset of input video views (at least two non-adjacent views) are also welcome. It is also desirable that the depth maps have pixel-level accuracy that closely corresponds to the objects in the scene. Furthermore, the depth maps should have consistency over time.

The necessary data for correct interpretation of depth values shall be provided. This includes the near and far clipping planes (Z_near, Z_far) as well as a definition of whether the given data are direct Z values or, preferably, 1/Z values. If any other type of data is provided, it shall be fully specified, including algorithms for converting to Z values.
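A minimal sketch of the commonly used 1/Z quantization for 8-bit depth maps may help illustrate why these parameters matter; it assumes that sample value 255 maps to Z_near and 0 to Z_far, and contributions using any other convention shall document the corresponding conversion:

# Minimal sketch, assuming 8-bit depth maps quantized in 1/Z where
# sample 255 maps to Z_near and sample 0 maps to Z_far.

def depth_sample_to_z(v, z_near, z_far):
    """Convert an 8-bit depth-map sample v (0..255) into a metric Z value."""
    return 1.0 / ((v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)

# Example: with Z_near = 1 m and Z_far = 10 m,
# v = 255 yields 1.0 m and v = 0 yields 10.0 m.
print(depth_sample_to_z(255, 1.0, 10.0), depth_sample_to_z(0, 1.0, 10.0))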

Supplementary Data
In addition to data for multiview depth videos of a scene, other supplementary data (such as occlusion textures, occlusion depth maps and transparencies) from different viewpoints and view directions are sought. It is desirable to receive this supplementary data per video view, but supplementary data for a subset of input video views are also welcome. It is also desirable that supplementary data have pixel-level accuracy that closely corresponds to the objects in the scene. Furthermore, the supplementary data should have consistency over time.

The necessary data for correct interpretation of the supplementary data shall be provided. In contrast to depth maps, for which known methods and software tools exist to perform view synthesis, it is expected that accompanying software performing view synthesis based on the supplementary data will be provided. This is essential to properly evaluate the effectiveness of the data and of the related techniques to be used as a reference.

Copyright
The test material (and any accompanying software) should be provided free of charge and be available for use by members of the standardization committee and respondents to a future Call for Proposals on the development of a related 3D Video standard. Even more desirable would be a free donation to the scientific community, i.e., allowance to be used for publications etc. Donors should provide a copyright notice with any contributed material making the terms of usage clear. An example is given in Appendix B.

Logistics & Contact
MPEG would like to receive contributions in time for the 88th MPEG meeting to be held in Maui, USA from April 20-24, 2009. It is requested that a document be submitted to the next meeting that includes a link to the relevant test materials and provides further details about the materials being contributed. Those interested in making a contribution are advised to contact the persons listed below for further details.

It would be ideal for contributors to attend the meeting in person in order to allow discussion of the details of their contributions. Although regular participation in MPEG meetings is subject to some guidelines, non-MPEG respondents will be allowed to participate in this meeting.

Prof. Jens-Rainer Ohm
(MPEG Video Subgroup Chair)
RWTH Aachen University
Institut für Nachrichtentechnik, 52056 Aachen, Germany
Phone: +49-241-80-27671
Fax: +49-241-80-22196
Email: ohm@ient.rwth-aachen.de

Dr. Anthony Vetro
(3D Video Ad-hoc Group Chair)
Mitsubishi Electric Research Labs
201 Broadway, 8th Floor
Cambridge, MA 02139 USA
Phone: +1-617-621-7591
Fax: +1-617-621-7550
Email: avetro@merl.com

References
[1] MPEG video group, “3D Video Vision,” ISO/IEC JTC1/SC29/WG11 N103xx, Lausanne, Switzerland, January 2009.
[2] MPEG video group, “Call for Contributions on 3D Video Test Material,” ISO/IEC JTC1/SC29/WG11 N9595, Antalya, Turkey, January 2008.

Monday, February 9, 2009

MPEG's Vision on 3D Video Coding

The following text comprises the ‘Vision on 3D Video Coding’ approved by MPEG’s Video and Requirements Sub-groups with the full reference ISO/IEC JTC1/SC29/WG11/N10357, Lausanne, Switzerland, February 2009. It is also publicly available here.

MPEG has developed a suite of international standards to support 3D services and devices, and now initiates a new phase of standardization to be completed within the next two years.
  • One objective is to enable stereo devices to cope with varying display types and sizes, and different viewing preferences. This includes the ability to vary the baseline distance for stereo video to adjust the depth perception, which could help to avoid fatigue and other viewing discomforts.
  • MPEG also envisions that high-quality auto-stereoscopic displays will enter the consumer market in the next few years. Since it is difficult to directly provide all the necessary views due to production and transmission constraints, a new format is needed to enable the generation of many high-quality views from a limited amount of input data, e.g. stereo and depth.
Our vision is a new 3D Video (3DV) format that goes beyond the capabilities of existing standards to enable both advanced stereoscopic display processing and improved support for auto-stereoscopic N-view displays, while enabling interoperable 3D services. This is illustrated in Figure 1 and further details are described below.
Figure 1. Target of 3D Video format illustrating limited camera inputs and constrained rate transmission according to a distribution environment. The 3DV data format aims to be capable of rendering a large number of output views for auto-stereoscopic N-view displays and support advanced stereoscopic processing.

Due to limitations in the production environment, the 3DV data format is assumed to be based on limited camera inputs; stereo content is most likely, but more views might also be available. In order to support a wide range of auto-stereoscopic displays, it should be possible for a large number of views to be generated from this data format. Additionally, the rate required for transmitting the 3DV format should be determined by the distribution constraints, i.e., there should not be an increase in the rate simply because the display requires a higher number of views to cover a larger viewing angle. In this way, the transmission rate and the number of output views are decoupled. Advanced stereoscopic processing that requires view generation at the display would also be supported by this format.
Compared to existing coding formats, the 3DV format has several advantages in terms of bit rate and 3D rendering capabilities, as also illustrated in Figure 2.
  • 2D+Depth, as specified by ISO/IEC 23002-3 (and also referred to as MPEG-C Part 3), supports the inclusion of depth for generation of an increased number of views. While it has the advantage of being backward compatible with legacy devices and is agnostic of coding formats, it is only capable of rendering a limited depth range since it does not directly handle occlusions. The 3DV format expects to enhance the 3D rendering capabilities beyond this format.
  • Multiview Video Coding (MVC), as specified by ISO/IEC 14496-10 | ITU-T Recommendation H.264, supports the direct coding of multiple views and exploits inter-camera redundancy to reduce the bit rate. Although MVC is more efficient than simulcast, the rate of MVC encoded video is proportional to the number of views. The 3DV format expects to significantly reduce the bit rate needed to generate the required views at the receiver.
Figure 2. Illustration of 3D rendering capability versus bit rate for different formats, where 3D Video aims to improve rendering capability of 2D+Depth format while reducing bit rate requirements relative to simulcast and MVC.

Friday, February 6, 2009

MPEG news: a report from the 87th meeting in Lausanne, Switzerland

MPEG’s high-performance video coding (HVC) standard is evolving and currently targets mobile devices, IPTV, and Ultra-HD. However, the trade-off between coding efficiency and codec complexity is still driving the thresholds, which have been set at this meeting, well, at least initially: for low complexity, HVC is seeking a 25% gain in coding efficiency, and for full/increased complexity the threshold is set to a 50% gain in coding efficiency (cf. the goal of the AVC standardization). The application scenarios range from shared Ultra-HD to personalized experiences, the latter targeting a viewing distance of 0.5*h (i.e., 50 cm) and personal use only. A vision document (N10361) has been issued at this meeting, and interested parties are requested to join the Ad-hoc Group (AhG) for further discussions (N10371). One open issue with HVC is whether and how audio will adapt to these new developments: probably with a high-performance audio coding (HAC), or can HE-AAC already complement HVC? Anyway, a short viewing distance ultimately leads to short listening distances and a wide angle for the sound stage...
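As a rough back-of-the-envelope sketch, assuming an “X% gain in coding efficiency” is read simply as an X% bitrate reduction at equal quality (and using a purely hypothetical 10 Mbps AVC reference), the thresholds translate into target bitrates as follows:

# Back-of-the-envelope sketch of the HVC efficiency thresholds, assuming an
# "X% gain in coding efficiency" means an X% bitrate reduction at equal quality.

def hvc_target_bitrate(avc_bitrate_kbps, gain):
    """Bitrate needed to match the quality of an AVC reference at the given gain."""
    return avc_bitrate_kbps * (1.0 - gain)

avc_reference = 10000  # hypothetical AVC stream at 10 Mbps
print("low-complexity target (25%% gain): %.0f kbps" % hvc_target_bitrate(avc_reference, 0.25))
print("full-complexity target (50%% gain): %.0f kbps" % hvc_target_bitrate(avc_reference, 0.50))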

3D video coding (3DVC) already has a history within MPEG. A vision document has been produced here stating that MPEG’s version of 3D video shall be compatible with existing standards, mono/stereo devices, and existing/planned infrastructure. While the former seems to be very clear, i.e., backwards compatibility to AVC is desirable in the same way as it was for SVC, the latter needs some more discussion. How can one be compatible with planned infrastructures?

In the area of the advanced IPTV terminal (AIT), we had a productive meeting with ITU-T SG16 Q.13. The objectives look very promising and aim to improve the user experience through the use of the latest media coding, transport, and processing technologies. An AhG has been set up to discuss the details. Again, AhGs are open to the public and everyone can join and contribute.

The information exchange with virtual worlds (MPEG-V) has been merged with the representation of sensory effects (RoSE) and will now be jointly developed under the umbrella of MPEG-V. However, it was felt that ‘information exchange with virtual worlds’ is no longer an appropriate name, and a new one is being sought. Any inputs are welcome and will be discussed within the AhG, which is open to everybody. Thus, join this exciting activity now! In its current working draft, the following sensory effects are defined, which shall create an enhanced, immersive user experience: various light effects, temperature, wind, vibration, water sprayer, perfume/scent, fog, window blind/shadow, sound, and color correction.

Finally, MPEG got another Emmy, the "Technology and Engineering Emmy Award 2007-2008" for MPEG-4 AVC from the National Academy of Television Arts & Sciences (NATAS). It seems that success within MPEG is from now on measured in terms of the number of awards received ;-) And last but not least, if you’ve ever been interested in the MPEG vision, you may find it here.