Showing posts with label compact descriptors for visual search. Show all posts

Friday, April 25, 2014

MPEG news: a report from the 108th meeting, Valencia, Spain

This blog post is also available at bitmovin tech blog and SIGMM records.

The 108th MPEG meeting was held at the Palacio de Congresos de Valencia in Spain featuring the following highlights (no worries about the acronyms, this is on purpose and they will be further explained below):
  • Requirements: PSAF, SCC, CDVA
  • Systems: M2TS, MPAF, Green Metadata
  • Video: CDVS, WVC, VCB
  • JCT-VC: SHVC, SCC
  • JCT-3D: MV/3D-HEVC, 3D-AVC
  • Audio: 3D audio 
Opening Plenary of the 108th MPEG meeting in Valencia, Spain.
The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

As indicated above, MPEG is full of (new) acronyms and in order to become familiar with those, I’ve put them deliberately in the overview but I will explain them further below.

PSAF – Publish/Subscribe Application Format

Publish/subscribe corresponds to a new network paradigm related to content-centric networking (or information-centric networking) where the content is addressed by its name rather than location. An application format within MPEG typically defines a combination of existing MPEG tools jointly addressing the needs for a given application domain, in this case, the publish/subscribe paradigm. The current requirements and a preliminary working draft are publicly available.
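To make the idea of addressing content by name rather than by location more concrete, here is a minimal sketch of a name-based publish/subscribe broker. It is purely illustrative and not taken from the PSAF working draft: the class `NameBroker`, its methods, and the content names are hypothetical, and everything runs in memory without any networking.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Callback = Callable[[str, bytes], None]


class NameBroker:
    """Toy in-memory broker (hypothetical): subscribers register interest
    in a content name, not in the host that stores the content."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callback) -> None:
        # Remember who wants to be notified about this content name.
        self._subscribers[name].append(callback)

    def publish(self, name: str, payload: bytes) -> int:
        # Deliver the payload to every subscriber of the name and
        # return how many subscribers were reached.
        callbacks = self._subscribers.get(name, [])
        for callback in callbacks:
            callback(name, payload)
        return len(callbacks)


received: List[Tuple[str, bytes]] = []
broker = NameBroker()
broker.subscribe("/videos/mpeg108/plenary",
                 lambda name, data: received.append((name, data)))

delivered = broker.publish("/videos/mpeg108/plenary", b"segment-1")
missed = broker.publish("/videos/other", b"segment-x")  # nobody subscribed
```

The publisher never learns where its subscribers are, and the subscriber never names a server; both sides only agree on the content name.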

SCC – Screen Content Coding

I’ve introduced this topic in my previous report and, at this meeting, the responses to the Call for Proposals (CfP) have been evaluated. In total, seven responses have been received which meet all requirements and, thus, the actual standardization work is transferred to JCT-VC. Interestingly, the results of the CfP are publicly available. Within JCT-VC, a first test model has been defined and core experiments have been established. I will report more on this as an output of the next meetings…

CDVA – Compact Descriptors for Video Analysis

This project has been renamed from compact descriptors for video search to compact descriptors for video analysis and comprises a publicly available vision statement. That is, interested parties are welcome to join this new activity within MPEG.

M2TS – MPEG-2 Transport Stream

At this meeting, various extensions to M2TS have been defined such as transport of multi-view video coding depth information and extensions to HEVC, delivery of timeline for external data as well as carriage of layered HEVC, green metadata, and 3D audio. Hence, M2TS is still very active and multiple amendments are developed in parallel.

MPAF – Multimedia Preservation Application Format

The committee draft for MPAF has been approved and, in this context, MPEG-7 is extended with additional description schemes.

Green Metadata

Well, this standard does not have its own acronym; it’s simply referred to as MPEG-GREEN. The draft international standard has been approved and national bodies will vote on it at the JTC 1 level. It basically defines metadata that allows clients to operate in an energy-efficient way. It comes along with amendments to M2TS and ISOBMFF that enable the carriage and storage of this metadata.

CDVS – Compact Descriptors for Visual Search

CDVS is at DIS stage and provides improvements on global descriptors as well as non-normative improvements of key-point detection and matching in terms of speedup and memory consumption. As with all standards at DIS stage, national bodies will vote on it at the JTC 1 level.

What’s new in the video/audio-coding domain?
  • WVC – Web Video Coding: This project reached final draft international standard with the goal to provide a video-coding standard for Web applications. It basically defines a profile of the MPEG-AVC standard including those tools not encumbered by patents.
  • VCB – Video Coding for Browsers: The committee draft for part 31 of MPEG-4 defines video coding for browsers and basically defines VP8 as an international standard. This also explains the difference to WVC.
  • SHVC – Scalable HEVC extensions: As with SVC for AVC, SHVC will be defined as an amendment to HEVC, providing the same scalable video coding functionality as SVC.
  • MV/3D-HEVC, 3D-AVC: These are multi-view and 3D extensions for the HEVC and AVC standards respectively.
  • 3D Audio: Also, no acronym for this standard although I would prefer 3DA. However, CD has been approved at this meeting and the plan is to have DIS at the next meeting. At the same time, the carriage and storage of 3DA is being defined in M2TS and ISOBMFF respectively. 
Finally, what’s new in the media transport area, specifically DASH and MMT?

As interested readers know from my previous reports, the DASH 2nd edition was approved some time ago. In the meantime, a first amendment to the 2nd edition is at draft amendment stage, including additional profiles (mainly adding xlink support) and time synchronization. A second amendment goes to the first ballot stage, referred to as proposed draft amendment, and defines spatial relationship description, generalized URL parameters, and other extensions. Eventually, these two amendments will be integrated in the 2nd edition which will become the MPEG-DASH 3rd edition. Also, a corrigendum on the 2nd edition is currently under ballot and new contributions are still coming in, i.e., there is still a lot of interest in DASH. For your information – there will be two DASH-related sessions at Streaming Forum 2014.

On the other hand, MMT’s amendment 1 is currently under ballot and amendment 2 defines header compression and cross-layer interface. The latter has been progressed to a study document which will be further discussed at the next meeting. Interestingly, there will be an MMT developer’s day at the 109th MPEG meeting because 4K/8K UHDTV services based on MMT specifications will be launched in Japan, and implementation of MMT is now under way in Korea and China. The developer’s day will be on July 5th (Saturday), 2014, 10:00 – 17:00 at the Sapporo Convention Center. Therefore, if you don’t know anything about MMT, the developer’s day is certainly a place to be.

Contact:

Dr. Christian Timmerer
CIO bitmovin GmbH | christian.timmerer@bitmovin.net
Alpen-Adria-Universität Klagenfurt | christian.timmerer@aau.at

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):
  • Text of ISO/IEC 13818-1:2013 PDAM 7 Carriage of Layered HEVC (14/05/02) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of Green Metadata (14/04/04) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of 3D Audio (14/04/04) 
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of additional audio profiles & levels (14/04/04) 
  • Text of ISO/IEC 14496-12:2012 PDAM 4 Enhanced audio support (14/04/04) 
  • TuC on sample variants, signatures and other improvements for the ISOBMFF (14/04/04) 
  • Text of ISO/IEC CD 14496-22 3rd edition (14/04/04) 
  • Text of ISO/IEC CD 14496-31 Video Coding for Browsers (14/04/11) 
  • Text of ISO/IEC 15938-5:2005 PDAM 5 Multiple text encodings, extended classification metadata (14/04/04) 
  • WD 2 of ISO/IEC 15938-6:201X (2nd edition) (14/05/09) 
  • Text of ISO/IEC DIS 15938-13 Compact Descriptors for Visual Search (14/04/18) 
  • Test Model 10: Compact Descriptors for Visual Search (14/05/02) 
  • WD of ARAF 2nd Edition (14/04/18) 
  • Use cases for ARAF 2nd Edition (14/04/18) 
  • WD 5.0 MAR Reference Model (14/04/18) 
  • Logistic information for the 5th JAhG MAR meeting (14/04/04) 
  • Text of ISO/IEC CD 23000-15 Multimedia Preservation Application Format (14/04/18) 
  • WD of Implementation Guideline of MP-AF (14/04/04) 
  • Requirements for Publish/Subscribe Application Format (PSAF) (14/04/04) 
  • Preliminary WD of Publish/Subscribe Application Format (14/04/04) 
  • WD2 of ISO/IEC 23001-4:201X/Amd.1 Parser Instantiation from BSD (14/04/11) 
  • Text of ISO/IEC 23001-8:2013/DCOR1 (14/04/18) 
  • Text of ISO/IEC DIS 23001-11 Green Metadata (14/04/25) 
  • Study Text of ISO/IEC 23002-4:201x/DAM2 FU and FN descriptions for HEVC (14/04/04) 
  • Text of ISO/IEC 23003-4 CD, Dynamic Range Control (14/04/11) 
  • MMT Developers’ Day in 109th MPEG meeting (14/04/04) 
  • Results of CfP on Screen Content Coding Tools for HEVC (14/04/30) 
  • Study Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/06/06) 
  • HEVC RExt Test Model 7 (14/06/06) 
  • Scalable HEVC (SHVC) Test Model 6 (SHM 6) (14/06/06) 
  • Report on HEVC compression performance verification testing (14/04/25) 
  • HEVC Screen Content Coding Test Model 1 (SCM 1) (14/04/25) 
  • Study Text of ISO/IEC 23008-2:2013/PDAM4 3D Video Extensions (14/05/15) 
  • Test Model 8 of 3D-HEVC and MV-HEVC (14/05/15) 
  • Text of ISO/IEC 23008-3/CD, 3D audio (14/04/11) 
  • Listening Test Logistics for 3D Audio Phase 2 (14/04/04) 
  • Active Downmix Control (14/04/04) 
  • Text of ISO/IEC PDTR 23008-13 Implementation Guidelines for MPEG Media Transport (14/05/02) 
  • Text of ISO/IEC 23009-1 2nd edition DAM 1 Extended Profiles and availability time synchronization (14/04/18) 
  • Text of ISO/IEC 23009-1 2nd edition PDAM 2 Spatial Relationship Description, Generalized URL parameters and other extensions (14/04/18) 
  • Text of ISO/IEC PDTR 23009-3 2nd edition DASH Implementation Guidelines (14/04/18) 
  • MPEG vision for Compact Descriptors for Video Analysis (CDVA) (14/04/04) 
  • Plan of FTV Seminar at 109th MPEG Meeting (14/04/04) 
  • Draft Requirements and Explorations for HDR /WCG Content Distribution and Storage (14/04/04) 
  • Working Draft 2 of Internet Video Coding (IVC) (14/04/18) 
  • Internet Video Coding Test Model (ITM) v 9.0 (14/04/18) 
  • Uniform Timeline Alignment (14/04/18) 
  • Plan of Seminar on Hybrid Delivery at the 110th MPEG Meeting (14/04/04) 
  • WD 2 of MPEG User Description (14/04/04)

Thursday, December 8, 2011

MPEG news: a report from the 98th meeting, Geneva, Switzerland

#MPEG98 report: DASH=IS ✔ CDVS=CfP eval ✔ {MMT, HEVC, 3DAudio}=MPEG-H ✔ IVC={IVC, WebVC} ✔ 3DVC=CfP eval ✔

... MPEG news from its 98th meeting in Geneva, Switzerland with less than 140 characters and a lot of acronyms. The official press release is, as usual, here. As you can see from the press release, MPEG produced significant results, namely:
  • MPEG Dynamic Adaptive Streaming over HTTP (DASH) ratified
  • 3D Video Coding: Evaluation of responses to Call for Proposals
  • MPEG royalty free video coding: Internet Video Coding (IVC) + Web Video Coding (WebVC)
  • High Efficiency Coding and Media Delivery in Heterogeneous Environments: MPEG-H comprising MMT, HEVC, 3DAC
  • Compact Descriptors for Visual Search (CDVS): Evaluation of responses to the Call for Proposals
  • Call for requirements: Multimedia Preservation Description Information (MPDI)
  • MPEG Augmented Reality (AR)
As you can see, that is a long list of achievements within a single meeting, but let's dig inside. For each topic I've also tried to provide some research issues which I think are worth investigating both inside and outside MPEG.

MPEG Dynamic Adaptive Streaming over HTTP (DASH): DASH=IS ✔

As the official press release states, MPEG has ratified its draft standard for DASH and, even better, the standard should become publicly available, which I expect to happen early next year, around March 2012 or maybe earlier. I say "should" because there is no guarantee that this will actually happen, but the signs are good. In the meantime, feel free to use our software to play around; we expect to update it to the latest version of the standard as soon as possible. Finally, IEEE Computer Society Computing Now has put together a theme on Video for the Universal Web featuring DASH.

Research issues: performance, bandwidth estimation, request scheduling (aka adaptation logic), and Quality of Service/Experience.
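To illustrate what bandwidth estimation and adaptation logic could look like on the client side, here is a minimal throughput-based sketch. It is only one possible approach and not part of the DASH standard, which deliberately leaves the adaptation logic open. The function names, the harmonic-mean estimator, the safety factor of 0.8, and the bitrate ladder are all assumptions made for illustration.

```python
from typing import Sequence


def estimate_bandwidth(samples_kbps: Sequence[float], window: int = 3) -> float:
    """Harmonic mean of the last `window` throughput samples in kbit/s.

    The harmonic mean is pulled down by slow downloads, which makes the
    estimate conservative (an assumption of this sketch)."""
    recent = list(samples_kbps)[-window:]
    if not recent:
        raise ValueError("need at least one throughput sample")
    return len(recent) / sum(1.0 / sample for sample in recent)


def select_bitrate(available_kbps: Sequence[int],
                   estimate_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest representation that fits within a safety margin
    of the estimate; fall back to the lowest one if nothing fits."""
    budget = estimate_kbps * safety
    candidates = [rate for rate in sorted(available_kbps) if rate <= budget]
    return candidates[-1] if candidates else min(available_kbps)


# Hypothetical bitrate ladder (kbit/s) and measured segment throughputs.
ladder = [400, 800, 1500, 3000]
samples = [2000.0, 2000.0, 1000.0]

estimate = estimate_bandwidth(samples)
choice = select_bitrate(ladder, estimate)
```

With these numbers the estimate is about 1500 kbit/s, the budget after the safety margin is 1200 kbit/s, and the client requests the 800 kbit/s representation for the next segment.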

3D Video Coding: 3DVC=CfP eval ✔

MPEG evaluated more than 20 proposals submitted as a response to the call issued back in April 2011. The evaluation of the proposal comprised subjective quality assessments conducted by 13 highly qualified test laboratories distributed around the world and coordinated by the COST Action IC1003 QUALINET. The report of the subjective test results from the call for proposals on 3D video coding will be available by end of this week. MPEG documented the standardization tracks considered in 3DVC (i.e., compatible with MVC, AVC base-view, HEVC, ...) and agreed on a common software based on the best-performing proposals.

Research issues: encoding efficiency of 3D depth maps and compatibility for the various target formats (AVC, MVC, HEVC) as well as depth map estimation at the client side.

MPEG royalty free video coding: IVC vs. WebVC

In addition to the evaluation of the responses to the call for 3DVC, MPEG also evaluated the responses to the Internet Video Coding call. Based on the responses, MPEG decided to follow up with two approaches, namely Internet Video Coding (IVC) and Web Video Coding (WebVC). The former - IVC - is based on MPEG-1 technology which is assumed to be royalty-free. However, it requires some performance boosts in order to make it ready for the Internet. MPEG's approach is a common platform called Internet video coding Test Model (ITM) which serves as the basis for further improvements. The latter - WebVC - is based on the AVC constrained baseline profile, whose performance is well-known and satisfactory but, unfortunately, it is not clear which patents of the AVC patent pool apply to this profile. Hence, a working draft (WD) of WebVC will be provided in order to get patent statements from companies. The WD will be publicly available by December 19th.

Research issues: coding efficiency when using only royalty-free coding tools, whereby the optimization is first towards royalties and then towards efficiency.

MPEG-H

A new star is born which is called MPEG-H, referred to as "High Efficiency Coding and Media Delivery in Heterogeneous Environments", comprising three parts: Pt. 1 MMT, Pt. 2 HEVC, Pt. 3 3D Audio. There's a document called context and objective of MPEG-H but I can't find out whether it's public (I will come back to this later).

Part 1: MMT (MPEG Media Transport) is progressing (slowly) but a next step should be definitely to check the relationship of MMT and DASH for which an Ad-hoc Group has been established (N12395), subscribe here, if interested.
Research issues: very general at the moment, what is the best delivery method (incl. formats) for future multimedia applications? Answer: It depends, ... ;-)

Part 2: HEVC (High-Efficiency Video Coding) made significant progress at the last meeting, in particular: only one entropy coder (note: AVC has two, CABAC and CAVLC which are supported in different profiles), 8 bit decoding (could be also 10 bit, probably done in some profiles), specific integer transform, stabilized and more complete high-level syntax and HRD description (i.e., reference picture buffering, tiles, slices, and wavefronts enabling parallel decoding process). Finally, a prototype has been demonstrated decoding HEVC in software on an iPad 2 at WVGA resolution and the 10min Big Buck Bunny sequence at SD resolution with avg. 800 kbit/s which clearly outperformed the corresponding AVC versions.
Research issues: well, coding efficiency, what else? The ultimate goal is to have a performance gain of more than 50% compared to the predecessor, which is AVC.

Part 3: 3D Audio Coding (3DAC) is in its early stages but there will be an event during the San Jose meeting which will be announced here. As of now, use cases are provided (home theatre, personal TV, smartphone TV, multichannel TV) as well as candidate requirements and evaluation methods. One important aspect seems to be user experience for highly immersive audio (i.e., 22.2, 10.2, 5.1) including bitstream adaptation for low-bandwidth and low-complexity.
Research issues: sorry, I'm not really an audio guy but I assume it's coding efficiency, specifically for 22.2 channels ;-)

Compact Descriptors for Visual Search (CDVS)

For CDVS, responses to the call for proposals (from 10 companies/institutions) have been evaluated and a test model has been established based on the best performing proposals. The next steps include the improvement of the test model towards inclusion in the MPEG-7 standard.
Research issues: descriptor efficiency for the intended application as well as precision on the information retrieval results.
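As a rough illustration of what matching with compact descriptors and measuring retrieval precision can mean, the following sketch ranks database images by Hamming distance between short binary descriptors and computes precision at k. It is an assumption-laden toy and not the CDVS test model: the 8-bit descriptors, the image names, and the helper functions are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def rank(query: int, database: Dict[str, int]) -> List[Tuple[str, int]]:
    """Sort database entries by increasing Hamming distance to the query."""
    scored = ((name, hamming(query, descriptor))
              for name, descriptor in database.items())
    return sorted(scored, key=lambda item: item[1])


def precision_at_k(ranked: List[Tuple[str, int]],
                   relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    top = [name for name, _ in ranked[:k]]
    return sum(1 for name in top if name in relevant) / k


# Hypothetical 8-bit descriptors; real descriptors are far longer.
database = {
    "landmark_a": 0b1011_0010,
    "landmark_b": 0b1011_0110,
    "poster_c": 0b0100_1101,
}
query = 0b1011_0011

ranked = rank(query, database)
p_at_2 = precision_at_k(ranked, relevant={"landmark_a", "landmark_b"}, k=2)
```

The trade-off hinted at in the research issues shows up directly here: shorter descriptors cost less to transmit and compare, but leave fewer bits to separate relevant from irrelevant images.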

Multimedia Preservation Description Information (MPDI)

The aim of this new work item is to provide "standard technology helping users to preserve digital multimedia that is used in many different domains, including cultural heritage, scientific research, engineering, education and training, entertainment, and fine arts for long-term across system, organizational, administrative and generational boundaries". It comes along with two public documents, the current requirements and a call for requirements, with responses due at the 100th MPEG meeting in April 2012.
Research issues: What and how to preserve digital multimedia information?

Augmented Reality (AR)

MPEG's newest project is on Augmented Reality (AR), starting with an application format for which a working draft exists. Furthermore, draft requirements and use cases are available. These three documents will be available on Dec 31st.
Research issues: N/A

Finally, I hope now you can better understand what I've put at the beginning with all these acronyms ...

#MPEG98 report: DASH=IS ✔ CDVS=CfP eval ✔ {MMT, HEVC, 3DAudio}=MPEG-H ✔ IVC={IVC, WebVC} ✔ 3DVC=CfP eval ✔

Wednesday, July 27, 2011

MPEG news: a report from the 97th meeting, Torino, Italy

The 97th MPEG meeting in Torino brought some interesting news which I'd like to report here briefly. Of course, as usual, there is the official press release; however, I'd like to report on some interesting topics as follows:
  • MPEG Unified Speech and Audio Coding (USAC) reached FDIS status
  • Call for Proposals: Compact Descriptors for Visual Search (CDVS)
  • Call for Proposals: Internet Video Coding (IVC)
  • DIS on MPEG Dynamic Adaptive Streaming over HTTP (DASH)
MPEG Unified Speech and Audio Coding (USAC) reached FDIS status

ISO/IEC 23003-3 aka Unified Speech and Audio Coding (USAC) reached FDIS status and soon will be an International Standard. The FDIS itself won't be publicly available, but the Unified Speech and Audio Coding Verification Test Report will be, in September 2011 (most likely here).

Call for Proposals: Compact Descriptors for Visual Search (CDVS)

I reported previously about that and here comes the final CfP including the evaluation framework.

MPEG is planning standardizing technologies that will enable efficient and interoperable design of visual search applications. In particular we are seeking technologies for visual content matching in images or video. Visual content matching includes matching of views of objects, landmarks, and printed documents that is robust to partial occlusions as well as changes in vantage point, camera parameters, and lighting conditions.

There are a number of component technologies that are useful for visual search, including format of visual descriptors, descriptor extraction process, as well as indexing, and matching algorithms. As a minimum, the format of descriptors as well as parts of their extraction process should be defined to ensure interoperability.

It is envisioned that a standard for compact descriptors will:
  • ensure interoperability of visual search applications and databases, 
  • enable high level of performance of implementations conformant to the standard,
  • simplify design of descriptor extraction and matching for visual search applications, 
  • enable hardware support for descriptor extraction and matching in mobile devices,
  • reduce load on wireless networks carrying visual search-related information.
It is envisioned that such standard will provide a complementary tool to the suite of existing MPEG standards, such as MPEG-7 Visual Descriptors. To build full visual search application this standard may be used jointly with other existing standards, such as MPEG Query Format, HTTP, XML, JPEG, JPSec, and JPSearch.

The Call for Proposals and the Evaluation Framework are publicly available. From a research perspective, it would be interesting to see how technologies submitted as an answer to the CfP compete with existing approaches and applications/services.

In this context, it is probably worth looking at IEEE Multimedia Jul.-Sep. 2011 issue which is dedicated to visual content: identification and search including an overview about this new MPEG standard.

Call for Proposals: Internet Video Coding (IVC)

I reported previously about that and the final CfP for Internet Video Coding Technologies is available here. The requirements reveal some interesting issues the call is about:
  • Real-time communications, video chat, video conferencing,
  • Mobile streaming, broadcast and communications,
  • Mobile devices and Internet connected embedded devices 
  • Internet broadcast streaming, downloads
  • Content sharing.
Requirements fall into the following major categories:
  • IPR requirements
  • Technical requirements
  • Implementation complexity requirements 
Clearly, this work item has an optimization towards IPR but others are not excluded. In particular,
It is anticipated that any patent declaration associated with the Baseline Profile of this standard will indicate that the patent owner is prepared to grant a free of charge license to an unrestricted number of applicants on a worldwide, non-discriminatory basis and under other reasonable terms and conditions to make, use, and sell implementations of the Baseline Profile of this standard in accordance with the ITU-T/ITU-R/ISO/IEC Common Patent Policy. 

MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH)

For all DASH enthusiasts, the latest - and probably almost final - version of DASH-related standards can be found here. Please note that DASH has been reorganized into MPEG-DASH referred to as ISO/IEC DIS 23009-1.2, Part 1: Media presentation description and segment formats. Additionally, you might be interested in the following drafts:
  • ISO/IEC 14496-12:2008/DAM 3, Part 12: ISO base media file format, AMENDMENT 3: DASH support and RTP reception hint track processing
  • ISO/IEC FDIS 23001-7, Part 7: Common encryption format for ISO base media file format
All these DASH-related documents are publicly available here. In terms of implementation, the interested reader might check out the ITEC-DASH VLC-based implementation and GPAC (which provides basic support for DASH) respectively.

You may find further information at the MPEG Web site, specifically under the hot news section and the press release. Working documents of any MPEG standard so far can be found here. If you want to join any of these activities, the list of Ad-hoc Groups (AhG) is available here (soon also here) including the information on how to join their reflectors.