Wednesday, March 10, 2010

ACM Multimedia Grand Challenge 2010: Content Adaptation

One of the ACM Multimedia Grand Challenge 2010 challenges is about content adaptation, probably one of THE tools for providing Universal (Multi-)Media Access (UMA). In particular, it is called "Radvision Challenge 2010: Real-time Data Collaboration Adaptation for Multi-Device Video Conferencing" and the details can be found here, with the input/output described as follows:
Input for this challenge is a video capture of a free-hand drawing (see example video) in XGA.
Output for this challenge should be a set of “adapted” videos, with the same content in different (smaller) resolutions – for instance, VGA and QVGA. The adapted videos would ideally be regarded by users as perceptually optimal, meaning they hold the same content as the original.
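As a point of reference, the most naive form of such adaptation is plain rescaling. Here is a minimal sketch, assuming OpenCV and hypothetical file names, that turns the XGA (1024x768) source into VGA and QVGA renditions. This is of course the trivial baseline, not the "perceptually optimal" adaptation the challenge asks for:

    import cv2

    # Hypothetical file names; the source is assumed to be XGA (1024x768).
    targets = {"vga": (640, 480), "qvga": (320, 240)}

    cap = cv2.VideoCapture("drawing_xga.avi")
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    writers = {name: cv2.VideoWriter("drawing_%s.avi" % name, fourcc, fps, size)
               for name, size in targets.items()}

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for name, size in targets.items():
            # INTER_AREA is the usual interpolation choice for downscaling
            writers[name].write(cv2.resize(frame, size, interpolation=cv2.INTER_AREA))

    cap.release()
    for w in writers.values():
        w.release()
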
The metrics for evaluation are "defined" as follows:
The following criteria could be used, as well as other evaluation metrics that you may devise:
  1. Subjective comparison between the perceptual quality of the original and the “adapted” content.
  2. Subjective comparison between the perceptual quality of a scaled-down version of the original (using a 5-tap poly-phase filter) and the “adapted” content.
  3. Real-time Performance.
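The baseline in criterion 2 is interesting because the exact filter is not given. A minimal sketch of such a polyphase-filtered downscale, using scipy's resample_poly as a stand-in (its default Kaiser-windowed filter is not the organizers' 5-tap filter, so treat this purely as an assumed approximation), with 5/8 taking XGA down to VGA:

    import numpy as np
    from scipy.signal import resample_poly

    def downscale_polyphase(frame, up=5, down=8):
        """Downscale an HxWx3 frame by up/down per axis (5/8: XGA -> VGA)."""
        out = resample_poly(frame.astype(np.float64), up, down, axis=0)
        out = resample_poly(out, up, down, axis=1)
        return np.clip(out, 0, 255).astype(np.uint8)
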
However, there are many possibilities for adapting content and evaluating the result, and which one is best heavily depends on the user's context. Some people may think that the description of the input/output as well as the evaluation criteria are defined too vaguely, and I tend to agree. Let me explain:
  1. The input is given and for the output it is requested to produce "a set of adapted videos" that is "regarded by users as perceptually optimal". However, it is not clear to which context the input shall be adapted. The text mentions "different (smaller) resolutions" as an example, but I can imagine that users will prefer the original video and regard it as perceptually optimal compared to anything else. Thus, in my view it is necessary to specify the context to which the video shall be adapted. The context may include many things such as the terminal device, decoding capabilities, network conditions, user location (stationary vs. mobile), etc. (see the first sketch after this list).
  2. Subjective quality assessment is not an easy task and there are many possible approaches. From the description above it is not clear how the subjective quality evaluation will be performed. In particular, I wonder whether "real" subjective tests as suggested by the Video Quality Experts Group (VQEG) will be adopted (e.g., DSIS - Double Stimulus Impairment Scale or ACR - Absolute Category Rating, to name just two). In my view, and in order to provide a fair evaluation, it is absolutely necessary to define the exact procedure for how the subjective evaluation of the submissions will be performed. One possibility, of course, is the adoption of a standardized approach, and DSIS is probably the right candidate (see the second sketch after this list).
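To make the first point concrete, here is a sketch of the kind of usage-context description that would be needed before "perceptually optimal" can even be judged. All fields and the selection rule are hypothetical, loosely inspired by MPEG-21 DIA usage environment descriptions:

    from dataclasses import dataclass

    @dataclass
    class UsageContext:
        display_resolution: tuple  # e.g., (320, 240) for a QVGA handset
        supported_codecs: list     # e.g., ["h264-baseline"]
        downlink_kbps: int         # current throughput estimate
        mobility: str              # "stationary" or "mobile"

    def pick_target_resolution(ctx):
        """Largest standard resolution the terminal can actually display."""
        for w, h in [(1024, 768), (640, 480), (320, 240)]:
            if w <= ctx.display_resolution[0] and h <= ctx.display_resolution[1]:
                return (w, h)
        return (320, 240)

    ctx = UsageContext((640, 480), ["h264-baseline"], 384, "mobile")
    print(pick_target_resolution(ctx))  # -> (640, 480)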

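And to make the second point concrete: whatever test method is chosen, the collected votes have to be condensed into comparable numbers. A minimal sketch of the usual summary, per-condition mean opinion scores with 95% confidence intervals as in ITU-R BT.500 (the score vector below is made-up placeholder data):

    import numpy as np

    def mos_with_ci(ratings, z=1.96):
        """MOS and 95% confidence interval for one test condition."""
        r = np.asarray(ratings, dtype=np.float64)
        mos = r.mean()
        ci = z * r.std(ddof=1) / np.sqrt(len(r))  # normal approximation
        return mos, ci

    votes = [5, 4, 4, 5, 3, 4, 4, 5, 4, 4]  # placeholder 5-point DSIS votes
    mos, ci = mos_with_ci(votes)
    print("MOS = %.2f +/- %.2f" % (mos, ci))
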
1 comment:

Unknown said...

Christian,

First of all, thank you for the time you took to read the challenge and comment.

Regarding the issues you raised, I commented in length in the original challenge page: http://comminfo.rutgers.edu/conferences/mmchallenge/2010/02/10/radvision-challenge-adaptation/comment-page-1/#comment-1844

Best Regards,
Sagee.