Multimedia Communication: October 2008

Friday, October 31, 2008

Workshop on the Many Faces of Multimedia Semantics

This workshop was held together with ACM Multimedia and included an interesting keynote about semantic video annotation using ontologies. It seems that classical knowledge engineering is approaching the search/metadata/annotation community. Anyway, the talk was very interesting as it showed a way to generate more complex rules (SWRL) from basic rules based on the ontology, that is, annotating multimedia content by applying reasoning techniques on top of ontologies.

During this workshop I learned - among other things - about Multimedia OWL (MOWL), which describes spatio-temporal relationships of objects within media, including a probabilistic association thereof.

Btw., I also presented our results from the MPEG core experiment related to the semantics of MPEG-21 Digital Items. People agreed that we need an equivalent of an audio/video decoder but for metadata. Current metadata specifications define only the syntax (+semantics) but provide no means to "decode" metadata.

ACM Multimedia 2009 will be held in Beijing and 2010 in Florence.

ACM Multimedia 2008: Day 3

The open source track was a great success thanks to its contributors, who presented:

Network-Integrated Multimedia Middleware (NMM)**
LIRe: Lucene Image Retrieval (An Extensible Java CBIR Library)
GpuCV (An OpenSource GPU-Accelerated Framework for Image Processing and Computer Vision)
An Open Source Software Framework for DVB-* Transmission: OpenCaster and FATCAPS
FOBS: An Open Source Object-Oriented Library for Accessing Multimedia Content

** ... awarded with the ACM Multimedia Open Source Award.

The panel discussion was about multimedia education: can we find unity in diversity? All panelists concluded that there's a lack of good, comprehensive textbooks. Furthermore, we should also think about a multimedia curriculum, which should be multi-disciplinary, i.e., should have modules from computer science, electrical engineering, and "arts" (e.g., production tools). See also here...

Thursday, October 30, 2008

ACM Multimedia 2008: Day 2

On day 2 of this year's ACM Multimedia, a few things are worth mentioning here.

First, there was an interesting talk from Yahoo! Research about resolving tag ambiguity where they measure ambiguity of photo tags as a difference between distributions and perform some optimizations. User tests show that 20-26 terms are enough to “compute” ambiguity. They've found different ambiguities such as geographic, temporal, etc.

Second, the brave new topics - controversial by definition! - included a presentation about social signal processing (SSP). This was first mentioned in the Signal Processing Magazine in 2007. One of the (main) open issues is bringing psychology and engineering closer together (which is already happening, by the way). The presentation came from a Network of Excellence funded by EC FP7 with a duration of 5 years! One of the main goals: create THE Web portal for SSP ;-) Basically, the signals come from sensors such as microphones and cameras, and the aim is to process/analyze the social behavior of users. Honestly, to me, that's yet another multimedia analysis approach unless they get more sensors involved. But what does a stranger know...

Finally, the social event provided a decent dinner but ended rather early and quickly. We found asylum in a sports bar near our hotel. Our colleagues from Toulouse got the best paper award and Prof. Steinmetz received the SIGMM award for outstanding technical contributions to multimedia computing, communications and applications. Congratulations!

Tuesday, October 28, 2008

ACM Multimedia 2008: Day 1

This year's ACM Multimedia was opened with a keynote about Internet 3.0, the next generation Internet. However, topics like self-organization, Quality of Service, and Next Generation Networks (NGNs) were hardly discussed in this talk, if at all. Please check for yourself whether you find it useful or not.

The best paper session included - among others - a presentation of the Flickr distance that calculates a distance metric based on tagged images and extracted features. Very interesting and promising results though! Our colleagues from Toulouse presented an approach to stream 3D contents to heterogeneous devices...

Systems Track: Video Streaming - I was somehow disappointed by the P2P-related presentations as the results are mainly based on simulations without proper discussions thereof. Perhaps I just joined the wrong session.

More to come during the week...

Friday, October 17, 2008

MPEG news: a report from the 86th meeting in Busan, Korea

A lot of interesting things happened here and I'd like to report on three topics:

MPEG RoSE
Advanced IPTV Terminal
High-performance Video Coding

MPEG RoSE: At this meeting we've issued the second version of the WD, which will be publicly available and, thus, I can provide a more detailed overview here. I've also updated my slides from the last meeting, which can now be found on SlideShare. The aim of RoSE is to extend traditional A/V content consumption to the dimension of sensory effects, which are referred to as "an effect to augment perception by stimulating human senses in a particular scene of a multimedia application". With that definition in mind, sensory effects are composed by following the structure of the Sensory Effect Description Language (SEDL) and making use of terms (actually, effect types) of the Sensory Effect Vocabulary (SEV).

Sensory Effect Description Language (SEDL): Provides basic building blocks (declaration, effects, group of effects, reference to effects, parameters), common attributes (timing, priority, intensity, etc.), and data types (void at the moment) for constructing/authoring sensory effect metadata which is associated with A/V content.

Sensory Effect Vocabulary (SEV): Provides a clear set of extensible effect types, currently comprising effects for color (illumination), temperature (°C), wind (Beaufort), and vibration (Richter).

It is foreseen that the sensory effect metadata which comes along with the A/V content is translated (or mapped or adapted) - by a module (hw/sw) called RoSE engine - to commands that are understood by so-called sensory devices with certain capabilities. Both commands and capabilities are also within the scope of standardization. Hence, it should be possible to consume the A/V content synchronized in time with its effects for an increased user experience. Furthermore, user preferences might also affect this translation/mapping/adaptation process.
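To make the translation step a bit more concrete, here is a minimal Python sketch of what such a mapping could look like. It is not the SEDL/SEV schema and not the RoSE engine from the WD: the class names, fields, the normalized 0..1 intensity, and the per-effect user scaling factor are all hypothetical and only illustrate the idea of mapping effect metadata onto devices with given capabilities.

```python
from dataclasses import dataclass


@dataclass
class SensoryEffect:
    """Hypothetical effect description (not the actual SEDL structure)."""
    effect_type: str   # e.g. "wind", "temperature", "vibration", "color"
    start_s: float     # start time relative to the A/V content, in seconds
    duration_s: float
    intensity: float   # assumed to be normalized to 0.0..1.0
    priority: int = 0


@dataclass
class DeviceCapability:
    """Hypothetical capability description of a sensory device."""
    device_id: str
    effect_type: str
    max_level: int     # highest discrete level the device supports


@dataclass
class DeviceCommand:
    device_id: str
    start_s: float
    duration_s: float
    level: int


def map_effects(effects, capabilities, user_scale=None):
    """Toy mapping step: translate effects into commands for capable devices.

    user_scale is an assumed form of user preference: a per-effect-type
    factor applied to the authored intensity.
    """
    user_scale = user_scale or {}
    devices_by_type = {}
    for cap in capabilities:
        devices_by_type.setdefault(cap.effect_type, []).append(cap)

    commands = []
    # Process effects in timeline order; higher priority first on ties.
    for eff in sorted(effects, key=lambda e: (e.start_s, -e.priority)):
        scale = user_scale.get(eff.effect_type, 1.0)
        intensity = max(0.0, min(1.0, eff.intensity * scale))
        # Effects without a matching device are silently dropped.
        for cap in devices_by_type.get(eff.effect_type, []):
            level = round(intensity * cap.max_level)
            commands.append(
                DeviceCommand(cap.device_id, eff.start_s, eff.duration_s, level)
            )
    return commands


effects = [
    SensoryEffect("wind", start_s=12.0, duration_s=5.0, intensity=0.8),
    SensoryEffect("vibration", start_s=14.0, duration_s=1.0, intensity=1.0),
]
capabilities = [DeviceCapability("fan-1", "wind", max_level=3)]
commands = map_effects(effects, capabilities, user_scale={"wind": 0.5})
for cmd in commands:
    print(cmd)
```

In this toy setup only the wind effect produces a command, since no vibration device is declared, and the user preference halves the authored intensity before it is quantized to the fan's levels.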

Advanced IPTV Terminal: This activity aims to define a terminal suitable for IPTV scenarios jointly with ITU, which may result in a construction similar to the JVT for video coding standardization. However, nothing has been fixed yet and interested parties are invited to join the discussions via the corresponding Ad-hoc Group (AhG) [subscribe] with the following mandates:

Look after the process of establishing the joint project with ITU-T so that SG16 may be presented with a matured proposal for a new standard for Advanced IPTV Terminal
Gather use case scenarios and requirements for Advanced IPTV Terminal
Conduct collaboration with ITU-T Q.13/16 to prepare a joint meeting

High-performance Video Coding: I've already reported on that in previous blog posts, see part one and two for details.

Finally, MPEG got an Emmy! ;-)

What are the new challenges in video coding standardization? (part two)

The results of the workshop will be available as output document N10174 with the main conclusions as follows:

Larger formats (resolutions) are becoming more and more popular.
Video bitrate - especially uncompressed - is (far) ahead of the technologies for economic network transmission for both wired and wireless networks, i.e., bitrate >(>) bandwidth.
Next generation of video compression technology is needed with higher compression capabilities than the AVC standard.
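To get a feeling for the "bitrate >(>) bandwidth" point, the uncompressed bitrate can be estimated as width × height × frame rate × bits per pixel. The following Python sketch does just that; the frame rates, the 3840x2160 interpretation of 4Kx2K, and the 12 bits per pixel (8-bit 4:2:0 sampling) are my own assumptions and are not taken from N10174.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bitrate in bits per second.

    bits_per_pixel=12 corresponds to 8-bit 4:2:0 sampling (an assumption).
    """
    return width * height * fps * bits_per_pixel


# Illustrative formats: (width, height, frames per second), all assumed.
formats = {
    "VGA": (640, 480, 30),
    "1080p": (1920, 1080, 60),
    "4Kx2K": (3840, 2160, 60),
}

for name, (width, height, fps) in formats.items():
    mbit_per_s = raw_bitrate_bps(width, height, fps) / 1e6
    print(f"{name}: {mbit_per_s:.0f} Mbit/s uncompressed")
```

Even under these modest assumptions, 1080p already lands in the Gbit/s range before compression, which is far more than typical access networks can carry economically.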

However, there's a need for more evidence that goes beyond the initial optimism where 35% bitrate reduction was reported for 1080p. Therefore, MPEG will adopt its usual process which consists of:

Collecting requirements mainly targeting compression performance and higher resolutions
Call for test material (N10176) seeking video sequences with VGA, 1080p, or 4Kx2K resolution captured with current camera equipment.
Call for evidence to let the industry (in particular, the national bodies) respond to the idea of developing a High-performance Video Coding (HVC) standard.

I'll provide references to the above-mentioned documents as soon as they become available! Interested parties are invited to discuss issues related to HVC as part of an Ad-hoc Group (AhG) with the following mandates:

To further discuss vision, applications and requirements of high-performance video coding.
To distribute the Call for Test Sequences and assist the video chair in collecting and evaluating new test sequences for the upcoming Call for Evidence.
To further discuss and develop conditions and testing methodologies for the Draft Call for Evidence.

Reference:

Part one

MPEG got an Emmy!

At this week's MPEG meeting, MPEG celebrated the JVT's receipt of the 2008 Primetime Emmy Engineering Award. The JVT received it for its work on the Advanced Video Coding (AVC) High Profile. The actual awards - three of them - are now hosted at the three main standardization bodies that worked on AVC, namely IEC, ISO, and ITU. However, somehow Gary Sullivan managed to get an additional Emmy statue for MPEG and presented it to Leonardo Chiariglione (MPEG convenor). The MPEG Emmy statue will travel from national body to national body based on MPEG's meeting schedule. That is, the next chance to see the Emmy statue live will be in Lausanne, CH at the 87th MPEG meeting.

During the Friday plenary I also got the chance to take a picture of/with the Emmy statue ;-) Another picture can be found here.

Tuesday, October 14, 2008

What are the new challenges in video coding standardization? (part one)

At today's workshop on new challenges in video coding standardization, MPEG tried to figure out what's beyond AVC, SVC, and MVC. Even before the workshop started, a new acronym was found for this envisaged standard: HVC, which stands for High-Performance Video Coding. In my view this already provides a rough direction of where MPEG is heading, i.e., higher resolutions (beyond 4K a.k.a. UD, which is four times full HD) and higher framerates (beyond 120Hz up to 180/200/240Hz). Interestingly, displays that can handle such resolutions and framerates will be available around 2012, which would require starting video coding standardization for that right now. Another issue that needs to be considered is the viewing angle with such huge display sizes. Currently, the distance between display and viewer is about three times the display height, which results in a viewing angle of 33° using a 40'' TV set. With an 80'' device the viewing angle increases to 63°, which needs to be considered both by the codec and the display, of course. However, it is not clear which kind of application (domain) MPEG is targeting for this possible future video coding standard.
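The viewing-angle figures can be checked with a bit of trigonometry. The Python sketch below assumes a 16:9 display, a horizontal viewing angle, and a viewer who stays at the same absolute distance (three times the height of the 40'' set) when switching to the 80'' device; none of these assumptions were spelled out at the workshop. Under them the 40'' case reproduces the 33° and the 80'' case comes out at roughly 61°, close to the 63° quoted above, so the talk presumably used slightly different assumptions.

```python
import math

ASPECT = 16 / 9  # assumed widescreen aspect ratio


def screen_size(diagonal, aspect=ASPECT):
    """Return (width, height) of a screen with the given diagonal."""
    height = diagonal / math.sqrt(1 + aspect ** 2)
    return aspect * height, height


def horizontal_viewing_angle(width, distance):
    """Horizontal angle (in degrees) subtended by the screen width."""
    return math.degrees(2 * math.atan((width / 2) / distance))


# 40'' set viewed from three times its display height.
width_40, height_40 = screen_size(40)
distance = 3 * height_40
angle_40 = horizontal_viewing_angle(width_40, distance)

# 80'' set viewed from the same absolute distance (assumption).
width_80, _ = screen_size(80)
angle_80 = horizontal_viewing_angle(width_80, distance)

print(f"40'': {angle_40:.1f} degrees")
print(f"80'': {angle_80:.1f} degrees")
```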

Of course, there's much to say about that and I'll report on it later during the week.

Finally, I also learned another acronym (note: MPEG is full of acronyms) which is ODS that stands for Other Digital Stuff.

Thursday, October 2, 2008

Busan Workshop on New Challenges in Video Coding Standardization – Program

Video compression has been a very active area of defining standards over the last 30 years. To face the challenges that emerging applications impose on the requirements of video coding standardization, ISO/IEC WG11 (MPEG) will hold a full-day workshop on 14 October 2008, during the 86th WG11 meeting in Korea.

The key intention of the workshop is to acquire solid information about the context in which video coding will be operating in the future, which will enable MPEG to draw conclusions for the needs and chances in video coding standardization during the next years and to start drafting three key documents: technology context, applications and requirements for a new High-Performance Video Coding (HVC) standard. For this purpose speakers have been invited on key topics for the morning sessions, and in addition regular proposed contributions were accepted for the noon and afternoon sessions.

The Workshop will be held on 14 October 2008 from 9:00-18:00 at Crystal Ballroom #3, 3rd Floor, Busan Lotte Hotel, 503-15 Bujeon-Dong Pusanjin-Gu Busan, Korea 614-030.

Detailed Program

9:00-9:10 Welcome and Introduction (Leonardo Chiariglione)

Invited Session 1: Video Coding and Next-Generation Networks
(Chair: Jens-Rainer Ohm)

9:10-9:40 Tomonori Aoyama (Keio University):
Direction of digital media and content evolution and a new generation network to support it

9:40-10:10 Jeongyeon Lim (SK Telecom), Simon Ji (LG Electronics), Taesung Park and Daesung Cho (Samsung Electronics), Jae-Seob Shin (Pixtree):
Experiences and forecasts on mobile video services by manufacturers and operators

10:10-10:40 Doug Y. Suh (KHU), Won Ryu and Jeong Joo Yoo (ETRI):
MPEG-64 (MPEG over IPv6 and 4G networks)

10:40-11:00 Coffee Break

Invited Session 2: Video Coding for Future Applications and Devices
(Chair: Jörn Ostermann)

11:00-11:30 Seonki Kim (Samsung):
Advanced Technology in LCD Display – New Driving Scheme and Advanced Super PVA Technology

11:30-12:00 Jonghwa Kim (Samsung):
Flash Memory for Packaged Media: What it can do and where it fits

Regular Session 1: Technology Context of Future Video Coding
(Chair: Ajay Luthra)

12:00-12:30 Euee S. Jang (Hanyang University):
Reconfigurable Video Coding – A Building Block for Future MPEG Coding Standards

12:30-12:50 Kim Kyunghoon, Kim Nacwoo, Kim Sangkyune, Son Seungchul and Lee Byungtak (ETRI):
The necessity of a New MPEG Standard Supporting Real-time Distributed IPTV Environment

12:50-14:10 Lunch Break

Regular Session 2: Compression Technology
(Chair: T.K. Tan)

14:10-14:30 Geert Van der Auwera and Yeong Taeg Kim (Samsung Information Systems):
Triangular Sub-Macroblock Partitioning for Motion Compensated Prediction

14:30-14:50 Munchurl Kim (ICU), Changseob Park (KBS):
Beyond Macroblock based Predictive Coding

14:50-15:10 Kyohyuk Lee, Elena Alshina, Jeonghoon Park, Woojin Han and Junghye Min (Samsung):
Technical considerations on new challenges in video coding standardization

15:10-15:30 Johannes Ballé, Steffen Kamp, Aleksandar Stojanovic, Mathias Wien and Jens-Rainer Ohm (RWTH Aachen University):
Tools for Improving Texture and Motion Compression

15:30-16:00 Coffee Break

16:00 Open Discussion and Conclusions

18:00 End of Workshop

Participants who are not regularly attending the 86th MPEG meeting should register by sending an email to Sungwook Jung (swjung@kisi.or.kr) with subject line 'Registration for Busan video coding workshop' and including contact data in the mail body (name/title, company/affiliation, address/phone/fax/email).

CfP: International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) 2009

May 6-8 2009, London, UK

Call for Papers

The International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) is one of the main international fora for the presentation and discussion of the latest technological advances in interactive multimedia services. The objective of the workshop is to bring together researchers and developers from academia and industry working in all areas of image, video and audio applications, with a special focus on analysis.

Topics of interest include, but are not limited to:

Multimedia content analysis and understanding
Content-based browsing, indexing and retrieval of images, video and audio
2D/3D feature extraction
Advanced descriptors and similarity metrics for audio and video
Relevance feedback and learning systems
Segmentation of objects in 2D/3D image sequences
Motion analysis and tracking
Video analysis and event recognition
Analysis for coding efficiency and increased error resilience
Analysis and tools for content adaptation
Multimedia content adaptation tools, transcoding and transmoding
Content summarization and personalization strategies
End-to-end quality of service support for Universal Multimedia Access
Semantic mapping and ontologies
Multimedia analysis for new and emerging applications
Multimedia analysis hardware and middleware
Semantic web and social networks
Advanced interfaces for content analysis and relevance feedback

Paper Submission

The intention is to publish the proceedings in Springer's Lecture Notes in Computer Science series and to make them available in IEEE Xplore. Authors are requested to send their submissions (4 pages, double column, in English). All submissions will be peer reviewed by at least three members of the technical program committee.

Important Dates:

Proposal for Special Session: November 21, 2008
Paper Submission: December 1, 2008
Notification of Acceptance: January 16, 2009
Camera-ready Papers: February 06, 2009
