IEEE VCIP 2021 Special Sessions
Submission of Papers for Regular, Demo, and Special Sessions (extended): June 27, 2021
Paper Acceptance Notification: August 30, 2021
Title: Learning-based Image and Video Coding
Organizers: João Ascenso (Instituto Superior Técnico), Elena Alshina (Huawei)
Description: Image and video coding algorithms create compact representations of visual content by exploiting its spatial and temporal redundancy as well as its perceptual irrelevance, i.e. the characteristics of the human visual system. Recently, data-driven algorithms such as neural networks have attracted a lot of attention and become a popular area of research and development. This interest is driven by several factors, such as recent advances in processing power (cheap and powerful hardware), the availability of large data sets (big data), and several algorithmic and architectural advances (e.g. generative adversarial networks).
Nowadays, neural networks are the state-of-the-art for several computer vision tasks, such as those requiring a high-level understanding of image semantics, e.g. image classification, object segmentation, saliency detection, but also low-level image processing tasks, such as image denoising, inpainting, and super-resolution. These advances have led to an increased interest in applying deep neural networks to image and video coding, which is now the main focus of the JPEG AI and the JVET NN activities within the JPEG and MPEG standardization committees.
The aim of these novel image and video coding solutions is to design a compact representation model that is obtained (learned) from a large amount of visual data and can efficiently represent the wide variety of images and videos consumed today. Some of the available learning-based image coding solutions already show very promising experimental results in terms of rate-distortion (RD) performance, notably in comparison with conventional standard image codecs (especially HEVC Intra and VVC Intra), which code the image data with hand-crafted transforms, entropy coding, and quantization schemes.
This special session on Learning-based Image and Video Coding gathers technical contributions that demonstrate efficient coding of image and video content with learning-based approaches. The topic has attracted many contributions in recent years and is considered critical for the future of both image and video coding, especially for solutions adopting end-to-end training as well as for those in which learning-based tools replace conventional hand-crafted tools.
Organizers: Joongki Park (ETRI, South Korea), Elena Stoykova (Bulgarian Academy of Sciences)
Description: Holography is a technique that can record and reconstruct the full nature of light, including its intensity and phase. Thanks to this special feature, holography has attracted considerable attention and stimulated novel applications. Early holography was purely analog, because everything had to be recorded and reconstructed on photographic film. From the late 1980s, however, the emergence of display panels, image sensors, and increasingly powerful computer technologies made it possible to digitally capture, process, and visualize holograms. In this transition from analog to digital holography, the importance of digital holographic image signal processing has emerged. Its techniques have numerous applications in the capture/creation, transformation/synthesis/compression, and display/visualization of digital holograms.
The aim of this special session is to present topics such as natural-light holographic cameras, realistic CGH, holographic metrology, and GPU-based generation of CGH at resolutions above 100K. Topic areas include recording/synthesis of digital holograms, computer-generated holography (CGH), denoising of digital holograms, novel transforms and compression coding technologies, modeling of perceptual visual quality, novel holographic processes in holographic metrology, compressive holography, and holographic image processing with various light sources, from coherent to incoherent.
Organizers: Marc ANTONINI (University Côte d’Azur and CNRS), Melpomeni DIMOPOULOU (University Côte d’Azur and CNRS)
Description: Today, we live in an increasingly digital society. Data storage is the foundation that drives our society, from enabling data-driven decisions based on machine learning to preserving our collective knowledge for posterity. Most of the world's data is stored on magnetic media, such as Hard Disk Drives (HDD) and tape, or on optical media. Unfortunately, there is an industry-wide consensus that all current storage technologies suffer from fundamental density and durability limitations that seriously question our ability to even store, much less process, all of the world's data in the near future.
The vision to which this research contributes is the replacement of today's storage media with a radically new, bio-inspired alternative: synthetic DNA. Interestingly, recent works have proven that storing digital data in DNA is not only feasible but also very promising, as DNA's biological properties allow a great amount of information to be stored in an extraordinarily small volume for centuries or even longer, with no loss of information.
This special session aims to present a panel of worldwide research carried out on the storage of digital information in DNA.
Organizers: Frederik Temmermans (Vrije Universiteit Brussel, Belgium), Deepayan Bhowmik (University of Stirling, United Kingdom)
Description: Recent advances in artificial intelligence, especially deep learning, for media manipulation enable users to produce near-realistic media content that is almost indistinguishable from authentic content to the human eye. These developments open a multitude of opportunities, from creative content production, the art industry, and digital restoration to image and video coding. However, they also create the risk of the spread of manipulated media such as deepfakes, which often leads to copyright infringement, social unrest, the spread of rumors for political gain, or the encouragement of hate crimes. While the term Fake Media is often associated with the latter, we consider it in the context of media modification, covering both good and bad usage scenarios.
This special session aims to solicit papers addressing current advances in the fake media domain. This includes, but is not limited to, the use of machine learning / artificial intelligence in
- creative content production
- media restoration
- image and video coding
- media privacy and security
- media manipulation (including deepfake) detection
- media integrity, authenticity, and provenance
- standardisation initiatives
Organizers: Azeddine Beghdadi (Université Sorbonne Paris Nord, France), Lian Xu (University of Western Australia), Mohammed Bennamoun (University of Western Australia), Sid Ahmed Fezza (INTTIC, Oran, Algeria)
Description: Visual data classification has seen an increase in research activity over the last decade. In particular, there has been a renewed interest in artificial intelligence-based approaches. This is inseparable from the remarkable development of visual data acquisition technologies of different modalities and the notable advances in high-performance computing systems and data processing architectures. However, although state-of-the-art image classification approaches have shown promising results on natural scene databases such as ImageNet and MS COCO, their classification performance is quite limited when applied to specific databases, such as those in medical, oceanographic, geological, or other fields of applied research. These fields pose distinct challenges related to the modality and nature of their images, and there is thus a need to analyze and classify such atypical visual databases. They also provide an opportunity to test the generalizability of existing deep neural network architectures and to address their limitations. The goal of this special session is to open a space for confronting ideas and results of deep learning-based image classification methods for challenging databases in various application fields. Another objective is to investigate semantic segmentation methods for images with uncertain object structures, such as ambiguous boundaries.
This special session aims to provide an overview and to advance scientific research of deep learning methods in a wide range of applications. Topics of interest include but are not limited to:
- Deep image classification for medical diagnosis
- Image/video distortion analysis and classification
- Adversarial attacks for deep learning classification: detection and defense
- Text document image classification
- Semantic image segmentation of underwater images
- Deep video distortion identification and classification
- Deep classification of underwater objects and marine species
- Deep learning for seismic data classification
- Benchmark datasets dedicated to image/video classification
- Deep quality assessment and enhancement of visual information
- Abnormal event detection and classification
- Human action classification
Organizers: Mahmoud Reza Hashemi (University of Tehran, Iran), Shervin Shirmohammadi (University of Ottawa, Canada), Farhad Pakdaman (University of Mazandaran, Iran), Maryam Amiri (Ciena Corporation, Canada)
Description: Multimedia systems are often associated with challenging tasks that involve heavy computation, bulky data, high bitrates, and demanding timing requirements. Traditional local or cloud-based processing paradigms have difficulty satisfying these contradictory requirements. The emerging field of collaborative processing, on the other hand, takes advantage of various processing platforms, such as low-delay edge servers, high-capacity cloud infrastructure, and task-specific local devices, to satisfy the required constraints. Processing and communication tasks in such a system are distributed among the various platforms so as to optimize the usage of limited resources and meet the service requirements. Collaborative processing has already shown promising results in low-delay video streaming, managing the QoE of cloud gaming, and optimizing the energy usage of visual analysis tasks, especially for machine learning (ML)-based models. With the rise of ML and Artificial Intelligence (AI), and the increasing deployment of edge servers and real-world AI/ML-based IoT solutions for multimedia and vision applications at the edge, collaborative edge computing seems to be the natural solution for future multimedia systems. The advent of 5G has made this approach even more viable. Hence, we are seeing increased effort in both academia and industry to address some of its remaining challenges, such as the large communication overhead between nodes, task-driven resource management, server and path selection, collaboration-friendly ML and AI algorithms, and the privacy and security issues associated with distributed data.
This special session solicits contributions that tackle these important and timely challenges for future multimedia systems. Topics of interest include, but are not limited to:
- Edge/cloud-based delay management for multimedia communication
- Collaborative intelligence on cloud, edge, and end devices
- Distributed processing of multimedia over edge and cloud servers
- Machine learning and inference techniques for distributed and collaborative multimedia processing
- Power/Energy minimization via edge, cloud, and local computing
- Network resource allocation for efficient utilization in multimedia services
- Quality assessment of visual content in a collaborative network environment
- Security and privacy of multimedia data in distributed and collaborative environments
- Autonomous endpoint multimedia ML models