Sunday, October 17, 2021

ES-HAS: An Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming

NOSSDAV’21: The 31st edition of the Workshop on Network and Operating System Support for Digital Audio and Video
Sept. 28-Oct. 1, 2021, Istanbul, Turkey
Conference Website
[PDF][Slides][Video]

Reza Farahani, Farzad Tashtarian, Alireza Erfanian, Christian Timmerer, Mohammad Ghanbari, and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, 
Alpen-Adria-Universität Klagenfurt

Abstract: Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. In a pure client-based HAS adaptation scheme, the lack of coordination among clients and the lack of awareness of network conditions may lead to sub-optimal user experience and resource utilization. Software-Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capabilities of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients’ requests and retrieve networking information in a time-slotted manner. These components then run an optimization model in each time slot to serve clients’ requests efficiently by selecting an optimal cache server (i.e., the one with the shortest fetch time). In case of a cache miss, a client’s request is served (i) with an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) with the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [11]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
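To make the request-serving policy above concrete, the following minimal Python sketch illustrates the decision for a single client request. It is our own illustration under assumed data structures (server names, fetch times, and the function itself are hypothetical and not taken from the ES-HAS implementation): on a hit, the cache server with the shortest fetch time is used; on a miss, either the closest higher replacement quality from a cache server or the originally requested quality from the origin is chosen.

# Minimal, hypothetical sketch of the per-request decision described above
# (our own illustration, not the ES-HAS implementation).

def serve_request(requested_bitrate, cache_servers, origin):
    """cache_servers: list of dicts such as
        {"name": "cache-1", "fetch_time": 0.8, "bitrates": {1500, 3000, 6000}}
    origin: dict such as {"name": "origin", "fetch_time": 3.0}
    Returns (serving server name, bitrate actually delivered)."""
    # Cache hit: the requested bitrate is available at some cache server;
    # choose the optimal server, i.e., the one with the shortest fetch time.
    hits = [s for s in cache_servers if requested_bitrate in s["bitrates"]]
    if hits:
        best = min(hits, key=lambda s: s["fetch_time"])
        return best["name"], requested_bitrate

    # Cache miss, option (i): an optimal replacement quality from a cache
    # server (only better quality levels, with minimum deviation).
    replacements = [(min(b for b in s["bitrates"] if b > requested_bitrate), s)
                    for s in cache_servers
                    if any(b > requested_bitrate for b in s["bitrates"])]
    if replacements:
        bitrate, srv = min(replacements,
                           key=lambda r: (r[0] - requested_bitrate,
                                          r[1]["fetch_time"]))
        return srv["name"], bitrate

    # Cache miss, option (ii): the originally requested quality from the origin.
    return origin["name"], requested_bitrate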

Keywords: Dynamic Adaptive Streaming over HTTP (DASH), Edge Computing, Network-Assisted Video Streaming, Quality of Experience (QoE), Software Defined Networking (SDN), Network Function Virtualization (NFV)

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Tuesday, October 12, 2021

ACM Mile High Video (MHV) 2022

March 1-3, 2022, Denver, CO

Deadline: Oct 22, 2021 (final)

After running as an independent event for several years, Mile High Video (MHV) will, starting in 2022, be organized by the ACM Special Interest Group on Multimedia (SIGMM) to grow further. ACM MHV’22 will establish a unique forum for participants from both industry and academia to present, share, and discuss innovations from content production to consumption.

ACM MHV’22 welcomes contributions from industry to share real-world problems and solutions as well as novel approaches and results from basic research typically conducted within an academic environment. ACM MHV’22 will provide a unique opportunity to view the interplay of the industry and academia in the area of video technologies.

ACM MHV contributions are solicited in, but are not limited to, the following areas:
• Content production, encoding and packaging
• Encoding for broadcast, mobile and OTT, and using AI/ML in encoding
• New and developing audio and video codecs
• HDR, accessibility
• Quality assessment models and tools, and user experience studies
• Workflows
• Virtualized headends, cloud-based workflows for production and distribution
• Redundancy and resilience in content origination
• Ingest protocols
• Ad insertion
• Content delivery and security
• Developments in transport protocols and new delivery paradigms
• Protection for OTT distribution and tools against piracy
• Analytics
• Streaming technologies
• Adaptive streaming and transcoding
• Low latency
• Player, playback and UX developments
• Content discovery, promotion and recommendation systems
• Protocol and Web API improvements and innovations for streaming video
• Industry trends
• Advances in interactive and immersive (xR) video
• Video coding for machines
• Cloud gaming and gaming streaming
• Provenance, content authentication and deepfakes
• Standards and interoperability
• New and developing standards in the media and delivery space
• Interoperability guidelines

Prospective speakers are invited to submit an abstract (i.e., approx. 400 words or up to one page using the ACM template) that will be peer-reviewed by the ACM MHV technical program committee (TPC) for relevance, timeliness and technical correctness.

The authors of the accepted abstracts will be invited to optionally submit a full-length paper (up to six pages + references) for possible inclusion into the conference proceedings. These papers must be original work (i.e., not published previously in a journal or conference) and will also be peer-reviewed by the ACM MHV TPC.

Accepted abstracts and full-length papers will be presented at the ACM MHV conference and will be published in the conference proceedings in the ACM Digital Library.

All prospective ACM authors are subject to all ACM Publications Policies, including ACM's new Publications Policy on Research Involving Human Participants and Subjects.

How to Submit an Abstract

Prospective authors are invited to submit an abstract here: https://mhv22.hotcrp.com/

Important Dates
• Abstract submission deadline: Oct. 22, 2021 (final)
• Notification of abstract acceptance: Nov. 15, 2021
• (Optional) Full-length paper submission deadline: Nov. 30, 2021
• Notification of full-length paper acceptance: Dec. 31, 2021
• Camera-ready submission (abstracts/full-length papers) deadline: Jan. 31, 2022

ACM MHV’22 Program Chairs
• Christian Timmerer (AAU; christian.timmerer AT aau.at)
• Dan Grois (Comcast; dgrois AT acm.org)

ACM MHV'22 Program Committee Members
• Florence Agboma (Sky, UK)
• Saba Ahsan (Nokia, Finland)
• Ali C. Begen (Ozyegin University, Turkey)
• Imed Bouazizi (Qualcomm, USA)
• Alan Bovik (University of Texas at Austin, USA)
• Pablo Cesar (CWI, The Netherlands)
• Pankaj Chaudhari (Hulu, USA)
• Luca De Cicco (Politecnico di Bari, Italy)
• Jan De Cock (Synamedia, Belgium)
• Thomas Edwards (Amazon Web Services, USA)
• Christian Feldmann (Bitmovin, Germany)
• Simone Ferlin-Reiter (Ericsson, Sweden)
• Carsten Griwodz (University of Oslo, Norway)
• Sally Hattori (Disney, USA)
• Carys Hughes (Sky, UK)
• Mourad Kioumgi (Sky, Germany)
• Will Law (Akamai, USA)
• Zhu Li (University of Missouri, Kansas City, USA)
• Zhi Li (Netflix, USA)
• John Luther (JW Player, USA)
• Maria Martini (Kingston University, UK)
• Rufael Mekuria (Unified Streaming, The Netherlands)
• Marta Mrak (BBC, UK)
• Matteo Naccari (Audinate, UK)
• Mark Nakano (WarnerMedia, USA)
• Sejin Oh (Dolby, USA)
• Mickael Raulet (ATEME, France)
• Christian Rothenberg (University of Campinas, Brazil)
• Lucile Sassatelli (Universite Cote d'Azur, France)
• Tamar Shoham (Beamr, Israel)
• Gwendal Simon (Synamedia, France)
• Lea Skorin-Kapov (University of Zagreb, Croatia)
• Michael Stattmann (castLabs, Germany)
• Nicolas Weil (Amazon Web Services, USA)
• Roger Zimmermann (NUS, Singapore)

ACM MHV Steering Committee Members
• Balu Adsumilli (YouTube, USA)
• Ali C. Begen (Ozyegin University, Turkey), Co-chair
• Alex Giladi (Comcast, USA), Co-chair
• Sally Hattori (Walt Disney Studios, USA)
• Jean-Baptiste Kempf (VideoLAN, France)
• Thomas Kernen (NVIDIA, Switzerland)
• Scott Labrozzi (Disney Streaming Services, USA)
• Maria Martini (Kingston University, UK)
• Hatice Memiguven (beIN Media, Turkey)
• Ben Mesander (Wowza Media Systems, USA)
• Mark Nakano (WarnerMedia, USA)
• Madeleine Noland (ATSC, USA)
• Yuriy Reznik (Brightcove, USA)
• Tamar Shoham (Beamr, Israel)

Friday, September 24, 2021

INTENSE: In-depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming

[PDF]

Babak Taraghi, Minh Nguyen, Hadi Amirpour, Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: With the recent growth of multimedia traffic over the Internet and emerging multimedia streaming service providers, improving Quality of Experience (QoE) for HTTP Adaptive Streaming (HAS) is becoming more important. Alongside other factors, such as the media quality, HAS relies on the performance of the media player’s Adaptive Bitrate (ABR) algorithm to optimize QoE in multimedia streaming sessions. QoE in HAS suffers from weak or unstable internet connections and suboptimal ABR decisions. As a result of imperfect adaptation to the characteristics and conditions of the internet connection, stall events of varying durations and quality level switches can occur, which negatively affect the QoE. In this paper, we address various identified open issues related to the QoE for HAS, notably (i) the minimum noticeable duration of stall events in HAS; (ii) the correlation between the media quality and the impact of stall events on QoE; (iii) the end-user preference regarding multiple shorter stall events versus a single longer stall event; and (iv) the end-user preference for media quality switches over stall events. We have studied these open issues from both objective and subjective evaluation perspectives and present the correlation between the two types of evaluations. The findings documented in this paper can serve as a baseline for improving ABR algorithms and policies in HAS.
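As a simple illustration of the two impairment types studied here, the toy Python sketch below counts stall events, total stall duration, and quality switches from a hypothetical player event log; the log format and names are our own and not part of the study.

# Toy illustration (not from the paper): extracting stall and switch
# statistics from a hypothetical, chronologically ordered player event log.

def summarize_impairments(events):
    """events: list of dicts, e.g. {"type": "stall", "duration": 1.2}
    or {"type": "quality", "level": 3}, in playback order."""
    stalls = [e["duration"] for e in events if e["type"] == "stall"]
    levels = [e["level"] for e in events if e["type"] == "quality"]
    switches = sum(1 for a, b in zip(levels, levels[1:]) if a != b)
    return {"stall_count": len(stalls),
            "total_stall_duration": sum(stalls),
            "quality_switches": switches}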

Keywords: Crowdsourcing; HTTP Adaptive Streaming; Quality of Experience; Quality Switches; Stall Events; Subjective Evaluation; Objective Evaluation.

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Wednesday, September 22, 2021

LwTE: Light-weight Transcoding at the Edge

IEEE ACCESS

[PDF]

Alireza Erfanian*, Hadi Amirpour*, Farzad Tashtarian, Christian Timmerer, and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

*These authors contributed equally to this work.

Abstract: Due to the growing demand for video streaming services, providers have to deal with increasing resource requirements in increasingly heterogeneous environments. To mitigate this problem, many works have been proposed which aim to (i) improve cloud/edge caching efficiency, (ii) use computation power available in the cloud/edge for on-the-fly transcoding, and (iii) optimize the trade-off among various cost parameters, e.g., storage, computation, and bandwidth. In this paper, we propose LwTE, a novel Light-weight Transcoding approach at the Edge, in the context of HTTP Adaptive Streaming (HAS). During the encoding of a video segment at the origin side, computationally intensive search processes are performed. The main idea of LwTE is to store the optimal results of these search processes as metadata for each video bitrate and reuse them at the edge servers to reduce the time and computational resources required for on-the-fly transcoding. LwTE enables us to store only the highest bitrate plus the corresponding metadata (of very small size) for unpopular video segments/bitrates, while popular video segments/bitrates remain stored as usual. In this way, in addition to a significant reduction in bandwidth and storage consumption, the time required for on-the-fly transcoding of a requested segment is remarkably decreased by utilizing its corresponding metadata, as unnecessary search processes are avoided. We investigate our approach for Video-on-Demand (VoD) streaming services by optimizing storage and computation (transcoding) costs at the edge servers and then compare it to conventional methods (store all bitrates, partial transcoding). The results indicate that our approach reduces the transcoding time by at least 80% and decreases the aforementioned costs by 12% to 70% compared to the state-of-the-art approaches.
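The popularity-driven store-versus-transcode trade-off can be sketched as follows. This is a deliberately simplified Python illustration under our own linear cost assumptions; it is not the optimization model formulated in the paper, and all names and cost terms are hypothetical.

# Simplified, illustrative per-bitrate decision in the spirit of LwTE (not the
# paper's optimization model): either store a bitrate at the edge, or keep only
# the highest bitrate plus metadata and transcode the lower bitrate on demand.

def cheaper_to_transcode(expected_requests, storage_cost, transcode_cost):
    """True if serving a (segment, bitrate) via metadata-assisted on-the-fly
    transcoding is expected to cost less than keeping it stored.

    expected_requests : predicted number of requests in the costing period
    storage_cost      : cost of storing this bitrate for the whole period
    transcode_cost    : cost of one metadata-assisted transcoding operation"""
    return expected_requests * transcode_cost < storage_cost

def plan_edge_storage(segments):
    """segments: list of dicts with keys 'id', 'expected_requests',
    'storage_cost', and 'transcode_cost'."""
    plan = {}
    for seg in segments:
        if cheaper_to_transcode(seg["expected_requests"],
                                seg["storage_cost"],
                                seg["transcode_cost"]):
            plan[seg["id"]] = "keep highest bitrate + metadata, transcode on demand"
        else:
            plan[seg["id"]] = "store this bitrate"
    return plan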

Keywords: Video streaming, transcoding, video on demand, edge computing.

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Monday, September 6, 2021

CTU Depth Decision Algorithms for HEVC: A Survey

[PDF]

Ekrem Çetinkaya*, Hadi Amirpour*, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

*These authors contributed equally to this work.

Abstract: High Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools at the cost of an increased encoding time-complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with the predetermined size of up to 64 × 64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also causes an increase in the time complexity due to the increased number of ways to find the optimal partitioning. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches during partitioning CTUs by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the available information within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods like support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
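As a minimal illustration of the "neighboring" category, the Python sketch below bounds the depth search range of the current CTU by the depths used in the left and above CTUs; the neighbor set, margin, and function name are illustrative choices of ours, not taken from any specific surveyed algorithm.

# Minimal sketch of a "neighboring" CTU depth decision (illustrative only).
# HEVC CTUs are up to 64x64 and can be split recursively down to 8x8 CUs,
# i.e., depths 0..3.

def depth_search_range(left_depths, above_depths, margin=1):
    """Restrict the rate-distortion search to depths close to those chosen
    for the left and above CTUs, instead of evaluating all depths 0..3."""
    neighbor_depths = list(left_depths) + list(above_depths)
    if not neighbor_depths:            # e.g., first CTU of a frame
        return 0, 3                    # fall back to the full range
    lo = max(0, min(neighbor_depths) - margin)
    hi = min(3, max(neighbor_depths) + margin)
    return lo, hi

# Example: if the left CTU used depths {2, 3} and the above CTU used {3},
# only depths 1..3 are evaluated, skipping the depth-0 (64x64) candidate.
print(depth_search_range({2, 3}, {3}))   # -> (1, 3)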

Keywords: HEVC, Coding Tree Unit, Complexity, CTU Partitioning, Statistics, Machine Learning

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.


Sunday, August 22, 2021

Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards

Proceedings of the IEEE, vol. 109, no. 9, Sept. 2021

By Christian Timmerer, Senior Member IEEE, Guest Editor
Mathias Wien, Member IEEE, Guest Editor
Lu Yu, Senior Member IEEE, Guest Editor
Amy Reibman, Fellow IEEE, Guest Editor


Abstract: Multimedia content (i.e., video, image, audio) is responsible for the majority of today’s Internet traffic, and this share is expected to grow beyond 80% in the near future. For more than 30 years, international standards have provided tools for interoperability and have been both source and sink for challenging research activities in the domain of multimedia compression and system technologies. The goal of this special issue is to review those standards and to focus on (i) the technology developed in the context of these standards and (ii) research questions addressing aspects of these standards that are left open for competition by both academia and industry.

Index Terms—Open Media Standards, MPEG, JPEG, JVET, AOM, Computational Complexity

C. Timmerer, M. Wien, L. Yu and A. Reibman, "Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1423-1434, Sept. 2021, doi: 10.1109/JPROC.2021.3098048.


A Technical Overview of AV1

J. Han et al., "A Technical Overview of AV1," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1435-1462, Sept. 2021, doi: 10.1109/JPROC.2021.3058584.

Abstract: The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.

Developments in International Video Coding Standardization After AVC, With an Overview of Versatile Video Coding (VVC)

B. Bross, J. Chen, J. -R. Ohm, G. J. Sullivan and Y. -K. Wang, "Developments in International Video Coding Standardization After AVC, With an Overview of Versatile Video Coding (VVC)," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1463-1493, Sept. 2021, doi: 10.1109/JPROC.2020.3043399.

Abstract: In the last 17 years, since the finalization of the first version of the now-dominant H.264/Moving Picture Experts Group-4 (MPEG-4) Advanced Video Coding (AVC) standard in 2003, two major new generations of video coding standards have been developed. These include the standards known as High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC). HEVC was finalized in 2013, repeating the ten-year cycle time set by its predecessor and providing about 50% bit-rate reduction over AVC. The cycle was shortened by three years for the VVC project, which was finalized in July 2020, yet again achieving about a 50% bit-rate reduction over its predecessor (HEVC). This article summarizes these developments in video coding standardization after AVC. It especially focuses on providing an overview of the first version of VVC, including comparisons against HEVC. Besides further advances in hybrid video compression, as in previous development cycles, the broad versatility of the application domain that is highlighted in the title of VVC is explained. Included in VVC is the support for a wide range of applications beyond the typical standard- and high-definition camera-captured content codings, including features to support computer-generated/screen content, high dynamic range content, multilayer and multiview coding, and support for immersive media such as 360° video.

Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies

D. Ding, Z. Ma, D. Chen, Q. Chen, Z. Liu and F. Zhu, "Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1494-1520, Sept. 2021, doi: 10.1109/JPROC.2021.3059994.

Abstract: Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of a subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On postprocessing, we demonstrate two neural adaptive filters to, respectively, facilitate the in-loop and postfiltering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/.

MPEG Immersive Video Coding Standard

J. M. Boyce et al., "MPEG Immersive Video Coding Standard," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1521-1536, Sept. 2021, doi: 10.1109/JPROC.2021.3062590.

Abstract: This article introduces the ISO/IEC MPEG Immersive Video (MIV) standard, MPEG-I Part 12, which is undergoing standardization. The draft MIV standard provides support for viewing immersive volumetric content captured by multiple cameras with six degrees of freedom (6DoF) within a viewing space that is determined by the camera arrangement in the capture rig. The bitstream format and decoding processes of the draft specification along with aspects of the Test Model for Immersive Video (TMIV) reference software encoder, decoder, and renderer are described. The use cases, test conditions, quality assessment methods, and experimental results are provided. In the TMIV, multiple texture and geometry views are coded as atlases of patches using a legacy 2-D video codec, while optimizing for bitrate, pixel rate, and quality. The design of the bitstream format and decoder is based on the visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC) standard, MPEG-I Part 5.

Compression of Sparse and Dense Dynamic Point Clouds—Methods and Standards

C. Cao, M. Preda, V. Zakharchenko, E. S. Jang and T. Zaharia, "Compression of Sparse and Dense Dynamic Point Clouds—Methods and Standards," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1537-1558, Sept. 2021, doi: 10.1109/JPROC.2021.3085957.

Abstract: In this article, a survey of the point cloud compression (PCC) methods by organizing them with respect to the data structure, coding representation space, and prediction strategies is presented. Two paramount families of approaches reported in the literature—the projection- and octree-based methods—are proven to be efficient for encoding dense and sparse point clouds, respectively. These approaches are the pillars on which the Moving Picture Experts Group Committee developed two PCC standards published as final international standards in 2020 and early 2021, respectively, under the names: video-based PCC and geometry-based PCC. After surveying the current approaches for PCC, the technologies underlying the two standards are described in detail from an encoder perspective, providing guidance for potential standard implementors. In addition, experiment evaluations in terms of compression performances for both solutions are provided.
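To illustrate the octree-based family mentioned above, the following toy Python sketch emits 8-bit occupancy codes for a voxelized point set in the spirit of geometry-based PCC; the actual standard adds context modeling, entropy coding, and many further tools, so this is only a conceptual illustration with invented names.

# Toy octree occupancy coding (conceptual sketch, not the G-PCC codec).
from collections import deque

def octree_occupancy(points, depth):
    """Return the 8-bit occupancy codes of a breadth-first octree traversal
    of voxelized points with integer coordinates in [0, 2**depth)."""
    codes = []
    nodes = deque([(0, 0, 0, depth, points)])   # (origin x, y, z, level, points)
    while nodes:
        x0, y0, z0, level, pts = nodes.popleft()
        if level == 0 or not pts:
            continue
        half = 1 << (level - 1)
        children = [[] for _ in range(8)]
        for (x, y, z) in pts:
            idx = (((x - x0) >= half) << 2) | (((y - y0) >= half) << 1) | ((z - z0) >= half)
            children[idx].append((x, y, z))
        codes.append(sum(1 << i for i, c in enumerate(children) if c))
        for i, c in enumerate(children):
            if c:
                nodes.append((x0 + ((i >> 2) & 1) * half,
                              y0 + ((i >> 1) & 1) * half,
                              z0 + (i & 1) * half,
                              level - 1, c))
    return codes

# Example: two opposite corner voxels of an 8x8x8 cube.
print(octree_occupancy([(0, 0, 0), (7, 7, 7)], 3))   # -> [129, 1, 128, 1, 128]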

JPEG XS—A New Standard for Visually Lossless Low-Latency Lightweight Image Coding

A. Descampe et al., "JPEG XS—A New Standard for Visually Lossless Low-Latency Lightweight Image Coding," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1559-1577, Sept. 2021, doi: 10.1109/JPROC.2021.3080916.

Abstract: Joint Photographic Experts Group (JPEG) XS is a new International Standard from the JPEG Committee (formally known as ISO/International Electrotechnical Commission (IEC) JTC1/SC29/WG1). It defines an interoperable, visually lossless low-latency lightweight image coding that can be used for mezzanine compression within any AV market. Among the targeted use cases, one can cite video transport over professional video links (serial digital interface (SDI), internet protocol (IP), and Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example, in cameras and the automotive industry). The core coding system is composed of an optional color transform, a wavelet transform, and a novel entropy encoder, processing groups of coefficients by coding their magnitude level and packing the magnitude refinement. Such a design allows for visually transparent quality at moderate compression ratios, scalable end-to-end latency that ranges from less than one line to a maximum of 32 lines of the image, and a low-complexity real-time implementation in application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), central processing unit (CPU), and graphics processing unit (GPU). This article details the key features of this new standard and the profiles and formats that have been defined so far for the various applications. It also gives a technical description of the core coding system. Finally, the latest performance evaluation results of recent implementations of the standard are presented, followed by the current status of the ongoing standardization process and future milestones.

MPEG Standards for Compressed Representation of Immersive Audio

S. R. Quackenbush and J. Herre, "MPEG Standards for Compressed Representation of Immersive Audio," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1578-1589, Sept. 2021, doi: 10.1109/JPROC.2021.3075390.

Abstract: The term “immersive audio” is frequently used to describe an audio experience that provides the listener the sensation of being fully immersed or “present” in a sound scene. This can be achieved via different presentation modes, such as surround sound (several loudspeakers horizontally arranged around the listener), 3D audio (with loudspeakers at, above, and below listener ear level), and binaural audio to headphones. This article provides an overview of two recent standards that support the bitrate-efficient carriage of high-quality immersive sound. The first is MPEG-H 3D audio, which is a versatile standard that supports multiple immersive sound signal formats (channels, objects, and higher order ambisonics) and is now being adopted in broadcast and streaming applications. The second is MPEG-I immersive audio, an extension of 3D audio, currently under development, which is targeted for virtual and augmented reality applications. This will support rendering of fully user-interactive immersive sound for three degrees of user movement [three degrees of freedom (3DoF)], i.e., yaw, pitch, and roll head movement, and for six degrees of user movement [six degrees of freedom (6DoF)], i.e., 3DoF plus translational x, y, and z user position movements.

An Overview of Omnidirectional MediA Format (OMAF)

M. M. Hannuksela and Y. -K. Wang, "An Overview of Omnidirectional MediA Format (OMAF)," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1590-1606, Sept. 2021, doi: 10.1109/JPROC.2021.3063544.

Abstract: During recent years, there have been product launches and research for enabling immersive audio–visual media experiences. For example, a variety of head-mounted displays and 360° cameras are available in the market. To facilitate interoperability between devices and media system components by different vendors, the Moving Picture Experts Group (MPEG) developed the Omnidirectional MediA Format (OMAF), which is arguably the first virtual reality (VR) system standard. OMAF is a storage and streaming format for omnidirectional media, including 360° video and images, spatial audio, and associated timed text. This article provides a comprehensive overview of OMAF.

An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data

J. Voges, M. Hernaez, M. Mattavelli and J. Ostermann, "An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1607-1622, Sept. 2021, doi: 10.1109/JPROC.2021.3082027.

Abstract: The development and progress of high-throughput sequencing technologies have transformed the sequencing of DNA from a scientific research challenge to practice. With the release of the latest generation of sequencing machines, the cost of sequencing a whole human genome has dropped to less than $ 600. Such achievements open the door to personalized medicine, where it is expected that genomic information of patients will be analyzed as a standard practice. However, the associated costs, related to storing, transmitting, and processing the large volumes of data, are already comparable to the costs of sequencing. To support the design of new and interoperable solutions for the representation, compression, and management of genomic sequencing data, the Moving Picture Experts Group (MPEG) jointly with working group 5 of ISO/TC276 “Biotechnology” has started to produce the ISO/IEC 23092 series, known as MPEG-G. MPEG-G does not only offer higher levels of compression compared with the state of the art but it also provides new functionalities, such as built-in support for random access in the compressed domain, support for data protection mechanisms, flexible storage, and streaming capabilities. MPEG-G only specifies the decoding syntax of compressed bitstreams, as well as a file format and a transport format. This allows for the development of new encoding solutions with higher degrees of optimization while maintaining compatibility with any existing MPEG-G decoder.

Saturday, August 21, 2021

MPEG AG 5 Workshop on Quality of Immersive Media: Assessment and Metrics

The Quality of Experience (QoE) is well-defined in QUALINET white papers [here, here], but its assessment and metrics remain subjects of active research. The aim of this workshop on “Quality of Immersive Media: Assessment and Metrics” is to provide a forum for researchers and practitioners to discuss the latest findings in this field. The scope of this workshop is (i) to raise awareness about MPEG efforts in the context of quality of immersive visual media and (ii) to invite experts (outside of MPEG) to present new techniques relevant to this workshop.

Quality assessments in the context of the MPEG standardization process typically serve two purposes: (1) to foster decision-making on the tool adoptions during the standardization process and (2) to validate the outcome of a standardization effort compared to an established anchor (i.e., for verification testing).

We kindly invite you to the first online MPEG AG 5 Workshop on Quality of Immersive Media: Assessment and Metrics as follows.

Logistics (online):

Program/Speakers:

15:00-15:10: Joel Jung & Christian Timmerer (AhG co-chairs): Welcome notice

15:10-15:30: Mathias Wien (AG 5 convenor): MPEG Visual Quality Assessment: Tasks and Perspectives
Abstract: The Advisory Group on MPEG Visual Quality Assessment (ISO/IEC JTC1 SC29/AG5) was founded in 2020 with the goal of selecting and designing subjective quality evaluation methodologies and objective quality metrics for the assessment of visual coding technologies in the context of the MPEG standardization work. In this talk, the group's current work items, as well as its perspectives and first achievements, are presented.

15:30-15:50: Aljosa Smolic: Perception and Quality of Immersive Media
Abstract: Interest in immersive media increased significantly over recent years. Besides applications in entertainment, culture, health, industry, etc., telepresence and remote collaboration gained importance due to the pandemic and climate crisis. Immersive media have the potential to increase social integration and to reduce greenhouse gas emissions. As a result, technologies along the whole pipeline from capture to display are maturing and applications are becoming available, creating business opportunities. One aspect of immersive technologies that is still relatively undeveloped is the understanding of perception and quality, including subjective and objective assessment. The interactive nature of immersive media poses new challenges to estimation of saliency or visual attention, and to the development of quality metrics. The V-SENSE lab of Trinity College Dublin addresses these questions in current research. This talk will highlight corresponding examples in 360 VR video, light fields, volumetric video and XR.

15:50-16:00: Break/Discussions

16:00-16:20: Jesús Gutiérrez: Quality assessment of immersive media: Recent activities within VQEG
Abstract: This presentation will provide an overview of the recent activities on quality assessment of immersive media carried out within the Video Quality Experts Group (VQEG), particularly within the Immersive Media Group (IMG). Among other efforts, outcomes will be presented from the cross-lab test (carried out by ten different labs) to assess and validate subjective evaluation methodologies for 360° videos, which was instrumental in the development of ITU-T Recommendation P.919. Also, insights will be provided into the current plans for exploring the evaluation of the quality of experience of immersive communication systems, considering different technologies such as 360° video, point clouds, free-viewpoint video, etc.

16:20-16:40: Alexander Raake: Perceptual evaluation of Immersive Media - from video quality towards a holistic QoE perspective
Abstract: Immersive visual media span from higher-resolution video with an increased field of view to fully interactive extended reality (XR) systems based on VR, AR, or MR technology. Here, quality and Quality of Experience (QoE) evaluation are key to ensuring valuable experiences for the users and thus successful technology developments. The talk presents some work in ITU-T SG 12 on the assessment of immersive media, and corresponding contributions and other related research activities by the Audiovisual Technology (AVT) group at TU Ilmenau. In the first part of the talk, the quality model series P.1203 and P.1204 for resolutions of up to 4K/UHD1 will be presented, with a primary focus on the bitstream-based models P.1203.1 and P.1204.3. Besides their application to 2D video, their usage for gaming-video and 360° video quality assessment is addressed. In the second part, the talk discusses aspects of QoE for immersive media that go beyond visual quality. Research is presented on the exploration behavior of users for 360° video, showing the influence of the content as well as of the task given to the subjects. Furthermore, some recent work on presence and cybersickness evaluation for 360° video is discussed. The talk concludes with an outlook on using indirect methods and cognitive performance as evaluation criteria for audiovisual IVEs.

16:40-17:00: Laura Toni: Understanding user interactivity for immersive communications and its impact on QoE 
Abstract: A major challenge for the next decade is to design virtual and augmented reality systems for real-world use cases such as healthcare, entertainment, e-education, and high-risk missions. This requires immersive systems that operate at scale, in a personalized manner, remaining bandwidth-tolerant whilst meeting quality and latency criteria. This can be accomplished only by a fundamental revolution of the network and immersive systems that puts the interactive user at the heart of the system rather than at the end of the chain. With this goal in mind, in this talk we provide an overview of our current research on the behaviour of interactive users in immersive experiences and its impact on next-generation multimedia systems. We present novel tools for the behavioural analysis of users navigating in 3-DoF and 6-DoF systems and show the impact and advantages of taking user behaviour into account in immersive systems. We then conclude with a perspective on the impact of user behaviour studies on QoE.

17:00: Conclusions