Tuesday, October 19, 2021

Understanding Quality of Experience of Heuristic-based HTTP Adaptive Bitrate Algorithms

NOSSDAV’21: The 31st edition of the Workshop on Network and Operating System Support for Digital Audio and Video
Sept. 28-Oct. 1, 2021, Istanbul, Turkey

Babak Taraghi*, Abdelhak Bentaleb**, Christian Timmerer*, Roger Zimmermann** and Hermann Hellwagner*
* Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt
** National University of Singapore

Abstract: Adaptive BitRate (ABR) algorithms play a crucial role in delivering the highest possible Quality of Experience (QoE) to viewers in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS, the dominant video streaming technique on the Internet, to deliver the best QoE to their users. Viewer satisfaction relies heavily on how well the ABR algorithm of a media player adapts the stream’s quality to the current network conditions. QoE for end-to-end video streaming sessions has been evaluated in many research projects to give better insight into the quality metrics. Objective evaluation models such as ITU Telecommunication Standardization Sector (ITU-T) P.1203 allow for the calculation of Mean Opinion Score (MOS) by considering various QoE metrics, while subjective evaluation is the best approach for investigating end users’ opinions of the quality experienced in a video streaming session. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper’s main contribution is a comparison of subjective and objective evaluations for well-known heuristic-based ABR algorithms.
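As a rough illustration of how subjective and objective scores can be set side by side, the sketch below computes a MOS as the mean of 1-5 ratings per session and then the Pearson correlation between those values and model-predicted scores. The ratings, the predicted scores, and the function names are made-up placeholders for illustration; they are not data or code from the paper, and the paper may use a different comparison method.

```python
from math import sqrt


def mean_opinion_score(ratings):
    """MOS: arithmetic mean of individual ratings on a 1-5 scale."""
    return sum(ratings) / len(ratings)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder crowdsourced ratings per streaming session (hypothetical).
subjective_ratings = {
    "session_a": [4, 5, 4, 3],
    "session_b": [2, 3, 2, 3],
    "session_c": [5, 4, 5, 5],
}
# Placeholder scores standing in for an objective model's output (hypothetical).
objective_mos = {"session_a": 3.9, "session_b": 2.6, "session_c": 4.4}

sessions = sorted(subjective_ratings)
subjective_mos = [mean_opinion_score(subjective_ratings[s]) for s in sessions]
predicted_mos = [objective_mos[s] for s in sessions]
correlation = pearson(subjective_mos, predicted_mos)
print(f"Pearson correlation: {correlation:.3f}")
```

A correlation close to 1 would indicate that the objective model ranks sessions much like the human raters do; with only three placeholder sessions the number itself carries no meaning.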

Keywords: HTTP Adaptive Streaming, ABR Algorithms, Quality of Experience, Crowdsourcing, Subjective Evaluation, Objective Evaluation, MOS, (ITU-T) P.1203

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Sunday, October 17, 2021

ES-HAS: An Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming

NOSSDAV’21: The 31st edition of the Workshop on Network and Operating System Support for Digital Audio and Video
Sept. 28-Oct. 1, 2021, Istanbul, Turkey

Reza Farahani, Farzad Tashtarian, Alireza Erfanian, Christian Timmerer, Mohammad Ghanbari and Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. Lack of coordination among the clients and lack of awareness of the network conditions may lead to sub-optimal user experience and resource utilization in a pure client-based HAS adaptation scheme. Software-Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capability of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients’ requests and retrieve networking information in a time-slotted manner. In each time slot, these components run an optimization model to efficiently serve clients’ requests by selecting an optimal cache server (with the shortest fetch time). In case of a cache miss, a client’s request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) by the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [11]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
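The serving logic described in the abstract can be sketched roughly as follows. This is a simplified greedy illustration, not the optimization model from the paper: the `CacheServer` type, the `serve_request` function, the integer quality levels, and in particular the rule that picks between a replacement quality and the origin server by comparing fetch times are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheServer:
    name: str
    fetch_time_ms: float         # estimated time to fetch a segment from this server
    cached_qualities: frozenset  # quality levels held for the segment (higher = better)


def serve_request(requested_quality, servers, origin_fetch_time_ms):
    """Return (source, quality) for one segment request (hypothetical greedy rule)."""
    # Cache hit: among servers holding the requested quality, take the fastest one.
    hits = [s for s in servers if requested_quality in s.cached_qualities]
    if hits:
        best = min(hits, key=lambda s: s.fetch_time_ms)
        return best.name, requested_quality

    # Cache miss: consider only better quality levels, preferring the smallest
    # deviation from the request and, among ties, the shortest fetch time.
    candidates = [
        (quality - requested_quality, s.fetch_time_ms, s.name, quality)
        for s in servers
        for quality in s.cached_qualities
        if quality > requested_quality
    ]
    if candidates:
        _, fetch_time, name, quality = min(candidates)
        # Assumption: use the replacement only if it is not slower than the origin.
        if fetch_time <= origin_fetch_time_ms:
            return name, quality

    # Otherwise fall back to the originally requested quality from the origin.
    return "origin", requested_quality


servers = [
    CacheServer("edge-1", 20.0, frozenset({1, 2})),
    CacheServer("edge-2", 35.0, frozenset({2, 4})),
]
print(serve_request(2, servers, origin_fetch_time_ms=120.0))
print(serve_request(3, servers, origin_fetch_time_ms=120.0))
print(serve_request(5, servers, origin_fetch_time_ms=120.0))
```

The three calls exercise a cache hit, a cache miss served with a better replacement quality, and a cache miss that falls back to the origin server.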

Keywords: Dynamic Adaptive Streaming over HTTP (DASH), Edge Computing, Network-Assisted Video Streaming, Quality of Experience (QoE), Software Defined Networking (SDN), Network Function Virtualization (NFV)

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Tuesday, October 12, 2021

ACM Mile High Video (MHV) 2022

March 1-3, 2022, Denver, CO

Deadline: Oct 22, 2021 (final)

After running as an independent event for several years, starting with 2022, Mile High Video (MHV) will be organized by the ACM Special Interest Group on Multimedia (SIGMM) to grow further. ACM MHV’22 will establish a unique forum for participants from both industry and academia to present, share and discuss innovations from content production to consumption.

ACM MHV’22 welcomes contributions from industry to share real-world problems and solutions as well as novel approaches and results from basic research typically conducted within an academic environment. ACM MHV’22 will provide a unique opportunity to view the interplay of the industry and academia in the area of video technologies.

ACM MHV contributions are solicited in, but not limited to, the following areas:
• Content production, encoding and packaging
    ◦ Encoding for broadcast, mobile and OTT, and using AI/ML in encoding
    ◦ New and developing audio and video codecs
    ◦ HDR, accessibility
    ◦ Quality assessment models and tools, and user experience studies
• Workflows
    ◦ Virtualized headends, cloud-based workflows for production and distribution
    ◦ Redundancy and resilience in content origination
    ◦ Ingest protocols
    ◦ Ad insertion
• Content delivery and security
    ◦ Developments in transport protocols and new delivery paradigms
    ◦ Protection for OTT distribution and tools against piracy
    ◦ Analytics
• Streaming technologies
    ◦ Adaptive streaming and transcoding
    ◦ Low latency
    ◦ Player, playback and UX developments
    ◦ Content discovery, promotion and recommendation systems
    ◦ Protocol and Web API improvements and innovations for streaming video
• Industry trends
    ◦ Advances in interactive and immersive (xR) video
    ◦ Video coding for machines
    ◦ Cloud gaming and gaming streaming
    ◦ Provenance, content authentication and deepfakes
• Standards and interoperability
    ◦ New and developing standards in the media and delivery space
    ◦ Interoperability guidelines

Prospective speakers are invited to submit an abstract (i.e., approx. 400 words or up to one page using the ACM template) that will be peer-reviewed by the ACM MHV technical program committee (TPC) for relevance, timeliness and technical correctness.

The authors of the accepted abstracts will be invited to optionally submit a full-length paper (up to six pages + references) for possible inclusion into the conference proceedings. These papers must be original work (i.e., not published previously in a journal or conference) and will also be peer-reviewed by the ACM MHV TPC.

Accepted abstracts and full-length papers will be presented at the ACM MHV conference and will be published in the conference proceedings in the ACM Digital Library.

All prospective ACM authors are subject to all ACM Publications Policies, including ACM's new Publications Policy on Research Involving Human Participants and Subjects.

How to Submit an Abstract

Prospective authors are invited to submit an abstract here: https://mhv22.hotcrp.com/

Important Dates
• Abstract submission deadline: Oct. 22, 2021 (final)
• Notification of abstract acceptance: Nov. 15, 2021
• (Optional) Full-length paper submission deadline: Nov. 30, 2021
• Notification of full-length paper acceptance: Dec. 31, 2021
• Camera-ready submission (abstracts/full-length papers) deadline: Jan. 31, 2022

ACM MHV’22 Program Chairs
• Christian Timmerer (AAU; christian.timmerer AT aau.at)
• Dan Grois (Comcast; dgrois AT acm.org)

ACM MHV'22 Program Committee Members
• Florence Agboma (Sky, UK)
• Saba Ahsan (Nokia, Finland)
• Ali C. Begen (Ozyegin University, Turkey)
• Imed Bouazizi (Qualcomm, USA)
• Alan Bovik (University of Texas at Austin, USA)
• Pablo Cesar (CWI, The Netherlands)
• Pankaj Chaudhari (Hulu, USA)
• Luca De Cicco (Politecnico di Bari, Italy)
• Jan De Cock (Synamedia, Belgium)
• Thomas Edwards (Amazon Web Services, USA)
• Christian Feldmann (Bitmovin, Germany)
• Simone Ferlin-Reiter (Ericsson, Sweden)
• Carsten Griwodz (University of Oslo, Norway)
• Sally Hattori (Disney, USA)
• Carys Hughes (Sky, UK)
• Mourad Kioumgi (Sky, Germany)
• Will Law (Akamai, USA)
• Zhu Li (University of Missouri, Kansas City, USA)
• Zhi Li (Netflix, USA)
• John Luther (JW Player, USA)
• Maria Martini (Kingston University, UK)
• Rufael Mekuria (Unified Streaming, The Netherlands)
• Marta Mrak (BBC, UK)
• Matteo Naccari (Audinate, UK)
• Mark Nakano (WarnerMedia, USA)
• Sejin Oh (Dolby, USA)
• Mickael Raulet (ATEME, France)
• Christian Rothenberg (University of Campinas, Brazil)
• Lucile Sassatelli (Universite Cote d'Azur, France)
• Tamar Shoham (Beamr, Israel)
• Gwendal Simon (Synamedia, France)
• Lea Skorin-Kapov (University of Zagreb, Croatia)
• Michael Stattmann (castLabs, Germany)
• Nicolas Weil (Amazon Web Services, USA)
• Roger Zimmermann (NUS, Singapore)

ACM MHV Steering Committee Members
• Balu Adsumilli (YouTube, USA)
• Ali C. Begen (Ozyegin University, Turkey), Co-chair
• Alex Giladi (Comcast, USA), Co-chair
• Sally Hattori (Walt Disney Studios, USA)
• Jean-Baptiste Kempf (VideoLAN, France)
• Thomas Kernen (NVIDIA, Switzerland)
• Scott Labrozzi (Disney Streaming Services, USA)
• Maria Martini (Kingston University, UK)
• Hatice Memiguven (beIN Media, Turkey)
• Ben Mesander (Wowza Media Systems, USA)
• Mark Nakano (WarnerMedia, USA)
• Madeleine Noland (ATSC, USA)
• Yuriy Reznik (Brightcove, USA)
• Tamar Shoham (Beamr, Israel)

Friday, September 24, 2021

INTENSE: In-depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming

Babak Taraghi, Minh Nguyen, Hadi Amirpour, Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: With the recent growth of multimedia traffic over the Internet and emerging multimedia streaming service providers, improving Quality of Experience (QoE) for HTTP Adaptive Streaming (HAS) becomes more important. Alongside other factors, such as the media quality, HAS relies on the performance of the media player’s Adaptive Bitrate (ABR) algorithm to optimize QoE in multimedia streaming sessions. QoE in HAS suffers from weak or unstable internet connections and suboptimal ABR decisions. Because the player adapts imperfectly to the characteristics and conditions of the internet connection, stall events and quality level switches of varying durations can occur, and both negatively affect the QoE. In this paper, we address various identified open issues related to the QoE for HAS, notably (i) the minimum noticeable duration for stall events in HAS; (ii) the correlation between the media quality and the impact of stall events on QoE; (iii) the end-user preference regarding multiple shorter stall events versus a single longer stall event; and (iv) the end-user preference of media quality switches over stall events. We have studied these open issues from both objective and subjective evaluation perspectives and present the correlation between the two types of evaluation. The findings documented in this paper can be used as a baseline for improving ABR algorithms and policies in HAS.

Keywords: Crowdsourcing; HTTP Adaptive Streaming; Quality of Experience; Quality Switches; Stall Events; Subjective Evaluation; Objective Evaluation.

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Wednesday, September 22, 2021

LwTE: Light-weight Transcoding at the Edge

IEEE Access

Alireza Erfanian*, Hadi Amirpour*, Farzad Tashtarian, Christian Timmerer, Hermann Hellwagner
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

*These authors contributed equally to this work.

Abstract: Due to the growing demand for video streaming services, providers have to deal with increasing resource requirements for increasingly heterogeneous environments. To mitigate this problem, many approaches have been proposed which aim to (i) improve cloud/edge caching efficiency, (ii) use computation power available in the cloud/edge for on-the-fly transcoding, and (iii) optimize the trade-off among various cost parameters, e.g., storage, computation, and bandwidth. In this paper, we propose LwTE, a novel Light-weight Transcoding approach at the Edge, in the context of HTTP Adaptive Streaming (HAS). During the encoding process of a video segment at the origin side, computationally intense search processes are carried out. The main idea of LwTE is to store the optimal results of these search processes as metadata for each video bitrate and reuse them at the edge servers to reduce the required time and computational resources for on-the-fly transcoding. LwTE enables us to store only the highest bitrate plus corresponding metadata (of very small size) for unpopular video segments/bitrates. In this way, in addition to the significant reduction in bandwidth and storage consumption, the required time for on-the-fly transcoding of a requested segment is remarkably decreased by utilizing its corresponding metadata; unnecessary search processes are avoided. Popular video segments/bitrates, by contrast, are still stored. We investigate our approach for Video-on-Demand (VoD) streaming services by optimizing storage and computation (transcoding) costs at the edge servers and then compare it to conventional methods (store all bitrates, partial transcoding). The results indicate that our approach reduces the transcoding time by at least 80% and decreases the aforementioned costs by 12% to 70% compared to the state-of-the-art approaches.
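The store-versus-transcode trade-off described above might be sketched as a per-representation cost comparison. The cost model below is a deliberately simple assumption for illustration (linear storage cost, fixed cost per transcoding request); the function name `plan_storage`, the dictionary fields, and all numbers are hypothetical and do not reproduce the optimization formulated in the paper.

```python
def plan_storage(representations, storage_cost_per_mb, transcode_cost_per_request):
    """Decide, per bitrate, whether to store it or keep only metadata and transcode.

    Each representation is a dict with the hypothetical keys 'bitrate_kbps',
    'size_mb', 'metadata_mb' and 'expected_requests'.
    """
    highest = max(representations, key=lambda r: r["bitrate_kbps"])
    plan = {}
    for rep in representations:
        if rep is highest:
            # The highest bitrate is always kept: it is the transcoding source.
            plan[rep["bitrate_kbps"]] = "store"
            continue
        # Cost of keeping the full representation at the edge.
        store_cost = rep["size_mb"] * storage_cost_per_mb
        # Cost of keeping only metadata and transcoding on every request.
        transcode_cost = (
            rep["metadata_mb"] * storage_cost_per_mb
            + rep["expected_requests"] * transcode_cost_per_request
        )
        plan[rep["bitrate_kbps"]] = (
            "store" if store_cost <= transcode_cost else "metadata+transcode"
        )
    return plan


segment = [
    {"bitrate_kbps": 6000, "size_mb": 6.0, "metadata_mb": 0.0, "expected_requests": 50},
    {"bitrate_kbps": 3000, "size_mb": 3.0, "metadata_mb": 0.1, "expected_requests": 40},
    {"bitrate_kbps": 1000, "size_mb": 1.0, "metadata_mb": 0.05, "expected_requests": 1},
]
plan = plan_storage(segment, storage_cost_per_mb=1.0, transcode_cost_per_request=0.2)
print(plan)
```

With these placeholder numbers the frequently requested 3000 kbps representation is stored, while the rarely requested 1000 kbps one is served by metadata-assisted transcoding from the highest bitrate.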

Keywords: Video streaming, transcoding, video on demand, edge computing.

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Monday, September 6, 2021

CTU Depth Decision Algorithms for HEVC: A Survey

Ekrem Çetinkaya*, Hadi Amirpour*, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

*These authors contributed equally to this work.

Abstract: High Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools at the cost of an increased encoding time-complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with a predetermined size of up to 64 × 64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also increases the time complexity because many more candidate partitionings must be searched to find the optimal one. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches during CTU partitioning by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the available information within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods like support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
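To give a feel for the size of the search space, the sketch below counts the distinct quadtree partitionings of a single CTU and shows how restricting the allowed depth range shrinks that count. It assumes a 64 × 64 CTU with a minimum CU size of 8 × 8 (depths 0 to 3) and counts only the CU quadtree, ignoring prediction and transform unit choices; the function names are hypothetical and not taken from the survey.

```python
def count_partitionings(cu_size, min_cu_size=8):
    """Number of distinct quadtree partitionings of a square CU.

    A CU is either kept whole (1 way) or split into four sub-CUs that are
    partitioned independently of one another.
    """
    if cu_size <= min_cu_size:
        return 1  # smallest CU: cannot be split further
    return 1 + count_partitionings(cu_size // 2, min_cu_size) ** 4


def count_with_depth_range(depth, min_depth, max_depth):
    """Partitionings in which every leaf CU has a depth in [min_depth, max_depth]."""
    if depth == max_depth:
        return 1  # splitting must stop at the upper depth limit
    split = count_with_depth_range(depth + 1, min_depth, max_depth) ** 4
    keep_whole = 1 if depth >= min_depth else 0  # stopping early only if allowed
    return keep_whole + split


full_search = count_partitionings(64)             # depths 0..3 all allowed
limited_search = count_with_depth_range(0, 1, 2)  # depths restricted to 1..2
print(full_search, limited_search)
```

Under these assumptions the unrestricted count is 1 + 17^4 = 83522 partitionings per CTU, whereas limiting the depth range to 1..2 leaves only 16, which illustrates why narrowing the depth range can save so much encoder search time.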

Keywords: HEVC, Coding Tree Unit, Complexity, CTU Partitioning, Statistics, Machine Learning

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.


Sunday, August 22, 2021

Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards

Proceedings of the IEEE, vol. 109, no. 9, Sept. 2021

By CHRISTIAN TIMMERER, Senior Member IEEE
Guest Editor
MATHIAS WIEN, Member IEEE
Guest Editor
LU YU, Senior Member IEEE
Guest Editor
AMY REIBMAN, Fellow IEEE
Guest Editor


Abstract: Multimedia content (i.e., video, image, audio) is responsible for the majority of today’s Internet traffic, and its share is expected to grow beyond 80% in the near future. For more than 30 years, international standards have provided tools for interoperability and have been both source and sink for challenging research activities in the domain of multimedia compression and system technologies. The goal of this special issue is to review those standards and focus on (i) the technology developed in the context of these standards and (ii) research questions addressing aspects of these standards which are left open for competition by both academia and industry.

Index Terms—Open Media Standards, MPEG, JPEG, JVET, AOM, Computational Complexity

C. Timmerer, M. Wien, L. Yu and A. Reibman, "Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1423-1434, Sept. 2021, doi: 10.1109/JPROC.2021.3098048.


A Technical Overview of AV1

J. Han et al., "A Technical Overview of AV1," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1435-1462, Sept. 2021, doi: 10.1109/JPROC.2021.3058584.

Abstract: The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.

Developments in International Video Coding Standardization After AVC, With an Overview of Versatile Video Coding (VVC)

B. Bross, J. Chen, J. -R. Ohm, G. J. Sullivan and Y. -K. Wang, "Developments in International Video Coding Standardization After AVC, With an Overview of Versatile Video Coding (VVC)," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1463-1493, Sept. 2021, doi: 10.1109/JPROC.2020.3043399.

Abstract: In the last 17 years, since the finalization of the first version of the now-dominant H.264/Moving Picture Experts Group-4 (MPEG-4) Advanced Video Coding (AVC) standard in 2003, two major new generations of video coding standards have been developed. These include the standards known as High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC). HEVC was finalized in 2013, repeating the ten-year cycle time set by its predecessor and providing about 50% bit-rate reduction over AVC. The cycle was shortened by three years for the VVC project, which was finalized in July 2020, yet again achieving about a 50% bit-rate reduction over its predecessor (HEVC). This article summarizes these developments in video coding standardization after AVC. It especially focuses on providing an overview of the first version of VVC, including comparisons against HEVC. Besides further advances in hybrid video compression, as in previous development cycles, the broad versatility of the application domain that is highlighted in the title of VVC is explained. Included in VVC is the support for a wide range of applications beyond the typical standard- and high-definition camera-captured content codings, including features to support computer-generated/screen content, high dynamic range content, multilayer and multiview coding, and support for immersive media such as 360° video.
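The two generational gains quoted above compound multiplicatively rather than additively. As a small worked example, and assuming the two "about 50%" figures can simply be chained at comparable quality (a simplification, since both are approximate averages), the helper below computes the combined reduction; the function name is hypothetical.

```python
def cumulative_reduction(*step_reductions):
    """Combine successive fractional bit-rate reductions into one overall reduction."""
    remaining = 1.0
    for reduction in step_reductions:
        remaining *= 1.0 - reduction  # each generation keeps this share of the bit rate
    return 1.0 - remaining


# AVC -> HEVC (about 50%) followed by HEVC -> VVC (about 50%).
vvc_vs_avc = cumulative_reduction(0.5, 0.5)
print(f"Approximate VVC bit-rate reduction over AVC: {vvc_vs_avc:.0%}")
```

Under that assumption, VVC would need roughly a quarter of the AVC bit rate for the same quality, i.e., a reduction of about 75% rather than 100%.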

Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies

D. Ding, Z. Ma, D. Chen, Q. Chen, Z. Liu and F. Zhu, "Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1494-1520, Sept. 2021, doi: 10.1109/JPROC.2021.3059994.

Abstract: Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of a subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On postprocessing, we demonstrate two neural adaptive filters to, respectively, facilitate the in-loop and postfiltering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/.

MPEG Immersive Video Coding Standard

J. M. Boyce et al., "MPEG Immersive Video Coding Standard," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1521-1536, Sept. 2021, doi: 10.1109/JPROC.2021.3062590.

Abstract: This article introduces the ISO/IEC MPEG Immersive Video (MIV) standard, MPEG-I Part 12, which is undergoing standardization. The draft MIV standard provides support for viewing immersive volumetric content captured by multiple cameras with six degrees of freedom (6DoF) within a viewing space that is determined by the camera arrangement in the capture rig. The bitstream format and decoding processes of the draft specification along with aspects of the Test Model for Immersive Video (TMIV) reference software encoder, decoder, and renderer are described. The use cases, test conditions, quality assessment methods, and experimental results are provided. In the TMIV, multiple texture and geometry views are coded as atlases of patches using a legacy 2-D video codec, while optimizing for bitrate, pixel rate, and quality. The design of the bitstream format and decoder is based on the visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC) standard, MPEG-I Part 5.

Compression of Sparse and Dense Dynamic Point Clouds—Methods and Standards

C. Cao, M. Preda, V. Zakharchenko, E. S. Jang and T. Zaharia, "Compression of Sparse and Dense Dynamic Point Clouds—Methods and Standards," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1537-1558, Sept. 2021, doi: 10.1109/JPROC.2021.3085957.

Abstract: In this article, a survey of the point cloud compression (PCC) methods by organizing them with respect to the data structure, coding representation space, and prediction strategies is presented. Two paramount families of approaches reported in the literature—the projection- and octree-based methods—are proven to be efficient for encoding dense and sparse point clouds, respectively. These approaches are the pillars on which the Moving Picture Experts Group Committee developed two PCC standards published as final international standards in 2020 and early 2021, respectively, under the names: video-based PCC and geometry-based PCC. After surveying the current approaches for PCC, the technologies underlying the two standards are described in detail from an encoder perspective, providing guidance for potential standard implementors. In addition, experiment evaluations in terms of compression performances for both solutions are provided.

JPEG XS—A New Standard for Visually Lossless Low-Latency Lightweight Image Coding

A. Descampe et al., "JPEG XS—A New Standard for Visually Lossless Low-Latency Lightweight Image Coding," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1559-1577, Sept. 2021, doi: 10.1109/JPROC.2021.3080916.

Abstract: Joint Photographic Experts Group (JPEG) XS is a new International Standard from the JPEG Committee (formally known as ISO/International Electrotechnical Commission (IEC) JTC1/SC29/WG1). It defines an interoperable, visually lossless low-latency lightweight image coding that can be used for mezzanine compression within any AV market. Among the targeted use cases, one can cite video transport over professional video links (serial digital interface (SDI), internet protocol (IP), and Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example, in cameras and the automotive industry). The core coding system is composed of an optional color transform, a wavelet transform, and a novel entropy encoder, processing groups of coefficients by coding their magnitude level and packing the magnitude refinement. Such a design allows for visually transparent quality at moderate compression ratios, scalable end-to-end latency that ranges from less than one line to a maximum of 32 lines of the image, and a low-complexity real-time implementation in application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), central processing unit (CPU), and graphics processing unit (GPU). This article details the key features of this new standard and the profiles and formats that have been defined so far for the various applications. It also gives a technical description of the core coding system. Finally, the latest performance evaluation results of recent implementations of the standard are presented, followed by the current status of the ongoing standardization process and future milestones.

MPEG Standards for Compressed Representation of Immersive Audio

S. R. Quackenbush and J. Herre, "MPEG Standards for Compressed Representation of Immersive Audio," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1578-1589, Sept. 2021, doi: 10.1109/JPROC.2021.3075390.

Abstract: The term “immersive audio” is frequently used to describe an audio experience that provides the listener the sensation of being fully immersed or “present” in a sound scene. This can be achieved via different presentation modes, such as surround sound (several loudspeakers horizontally arranged around the listener), 3D audio (with loudspeakers at, above, and below listener ear level), and binaural audio to headphones. This article provides an overview of two recent standards that support the bitrate-efficient carriage of high-quality immersive sound. The first is MPEG-H 3D audio, which is a versatile standard that supports multiple immersive sound signal formats (channels, objects, and higher order ambisonics) and is now being adopted in broadcast and streaming applications. The second is MPEG-I immersive audio, an extension of 3D audio, currently under development, which is targeted for virtual and augmented reality applications. This will support rendering of fully user-interactive immersive sound for three degrees of user movement [three degrees of freedom (3DoF)], i.e., yaw, pitch, and roll head movement, and for six degrees of user movement [six degrees of freedom (6DoF)], i.e., 3DoF plus translational x, y, and z user position movements.

An Overview of Omnidirectional MediA Format (OMAF)

M. M. Hannuksela and Y. -K. Wang, "An Overview of Omnidirectional MediA Format (OMAF)," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1590-1606, Sept. 2021, doi: 10.1109/JPROC.2021.3063544.

Abstract: During recent years, there have been product launches and research for enabling immersive audio–visual media experiences. For example, a variety of head-mounted displays and 360° cameras are available in the market. To facilitate interoperability between devices and media system components by different vendors, the Moving Picture Experts Group (MPEG) developed the Omnidirectional MediA Format (OMAF), which is arguably the first virtual reality (VR) system standard. OMAF is a storage and streaming format for omnidirectional media, including 360° video and images, spatial audio, and associated timed text. This article provides a comprehensive overview of OMAF.

An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data

J. Voges, M. Hernaez, M. Mattavelli and J. Ostermann, "An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data," in Proceedings of the IEEE, vol. 109, no. 9, pp. 1607-1622, Sept. 2021, doi: 10.1109/JPROC.2021.3082027.

Abstract: The development and progress of high-throughput sequencing technologies have transformed the sequencing of DNA from a scientific research challenge to practice. With the release of the latest generation of sequencing machines, the cost of sequencing a whole human genome has dropped to less than $600. Such achievements open the door to personalized medicine, where it is expected that genomic information of patients will be analyzed as a standard practice. However, the associated costs, related to storing, transmitting, and processing the large volumes of data, are already comparable to the costs of sequencing. To support the design of new and interoperable solutions for the representation, compression, and management of genomic sequencing data, the Moving Picture Experts Group (MPEG) jointly with working group 5 of ISO/TC276 “Biotechnology” has started to produce the ISO/IEC 23092 series, known as MPEG-G. MPEG-G does not only offer higher levels of compression compared with the state of the art but it also provides new functionalities, such as built-in support for random access in the compressed domain, support for data protection mechanisms, flexible storage, and streaming capabilities. MPEG-G only specifies the decoding syntax of compressed bitstreams, as well as a file format and a transport format. This allows for the development of new encoding solutions with higher degrees of optimization while maintaining compatibility with any existing MPEG-G decoder.