Multimedia Communication: cdgathena

Monday, May 22, 2023

Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming

2022 Picture Coding Symposium (PCS)

December 7-9, 2022 | San Jose, CA, USA

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Prajit T Rajendran (Universite Paris-Saclay, France), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: In live streaming applications, a fixed set of bitrate-resolution pairs (known as bitrate ladder) is generally used to avoid additional pre-processing run-time to analyze the complexity of every video content and determine the optimized bitrate ladder. Furthermore, live encoders use the fastest available preset for encoding to ensure the minimum possible latency in streaming. For live encoders, it is expected that the encoding speed is equal to the video framerate. However, an optimized encoding preset may result in (i) increased Quality of Experience (QoE) and (ii) improved CPU utilization while encoding. In this light, this paper introduces a Content-Adaptive encoder Preset prediction Scheme (CAPS) for adaptive live video streaming applications. In this scheme, the encoder preset is determined using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for every video segment, the number of CPU threads allocated for each encoding instance, and the target encoding speed. Experimental results show that CAPS yields an overall quality improvement of 0.83 dB PSNR and 3.81 VMAF with the same bitrate, compared to the fastest preset encoding of the HTTP Live Streaming (HLS) bitrate ladder using x265 HEVC open-source encoder. This is achieved by maintaining the desired encoding speed and reducing CPU idle time.
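As a rough illustration of the selection step described in the abstract, the sketch below picks the slowest x265 preset whose predicted encoding speed still meets the target speed. The per-preset speed predictions are assumed to come from some model fed with the segment features and thread count; that model, the function name `select_preset`, and the fallback rule are hypothetical and are not taken from the paper.

```python
# x265 preset names, ordered from fastest to slowest ("placebo" omitted).
X265_PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
                "medium", "slow", "slower", "veryslow"]


def select_preset(predicted_fps, target_fps):
    """Return the slowest preset whose predicted speed meets target_fps.

    predicted_fps maps a preset name to a predicted encoding speed in fps
    for the current segment. How these predictions are obtained is outside
    this sketch (assumption: a model using segment features and threads).
    If no preset meets the target, the fastest preset is returned.
    """
    chosen = X265_PRESETS[0]
    for preset in X265_PRESETS:
        if predicted_fps.get(preset, 0.0) >= target_fps:
            chosen = preset  # later entries are slower, so keep overwriting
    return chosen


# Example with made-up predictions for one segment.
example_prediction = {
    "ultrafast": 120.0, "superfast": 90.0, "veryfast": 70.0,
    "faster": 45.0, "fast": 31.0, "medium": 24.0,
    "slow": 12.0, "slower": 5.0, "veryslow": 3.0,
}
example_choice = select_preset(example_prediction, target_fps=30.0)
```

With the made-up numbers above, a 30 fps target selects `fast`, since `medium` and slower presets are predicted to fall below the target speed.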

Acknowledgements: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Friday, May 19, 2023

IEEE TIP: Advanced Scalability for Light Field Image Coding

IEEE Transactions on Image Processing (TIP)

Journal Website

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Christine Guillemot (INRIA, France), Mohammad Ghanbari (University of Essex, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: Light field imaging, which captures both spatial and angular information, improves user immersion by enabling post-capture actions, such as refocusing and changing view perspective. However, light fields represent very large volumes of data with a lot of redundancy that coding methods try to remove. State-of-the-art coding methods indeed usually focus on improving compression efficiency and overlook other important features in light field compression such as scalability. In this paper, we propose a novel light field image compression method that enables (i) viewport scalability, (ii) quality scalability, (iii) spatial scalability, (iv) random access, and (v) uniform quality distribution among viewports, while keeping compression efficiency high. To this end, light fields in each spatial resolution are divided into sequential viewport layers, and viewports in each layer are encoded using the previously encoded viewports. In each viewport layer, the available viewports are used to synthesize intermediate viewports using a video interpolation deep learning network. The synthesized views are used as virtual reference images to enhance the quality of intermediate views. An image super-resolution method is applied to improve the quality of the lower spatial resolution layer. The super-resolved images are also used as virtual reference images to improve the quality of the higher spatial resolution layer.
The proposed structure also improves the flexibility of light field streaming, provides random access to the viewports, and increases error resiliency. The experimental results demonstrate that the proposed method achieves a high compression efficiency and it can adapt to the display type, transmission channel, network condition, processing power, and user needs.

Keywords—Light field, compression, scalability, random access, deep learning.

Tuesday, April 11, 2023

Hybrid P2P-CDN Architecture for Live Video Streaming: An Online Learning Approach

IEEE Global Communications Conference (GLOBECOM)

December 4-8, 2022 | Rio de Janeiro, Brazil

Conference Website

[PDF][Slides]

Reza Farahani (Alpen-Adria-Universität Klagenfurt, Austria), Abdelhak Bentaleb (National University of Singapore, Singapore), Ekrem Cetinkaya (Alpen-Adria-Universität Klagenfurt, Austria), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria), Roger Zimmermann (National University of Singapore, Singapore), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: A cost-effective, scalable, and flexible architecture that supports low latency and high-quality live video streaming is still a challenge for Over-The-Top (OTT) service providers. To cope with this issue, this paper leverages Peer-to-Peer (P2P), Content Delivery Network (CDN), edge computing, Network Function Virtualization (NFV), and distributed video transcoding paradigms to introduce a hybRId P2P-CDN arcHiTecture for livE video stReaming (RICHTER). We first introduce RICHTER’s multi-layer architecture and design an action tree that considers all feasible resources provided by peers, edge, and CDN servers for serving peer requests with minimum latency and maximum quality. We then formulate the problem as an optimization model executed at the edge of the network. We present an Online Learning (OL) approach that leverages an unsupervised Self Organizing Map (SOM) to (i) alleviate the time complexity issue of the optimization model and (ii) make it a suitable solution for large-scale scenarios by enabling decisions for groups of requests instead of for single requests. Finally, we implement the RICHTER framework, conduct our experiments on a large-scale cloud-based testbed including 350 HAS players, and compare its effectiveness with baseline systems. The experimental results illustrate that RICHTER outperforms baseline schemes in terms of users’ Quality of Experience (QoE), latency, and network utilization, by at least 59%, 39%, and 70%, respectively.

Index Terms—HAS; Edge Computing; NFV; CDN; P2P; Low Latency; QoE; Video Transcoding; Online Learning.

Acknowledgments: The financial support of the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association, is gratefully acknowledged. Christian Doppler Laboratory ATHENA: https://athena.itec.aau.at/.

Friday, April 7, 2023

Elsevier Signal Processing: Reversible Data Hiding for Color Images Based on Pixel Value Order of Overall Process Channel Correlation

Elsevier Signal Processing

[PDF]

Journal Website

Ningxiong Mao (Southwest Jiaotong University), Hongjie He (Southwest Jiaotong University), Fan Chen (Southwest Jiaotong University), Lingfeng Qu (Southwest Jiaotong University), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract:

Color image Reversible Data Hiding (RDH) is becoming increasingly important as the number of its applications steadily grows. This paper proposes an efficient color image RDH scheme based on pixel value ordering (PVO), in which the channel correlation is fully utilized to improve the embedding performance. In the proposed method, the channel correlation is used in the overall process of data embedding, including the prediction stage, block selection, and capacity allocation. In the prediction stage, since the pixel values in the co-located blocks in different channels are monotonically consistent, the large pixel values are collected preferentially by pre-sorting the intra-block pixels. This can effectively improve the embedding capacity of RDH based on PVO. In the block selection stage, the description accuracy of the block complexity value is improved by exploiting the texture similarity between the channels. Smoother blocks are then used preferentially to reduce invalid shifts. To achieve low complexity and high accuracy in capacity allocation, the proportion of the expanded prediction error to the total expanded prediction error in each channel is calculated during the capacity allocation process. The experimental results show that the proposed scheme achieves significantly higher fidelity than a series of state-of-the-art schemes. For example, the PSNR of the Lena image reaches 62.43 dB, which is a 0.16 dB gain compared to the best results in the literature at an embedding capacity of 20,000 bits.
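For readers unfamiliar with pixel value ordering, the following minimal sketch shows the classic single-channel PVO step that schemes like this build on: the largest pixel in a block is predicted by the second largest, a prediction error of 1 carries one bit, and larger errors are shifted. This is not the color-channel method proposed in the paper; function names are hypothetical, and overflow handling for pixels at 255 is left out.

```python
def _max_and_second(block):
    """Indices of the largest and second-largest pixel (ties broken by index)."""
    order = sorted(range(len(block)), key=lambda i: (block[i], i))
    return order[-1], order[-2]


def pvo_embed_max(block, bit):
    """Try to embed one bit into the largest pixel of a block (classic PVO).

    Returns (marked_block, used); used is True only if the bit was embedded.
    Assumption: no pixel is at the maximum value, so overflow is ignored.
    """
    i_max, i_second = _max_and_second(block)
    error = block[i_max] - block[i_second]
    marked = list(block)
    if error == 1:
        marked[i_max] += bit   # expand: error becomes 1 (bit 0) or 2 (bit 1)
        return marked, True
    if error > 1:
        marked[i_max] += 1     # shift: keeps the block decodable, no payload
    return marked, False       # error == 0 leaves the block unchanged


def pvo_extract_max(marked):
    """Recover the original block and the embedded bit (None if no bit)."""
    i_max, i_second = _max_and_second(marked)
    error = marked[i_max] - marked[i_second]
    restored = list(marked)
    if error in (1, 2):
        bit = error - 1
        restored[i_max] -= bit
        return restored, bit
    if error > 2:
        restored[i_max] -= 1   # undo the shift
    return restored, None


original = [52, 50, 51, 49]
marked, used = pvo_embed_max(original, 1)
restored, bit = pvo_extract_max(marked)
```

Because only the maximum is ever increased, the ordering of the block is preserved, which is what lets the decoder find the same two pixels and undo the change exactly.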

Keywords—Reversible data hiding, color image, pixel value ordering, channel correlation

Thursday, April 6, 2023

VCD: Video Complexity Dataset

The 13th ACM Multimedia Systems Conference (ACM MMSys 2022) Open Dataset and Software (ODS) track

June 14–17, 2022 | Athlone, Ireland

Conference Website
[PDF]

Hadi Amirpour, Vignesh V Menon, Samira Afzal, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: This paper provides an overview of the open Video Complexity Dataset (VCD) which comprises 500 Ultra High Definition (UHD) resolution test video sequences. These sequences are provided at 24 frames per second (fps) and stored online in losslessly encoded 8-bit 4:2:0 format. In this paper, all sequences are characterized by spatial and temporal complexities, rate-distortion complexity, and encoding complexity with the x264 AVC/H.264 and x265 HEVC/H.265 video encoders. The dataset is tailor-made for cutting-edge multimedia applications such as video streaming, two-pass encoding, per-title encoding, scene-cut detection, etc. Evaluations show that the dataset includes diversity in video complexities. Hence, using this dataset is recommended for training and testing video coding applications. All data have been made publicly available as part of the dataset, which can be used for various applications.

The details of VCD can be accessed online at https://vcd.itec.aau.at.

Wednesday, April 5, 2023

LEADER: A Collaborative Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming

IEEE International Conference on Communications (ICC)

May 16–20, 2022 | Seoul, South Korea

Conference Website
[PDF][Slides][Video]

Reza Farahani (Alpen-Adria-Universität Klagenfurt), Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt).

Abstract: With the emerging demands of high-definition and low-latency video streams, HTTP Adaptive Streaming (HAS) is considered the principal video delivery technology over the Internet. Network-assisted video streaming schemes, which employ modern networking paradigms, e.g., Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing, have been introduced as promising complementary solutions in the HAS context to improve users' Quality of Experience (QoE) as well as network utilization. However, the existing network-assisted HAS schemes have not fully used edge collaboration techniques and SDN capabilities for achieving the aforementioned aims. To bridge this gap, this paper introduces a coLlaborative Edge- and SDN-Assisted framework for HTTP aDaptive vidEo stReaming (LEADER). In LEADER, the SDN controller collects various information items and runs a central optimization model that minimizes the HAS clients' serving time, subject to the network's and edge servers' resource constraints. Due to the NP-completeness and impractical overheads of the central optimization model, we propose an online distributed lightweight heuristic approach consisting of two phases that runs over the SDN controller and edge servers, respectively. We implement the proposed framework, conduct our experiments on a large-scale testbed including 250 HAS players, and compare its effectiveness with other strategies. The experimental results demonstrate that LEADER outperforms baseline schemes in terms of both users' QoE and network utilization, by at least 22% and 13%, respectively.

Keywords:

Dynamic Adaptive Streaming over HTTP (DASH), Network-Assisted Video Streaming, Video Transcoding, Quality of Experience (QoE), Software-Defined Networking (SDN), Network Function Virtualization (NFV), Edge Computing, Edge Collaboration.

Tuesday, April 4, 2023

EMES: Efficient Multi-Encoding Schemes for HEVC-based Adaptive Bitrate Streaming

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)

Journal Website

[PDF]

Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: In HTTP Adaptive Streaming (HAS), videos are encoded at multiple bitrates and spatial resolutions (i.e., representations) to adapt to the heterogeneity of network conditions, device attributes, and end-user preferences. Encoding the same video segment at multiple representations increases costs for content providers. State-of-the-art multi-encoding schemes improve the encoding process by utilizing encoder analysis information from already encoded representation(s) to reduce the encoding time of the remaining representations. These schemes typically use the highest bitrate representation as the reference to accelerate the encoding of the remaining representations. Nowadays, most streaming services utilize cloud-based encoding techniques, enabling a fully parallel encoding process to reduce the overall encoding time. The highest bitrate representation has a higher encoding time than the other representations. Thus, utilizing it as the reference encoding is unfavorable in a parallel encoding setup as the overall encoding time is bound by its encoding time. This paper provides a comprehensive study of various multi-rate and multi-encoding schemes in both serial and parallel encoding scenarios. Furthermore, it introduces novel heuristics to limit the Rate Distortion Optimization (RDO) process across various representations. Based on these heuristics, three multi-encoding schemes are proposed, which rely on encoder analysis sharing across different representations: (i) optimized for the highest compression efficiency, (ii) optimized for the best compression efficiency-encoding time savings trade-off, and (iii) optimized for the best encoding time savings. Experimental results demonstrate that the proposed multi-encoding schemes (i), (ii), and (iii) reduce the overall serial encoding time by 34.71%, 45.27%, and 68.76% with a 2.3%, 3.1%, and 4.5% bitrate increase to maintain the same VMAF, respectively, compared to stand-alone encodings.
The overall parallel encoding time is reduced by 22.03%, 20.72%, and 76.82% compared to stand-alone encodings for schemes (i), (ii), and (iii), respectively.

An example of video representations’ storage in HAS. The input video is encoded at multiple resolutions and bitrates. Novel multi-rate and multi-resolution encoder analysis sharing methods are presented to accelerate encoding in more than one representation.

Monday, April 3, 2023

IEEE TCSVT: DeepStream: Video Streaming Enhancements using Compressed Deep Neural Networks

IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT)

Journal Website

[PDF]

Hadi Amirpour, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria

Abstract: In HTTP Adaptive Streaming (HAS), each video is divided into smaller segments, and each segment is encoded at multiple pre-defined bitrates to construct a bitrate ladder. To optimize bitrate ladders, per-title encoding approaches encode each segment at various bitrates and resolutions to determine the convex hull. From the convex hull, an optimized bitrate ladder is constructed, resulting in an increased Quality of Experience (QoE) for end-users. With the ever-increasing efficiency of deep learning-based video enhancement approaches, they are increasingly employed at the client-side to increase the QoE, specifically when GPU capabilities are available. Therefore, scalable approaches are needed to support end-user devices with both CPU and GPU capabilities (denoted as CPU-only and GPU-available end-users, respectively) as a new dimension of a bitrate ladder. To address this need, we propose DeepStream, a scalable content-aware per-title encoding approach to support both CPU-only and GPU-available end-users. (i) To support backward compatibility, DeepStream constructs a bitrate ladder based on any existing per-title encoding approach. Therefore, the video content will be provided for legacy end-user devices with CPU-only capabilities as a base layer (BL). (ii) For high-end end-user devices with GPU capabilities, an enhancement layer (EL) is added on top of the base layer comprising lightweight video super-resolution deep neural networks (DNNs) for each bitrate-resolution pair of the bitrate ladder. A content-aware video super-resolution approach leads to higher video quality, however, at the cost of bitrate overhead. To reduce the bitrate overhead for streaming content-aware video super-resolution DNNs, DeepCABAC, context-adaptive binary arithmetic coding for DNN compression, is used. Furthermore, the similarity among (i) segments within a scene and (ii) frames within a segment is used to reduce the training costs of DNNs.
Experimental results show bitrate savings of 34% and 36% to maintain the same PSNR and VMAF, respectively, for GPU-available end-users, while the CPU-only users get the desired video content as usual.

Keywords—HTTP adaptive streaming, per-title encoding, video streaming, video super-resolution

Monday, March 27, 2023

ETPS: Efficient Two-pass Encoding Scheme for Adaptive Live Streaming

2022 IEEE International Conference on Image Processing (ICIP)

October 16-19, 2022 | Bordeaux, France

Conference Website

[PDF][Slides][Video]

Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: In two-pass encoding, also known as multi-pass encoding, the input video content is analyzed in the first-pass to help the second-pass encoding utilize better encoding decisions and improve overall compression efficiency. In live streaming applications, a single-pass encoding scheme is mainly used to avoid the additional first-pass encoding run-time to analyze the complexity of every video content. This paper introduces an Efficient low-latency Two-Pass encoding Scheme (ETPS) for live video streaming applications. In this scheme, Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for every video segment are extracted in the first-pass to predict each target bitrate’s optimal constant rate factor (CRF) for the second-pass constrained variable bitrate (cVBR) encoding. Experimental results show that, on average, ETPS compared to a traditional two-pass average bitrate encoding scheme yields encoding time savings of 43.78% without any noticeable drop in compression efficiency. Additionally, compared to a single-pass constant bitrate (CBR) encoding, it yields bitrate savings of 10.89% and 8.60% to maintain the same PSNR and VMAF, respectively.

Friday, March 24, 2023

Light-weight Video Encoding Complexity Prediction using Spatio Temporal Features

2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP)

September 26-28, 2022 | Shanghai, China

Conference Website

[PDF]

Hadi Amirpour*, Prajit T Rajendran (Universite Paris-Saclay, Paris, France), Vignesh V Menon*, Mohammad Ghanbari*, and Christian Timmerer*
* ... Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: The increasing demand for high-quality and low-cost video streaming services calls for predicting video encoding complexity. The prior prediction of video encoding complexity, including encoding time and bitrate predictions, is used to allocate resources and set optimized parameters for video encoding effectively. In this paper, a light-weight video encoding complexity prediction (VECP) scheme that predicts the encoding bitrate and the encoding time of video with high accuracy is proposed. Firstly, low-complexity Discrete Cosine Transform (DCT)-energy-based features, namely spatial complexity, temporal complexity, and brightness of videos are extracted, which can efficiently represent the encoding complexity of videos. The latent vectors are also extracted from a Convolutional Neural Network (CNN) with MobileNet as the backend to obtain additional features from representative frames of each video to assist the prediction process. The extreme gradient boosting (XGBoost) regression algorithm is deployed to predict video encoding complexity using the extracted features. The experimental results demonstrate that VECP predicts the encoding bitrate with an error percentage of up to 3.47% and encoding time with an error percentage of up to 2.89%, with a low overall latency of 3.5 milliseconds per frame, which makes it suitable for both Video on Demand (VoD) and live streaming applications.

Saturday, March 4, 2023

CADLAD: Device-aware Bitrate Ladder Construction for HTTP Adaptive Streaming

18th International Conference on Network and Service Management (CNSM 2022)

Thessaloniki, Greece | 31 October - 4 November 2022

Conference Website

[PDF][Slides]

Minh Nguyen (Alpen-Adria-Universität Klagenfurt, Austria), Babak Taraghi (Alpen-Adria-Universität Klagenfurt, Austria), Abdelhak Bentaleb (National University of Singapore, Singapore), Roger Zimmermann (National University of Singapore, Singapore), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: Considering network conditions, video content, and viewer device type/screen resolution to construct a bitrate ladder is necessary to deliver the best Quality of Experience (QoE). A large-screen device like a TV needs a high bitrate with high resolution to provide good visual quality, whereas a small one like a phone requires a low bitrate with low resolution. In addition, encoding high-quality levels at the server side while the network is unable to deliver them causes unnecessary cost for the content provider. Recently, the Common Media Client Data (CMCD) standard has been proposed, which defines the data that is collected at the client and sent to the server with its HTTP requests. This data is useful in log analysis, quality of service/experience monitoring and delivery improvements.

In this paper, we introduce a CMCD-Aware per-Device bitrate LADder construction (CADLAD) that leverages CMCD to address the above issues. CADLAD comprises components at both client and server sides. The client calculates the top bitrate (tb), a CMCD parameter that indicates the highest bitrate that can be rendered at the client, and sends it to the server together with its device type and screen resolution. The server decides on a suitable bitrate ladder for the client device, whose maximum bitrate and resolution are based on the CMCD parameters, with the purpose of providing maximum QoE while minimizing delivered data. CADLAD has two versions to work in Video on Demand (VoD) and live streaming scenarios. Our CADLAD is client agnostic; hence, it can work with any players and ABR algorithms at the client. The experimental results show that CADLAD is able to increase the QoE by 2.6x while saving 71% of delivered data, compared to an existing bitrate ladder of an available video dataset. We implement our idea within CAdViSE, an open-source testbed, for reproducibility.
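The server-side idea of bounding the ladder by what the client reports can be sketched as a simple filter. The snippet below is only an illustration under assumptions: the `Rung` type, the `cap_ladder` function, the use of kbps and pixel height as units, and the fallback to the lowest rung are all hypothetical and do not describe CADLAD's actual decision logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    bitrate_kbps: int  # encoded bitrate of the representation
    height: int        # vertical resolution in pixels


def cap_ladder(ladder, top_bitrate_kbps, screen_height):
    """Keep only rungs the client can use, sorted by ascending bitrate.

    top_bitrate_kbps stands in for the client-reported top bitrate (tb) and
    screen_height for its reported screen resolution. If nothing fits, the
    lowest-bitrate rung is kept so the client can still play (assumption).
    """
    kept = [r for r in ladder
            if r.bitrate_kbps <= top_bitrate_kbps and r.height <= screen_height]
    if not kept:
        kept = [min(ladder, key=lambda r: r.bitrate_kbps)]
    return sorted(kept, key=lambda r: r.bitrate_kbps)


full_ladder = [Rung(145, 234), Rung(730, 360), Rung(2000, 540),
               Rung(4500, 720), Rung(7800, 1080), Rung(16800, 2160)]
# A phone reporting a 720p screen and a top bitrate of 5000 kbps.
phone_ladder = cap_ladder(full_ladder, top_bitrate_kbps=5000, screen_height=720)
```

In this example the phone receives only the four rungs up to 720p, so the 1080p and 2160p representations never need to be delivered to it.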

Friday, March 3, 2023

Low Latency Live Streaming Implementation in DASH and HLS

ACM Multimedia Conference - OSS Track

Lisbon, Portugal | 10-14 October 2022

[PDF]

Abdelhak Bentaleb (National University of Singapore), Zhengdao Zhan (National University of Singapore), Farzad Tashtarian (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), May Lim (National University of Singapore), Saad Harous (University of Sharjah), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hermann Hellwagner (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Roger Zimmermann (National University of Singapore)

Abstract: Low latency live streaming over HTTP using Dynamic Adaptive Streaming over HTTP (LL-DASH) and HTTP Live Streaming (LL-HLS) has emerged as a new way to deliver live content with respectable video quality and short end-to-end latency. Satisfying these requirements while maintaining viewer experience in practice is challenging, and adopting conventional adaptive bitrate (ABR) schemes directly to do so will not work. Therefore, recent solutions including LoL+, L2A, Stallion, and Llama re-think conventional ABR schemes to support low-latency scenarios. These solutions have been integrated with dash.js, which supports LL-DASH. However, their performance in LL-HLS remains in question. To bridge this gap, we implement and integrate existing LL-DASH ABR schemes in the hls.js video player, which supports LL-HLS. Moreover, a series of real-world trace-driven experiments have been conducted to check their efficiency under various network conditions, including a comparison with results achieved for LL-DASH in dash.js.

Thursday, March 2, 2023

LFC-SASR: Light Field Coding Using Spatial and Angular Super-Resolution

ICME Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience (ICMEW)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

[PDF]

Ekrem Çetinkaya, Hadi Amirpour, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Light field imaging enables post-capture actions such as refocusing and changing view perspective by capturing both spatial and angular information. However, capturing richer information about the 3D scene results in a huge amount of data. To improve the compression efficiency of the existing light field compression methods, we investigate the impact of light field super-resolution approaches (both spatial and angular super-resolution) on the compression efficiency. To this end, firstly, we downscale light field images over (i) spatial resolution, (ii) angular resolution, and (iii) spatial-angular resolution and encode them using Versatile Video Coding (VVC). We then apply a set of light field super-resolution deep neural networks to reconstruct light field images in their full spatial-angular resolution and compare their compression efficiency. Experimental results show that encoding the low angular resolution light field image and applying angular super-resolution yield bitrate savings of 51.16% and 53.41% to maintain the same PSNR and SSIM, respectively, compared to encoding the light field image in high resolution.

Keywords: Light field, Compression, Super-resolution, VVC.

Sunday, November 6, 2022

ARARAT: A Collaborative Edge-Assisted Framework for HTTP Adaptive Video Streaming

IEEE Transactions on Network and Service Management (TNSM)

Journal Website

[PDF]

Reza Farahani (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Mohammad Shojafar (University of Surrey, UK), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Farzad Tashtarian (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Mohammad Ghanbari (University of Essex, UK), and Hermann Hellwagner (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: With the ever-increasing demands for high-definition and low-latency video streaming applications, network-assisted video streaming schemes have become a promising complementary solution in the HTTP Adaptive Streaming (HAS) context to improve users’ Quality of Experience (QoE) as well as network utilization. Edge computing is considered one of the leading networking paradigms for designing such systems by providing video processing and caching close to the end-users. Despite the wide usage of this technology, designing network-assisted HAS architectures that support low-latency and high-quality video streaming, including edge collaboration is still a challenge. To address these issues, this article leverages the Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing paradigms to propose A collaboRative edge-Assisted framewoRk for HTTP Adaptive video sTreaming (ARARAT). Aiming at minimizing HAS clients’ serving time and network cost, besides considering available resources and all possible serving actions, we design a multi-layer architecture and formulate the problem as a centralized optimization model executed by the SDN controller. However, to cope with the high time complexity of the centralized model, we introduce three heuristic approaches that produce near-optimal solutions through efficient collaboration between the SDN controller and edge servers. Finally, we implement the ARARAT framework, conduct our experiments on a large-scale cloud-based testbed including 250 HAS players, and compare its effectiveness with state-of-the-art systems within comprehensive scenarios. The experimental results illustrate that the proposed ARARAT methods (i) improve users’ QoE by at least 47%, (ii) decrease the streaming cost, including bandwidth and computational costs, by at least 47%, and (iii) enhance network utilization, by at least 48% compared to state-of-the-art approaches.

Index Terms—HTTP Adaptive Streaming (HAS), Network-Assisted Video Streaming, Software-Defined Networking (SDN), Network Function Virtualization (NFV), Edge Computing, Edge Collaboration, Video Transcoding.

Tuesday, November 1, 2022

DoFP+: An HTTP/3-based Adaptive Bitrate Approach Using Retransmission Techniques

IEEE Access
[PDF]

Minh Nguyen*, Daniele Lorenzi*, Farzad Tashtarian, Hermann Hellwagner, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

(*) Minh Nguyen and Daniele Lorenzi contributed equally to this work

Abstract: HTTP Adaptive Streaming (HAS) solutions use various adaptive bitrate (ABR) algorithms to select suitable video qualities with the objective of coping with the variations of network connections. HTTP has been evolving with various versions and provides more and more features. Most of the existing ABR algorithms do not significantly benefit from the HTTP development when they are merely supported by the most recent HTTP version. An open research question is “How can new features of the recent HTTP versions be used to enhance the performance of HAS?” To address this question, in this paper, we introduce Days of Future Past+ (DoFP+ for short), a heuristic algorithm that takes advantage of the features of the latest HTTP version, HTTP/3, to provide high Quality of Experience (QoE) to the viewers. DoFP+ leverages HTTP/3 features, including (i) stream multiplexing, (ii) stream priority, and (iii) request cancellation to upgrade low-quality segments in the player buffer while downloading the next segment. The qualities of those segments are selected based on an objective function and throughput constraints. The objective function takes into account two factors, namely the (i) average bitrate and the (ii) video instability of the considered set of segments. We also examine different strategies of download order for those segments to optimize the QoE in limited resources scenarios. The experimental results show an improvement in QoE by up to 33% while the number of stalls and stall duration for DoFP+ are reduced by 86% and 92%, respectively, compared to state-of-the-art ABR schemes. In addition, DoFP+ saves on average up to 16% downloaded data across all test videos. Also, we find that downloading segments sequentially brings more benefits for retransmissions than concurrent downloads; and lower-quality segments should be upgraded before other segments to gain more QoE improvement. 
Our source code has been published for reproducibility at https://github.com/cd-athena/DoFP-Plus.

Keywords: HTTP/3, ABR algorithm, QoE, HAS, DASH
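
The selection step described in the abstract (an objective combining average bitrate and instability, subject to a throughput constraint) can be illustrated with a small sketch. This is not the DoFP+ implementation: the function names, the instability definition (mean absolute bitrate change between consecutive segments), the weight, and the exhaustive search are all assumptions made for illustration.

```python
from itertools import product


def instability(bitrates):
    """Mean absolute bitrate change between consecutive segments (assumed definition)."""
    if len(bitrates) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(bitrates, bitrates[1:])]
    return sum(diffs) / len(diffs)


def score(bitrates, weight):
    """Hypothetical objective: average bitrate minus weighted instability."""
    return sum(bitrates) / len(bitrates) - weight * instability(bitrates)


def choose_upgrades(buffered, ladder, budget_kbit, seg_duration_s=4.0, weight=1.0):
    """Pick upgraded bitrates (kbps) for buffered segments by exhaustive search.

    Only upgrades are considered, and re-downloading a segment is assumed to
    cost bitrate * duration kbit, which must fit within budget_kbit in total.
    """
    # Each segment may keep its current bitrate or move to any higher rung.
    options = [[cur] + [b for b in ladder if b > cur] for cur in buffered]
    best, best_score = tuple(buffered), score(buffered, weight)
    for candidate in product(*options):
        cost = sum(new * seg_duration_s
                   for old, new in zip(buffered, candidate) if new != old)
        if cost > budget_kbit:
            continue  # violates the throughput constraint
        s = score(candidate, weight)
        if s > best_score:
            best, best_score = candidate, s
    return list(best)


# A low-quality segment sandwiched between two high-quality ones.
print(choose_upgrades([3000, 1000, 3000], [1000, 2000, 3000], budget_kbit=12000))
```

With a smaller budget the search settles for a partial upgrade, and with no budget it leaves the buffer untouched. Exhaustive search is only practical for a handful of buffered segments; the published source code should be consulted for the actual heuristic.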

Saturday, October 29, 2022

Perceptually-aware Per-title Encoding for Adaptive Video Streaming

2022 IEEE International Conference on Multimedia and Expo (ICME)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

[PDF][Slides][Video]

Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used for simplicity and efficiency in order to avoid the additional encoding run-time required to find optimum resolution-bitrate pairs for every video content. However, an optimized bitrate ladder may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces a perceptually-aware per-title encoding (PPTE) scheme for video streaming applications. In this scheme, optimized bitrate-resolution pairs are predicted online based on Just Noticeable Difference (JND) in quality perception to avoid adding perceptually similar representations to the bitrate ladder. To this end, Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment are used. Experimental results show that, on average, PPTE yields bitrate savings of 16.47% and 27.02% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming, accompanied by a 30.69% cumulative decrease in storage space for various representations.
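
The idea of dropping perceptually similar representations can be sketched as a simple pruning pass. This is an illustration under assumptions, not the PPTE algorithm: the function name, the dictionary keys, the JND threshold of 6 VMAF points, and the example scores are hypothetical, and the sketch takes predicted quality as given rather than deriving it from DCT-energy features.

```python
def prune_ladder(representations, jnd=6.0):
    """Keep a representation only if its quality exceeds the last kept one by >= jnd.

    representations: list of dicts with hypothetical keys "bitrate_kbps" and "vmaf".
    jnd: assumed just-noticeable difference in VMAF points.
    """
    kept = []
    for rep in sorted(representations, key=lambda r: r["bitrate_kbps"]):
        # The lowest-bitrate rung is always kept as the starting point.
        if not kept or rep["vmaf"] - kept[-1]["vmaf"] >= jnd:
            kept.append(rep)
    return kept


# Made-up ladder with predicted VMAF scores, for illustration only.
ladder = [
    {"bitrate_kbps": 145, "vmaf": 40.0},
    {"bitrate_kbps": 365, "vmaf": 43.0},
    {"bitrate_kbps": 730, "vmaf": 52.0},
    {"bitrate_kbps": 1100, "vmaf": 55.0},
    {"bitrate_kbps": 2000, "vmaf": 70.0},
    {"bitrate_kbps": 4500, "vmaf": 94.0},
    {"bitrate_kbps": 6000, "vmaf": 96.0},
]
print([rep["bitrate_kbps"] for rep in prune_ladder(ladder)])
```

Rungs whose quality gain over the previously kept rung falls below the threshold are skipped, which is where the storage savings in such a scheme would come from.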

Friday, October 28, 2022

OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming

2022 IEEE International Conference on Multimedia and Expo (ICME)
Industry & Application Track

July 18-22, 2022 | Taipei, Taiwan

Conference Website

[PDF][Slides][Video]

Vignesh V Menon, Hadi Amirpour, Christian Feldmann (Bitmovin, Austria), Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used during the entire streaming session in order to avoid the additional latency to find scene transitions and optimized bitrate-resolution pairs for every video content. However, an optimized bitrate ladder per scene may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces an Online Per-Scene Encoding (OPSE) scheme for adaptive HTTP live streaming applications. In this scheme, scene transitions and optimized bitrate-resolution pairs for every scene are predicted using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features. Experimental results show that, on average, OPSE yields bitrate savings of up to 48.88% in certain scenes to maintain the same VMAF, compared to the reference HTTP Live Streaming (HLS) bitrate ladder without any noticeable additional latency in streaming.

The bitrate ladder prediction envisioned using OPSE

Thursday, October 27, 2022

Quality Optimization of Live Streaming Services over HTTP with Reinforcement Learning

IEEE Global Communications Conference 2021
7-11 December 2021 // Madrid, Spain // Hybrid: In-Person and Virtual Conference
Connecting Cultures around the Globe
https://globecom2021.ieee-globecom.org/

[PDF][Slides][Video]

F. Tashtarian*, R. Falanji‡, A. Bentaleb+, A. Erfanian*, P. S. Mashhadi§,
C. Timmerer*, H. Hellwagner*, R. Zimmermann+
Christian Doppler Laboratory ATHENA, Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Austria*
Department of Mathematical Science, Sharif University of Technology, Tehran, Iran‡
Department of Computer Science, School of Computing, National University of Singapore (NUS)+
Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Sweden§

Abstract: Recent years have seen tremendous growth in HTTP adaptive live video traffic over the Internet. In the presence of highly dynamic network conditions and diverse request patterns, existing simple hand-crafted heuristic approaches for serving client requests at the network edge might incur a large overhead and a significant increase in time complexity. Therefore, these approaches might fail to deliver acceptable Quality of Experience (QoE) to end users. To bridge this gap, we propose ROPL, a learning-based client request management solution at the edge that leverages recent breakthroughs in deep reinforcement learning to serve requests of concurrent users joining various HTTP-based live video channels. ROPL is able to react quickly to any changes in the environment, making accurate decisions to serve client requests, which results in satisfactory user QoE. We validate the efficiency of ROPL through trace-driven simulations and a real-world setup. Experimental results from real-world scenarios confirm that ROPL outperforms existing heuristic-based approaches in terms of QoE by a factor of up to 3.7×.

Index Terms—Network Edge; Request Serving; HTTP Live Streaming; Low Latency; QoE; Deep Reinforcement Learning.

Monday, October 24, 2022

Low Latency Live Streaming Implementation in DASH and HLS

ACM Multimedia Conference - OSS Track

Lisbon, Portugal | 10-14 October 2022

[PDF]

Monday, October 3, 2022

Efficient Content-Adaptive Feature-based Shot Detection for HTTP Adaptive Streaming

IEEE International Conference on Image Processing (ICIP)

September 19-22, 2021, Alaska, USA.

https://2021.ieeeicip.org

[PDF][Video][Poster]

Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, and Christian Timmerer
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt

Abstract: Video delivery over the Internet has become a commodity in recent years, owing to the widespread use of DASH. The DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of segments. This paper focuses on segmenting video into multiple shots for encoding in VoD HAS applications. It proposes a novel DCT feature-based shot detection and successive elimination algorithm and benchmarks it against the default shot detection algorithm of the x265 implementation of the HEVC standard. Our experimental results demonstrate that the proposed feature-based pre-processor has a recall rate 25% greater and an F-measure 20% greater than the benchmark algorithm for shot detection.

Keywords: HTTP Adaptive Streaming, Video-on-Demand, Shot detection, multi-shot encoding.
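
For readers unfamiliar with the reported metrics, the following sketch shows how recall and F-measure are conventionally computed for a shot detector, by comparing detected cut positions against ground-truth cut positions. The function name and the frame indices are made up for illustration, and exact frame matching is an assumption; the paper's evaluation protocol may allow a tolerance window.

```python
def detection_metrics(detected, ground_truth):
    """Return (precision, recall, f_measure) for detected shot-cut frame indices.

    A detection counts as correct only if it matches a ground-truth frame
    index exactly (a simplifying assumption).
    """
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical frame indices: 3 of 4 detections are correct, 3 of 5 cuts are found.
precision, recall, f_measure = detection_metrics(
    detected=[30, 90, 150, 200],
    ground_truth=[30, 90, 120, 150, 210],
)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```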
