Adaptive Video Streaming with Layered Neural Codecs for Both Machine and Human Vision
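Project source
Principal investigator
Funded institution
Approval year
Approval date
Project number
Research period
Project level
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions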
1. Leveraging Temporal Down-Sampling Structure and Spatio-Temporal Fusion for Efficient Video Coding.
- Keywords:
- deep learning; low-bitrate; video coding; video enhancement
- He, Keren; Gao, Yufei; Wang, Qi; Wang, Haixin; Zhou, Jinjia
- Sensors
- 2026
- Vol. 26
- Issue 5
- Journal
Down-sampling-based video compression frameworks have shown great potential for improving compression efficiency in modern sensing and imaging systems. However, existing methods ignore critical spatial and temporal redundancy and treat all frames uniformly during down-sampling. This causes a loss of important information and degrades compression efficiency. To address these limitations, this paper proposes a temporal down-sampling system in which only intermediate frames are down-sampled, while key frames are preserved at high quality for reference. On the decoding side, we employ a frame-recurrent enhancement mechanism to make maximal use of temporally redundant information. For the fusion step of the enhancement stage, we design a Multi-scale Temporal-Spatial Attention (MTSA) module. MTSA consists of two components: Multi-Temporal Attention (MTA) and Pyramid Spatial Attention (PSA). MTA performs multi-scale temporal correlation modeling, expanding the receptive field and providing stable cues in compressed regions. PSA integrates local spatial saliency and contextual structure in a progressive, multi-stage manner. Extensive experiments show that our approach achieves consistent BD-rate reductions. Under the All-Intra, Low-Delay-P, and Random Access configurations, we observe BD-rate reductions for I, P, and B frames ranging from 14% to 39% relative to VVC, and the approach outperforms prior approaches anchored on standard HEVC.
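The frame-selective idea in this abstract (keep key frames intact, down-sample only the intermediate frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the key-frame interval `key_interval`, the 2x2 averaging filter, and the helper names `downsample_2x` and `temporal_downsample` are all assumptions made for illustration, and the abstract does not specify the actual down-sampling filter or frame structure.

```python
import numpy as np


def downsample_2x(frame: np.ndarray) -> np.ndarray:
    """Halve height and width by averaging each 2x2 block.

    The averaging filter is an assumption; the paper may use a different one.
    """
    h, w = frame.shape[:2]
    h2, w2 = h - h % 2, w - w % 2  # crop to even dimensions
    cropped = frame[:h2, :w2].astype(np.float32)
    # Split each spatial axis into (blocks, 2) and average over the 2s.
    blocks = cropped.reshape(h2 // 2, 2, w2 // 2, 2, *frame.shape[2:])
    return blocks.mean(axis=(1, 3))


def temporal_downsample(frames, key_interval=4):
    """Tag every key_interval-th frame as a key frame and leave it untouched;
    spatially down-sample all other (intermediate) frames.

    key_interval is a hypothetical parameter, not taken from the paper.
    """
    out = []
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            out.append(("key", frame))
        else:
            out.append(("intermediate", downsample_2x(frame)))
    return out


# Usage: six synthetic 8x8 RGB frames; frames 0 and 4 stay at full resolution.
frames = [np.full((8, 8, 3), float(i)) for i in range(6)]
result = temporal_downsample(frames, key_interval=4)
```

In a full pipeline, a decoder-side enhancement network would restore the intermediate frames using the full-resolution key frames as references; that step is outside the scope of this sketch.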
2. On Demand Secure Scalable Video Streaming for Both Human and Machine Applications
- Keywords:
- Codes (symbols); Cryptography; Data privacy; Efficiency; Image communication systems; Man machine systems; Network security; Security systems; Video streaming; Deep video coding; Encrypted video streaming; Heterogeneous devices; High-efficiency video coding; Machine analysis; On demands; Scalable video streaming; Scalable video-coding; Video coding for machine; Video-streaming
- Zain, Alaa; Fan, Yibo; Zhou, Jinjia
- Sensors
- 2026
- Vol. 26
- Issue 4
- Journal
Scalable video coding plays an essential role in supporting heterogeneous devices, network conditions, and application requirements in modern video streaming systems. However, most existing scalable coding approaches primarily optimize human perceptual quality and provide limited support for data privacy, machine analysis, and the integration of heterogeneous sensor data. This limitation motivated the development of adaptive scalable video coding frameworks. The proposed approach is designed to serve both human viewers and automated analysis systems while ensuring high security and compression efficiency. The method adaptively encrypts selected layers during transmission to protect sensitive content without degrading decoding or analysis performance. Experimental evaluations on benchmark datasets demonstrate that the proposed framework achieves superior rate-distortion efficiency and reconstruction quality, while also improving machine analysis accuracy compared to existing traditional and learning-based codecs. In video surveillance scenarios, where the base layer is preserved for analysis, the proposed scalable human-machine coding (SHMC) method outperforms the scalable extension of H.265/High Efficiency Video Coding (HEVC), Scalable High Efficiency Video Coding (SHVC), reducing the average bits per pixel (bpp) by 26.38%, 30.76%, and 60.29% at equivalent mean Average Precision (mAP), Peak Signal-to-Noise Ratio (PSNR), and Multi-Scale Structural Similarity (MS-SSIM) levels, respectively. These results confirm the effectiveness of integrating scalable video coding with intelligent encryption for secure and efficient video transmission. © 2026 by the authors.
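The notion of encrypting only selected layers of a scalable bitstream can be sketched as follows. This is an illustration under stated assumptions, not the SHMC method: the layer names, the `protect_layers` helper, and the layer-selection rule are hypothetical, and the SHA-256 counter keystream is a toy stand-in for a vetted cipher such as AES in an authenticated mode. It must not be used to protect real content.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter.

    Illustrative stand-in only; a real system would use a vetted cipher.
    """
    n_blocks = -(-length // 32)  # ceiling division; SHA-256 yields 32 bytes
    blocks = [
        hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        for counter in range(n_blocks)
    ]
    return b"".join(blocks)[:length]


def protect_layers(layers: dict, encrypt_names: set, key: bytes) -> dict:
    """XOR the payload of each selected layer with a keystream and pass the
    remaining layers through unchanged.

    XOR is its own inverse, so calling this again with the same key and
    layer selection restores the original payloads. Deriving the nonce from
    the layer name is a simplification; it must be unique per stream in
    practice.
    """
    out = {}
    for name, payload in layers.items():
        if name in encrypt_names:
            stream = keystream(key, name.encode("utf-8"), len(payload))
            out[name] = bytes(a ^ b for a, b in zip(payload, stream))
        else:
            out[name] = payload
    return out


# Usage: leave the base layer readable for analysis, encrypt the enhancement layer.
layers = {"base": b"base-layer-bits", "enh1": b"enhancement-layer-bits"}
protected = protect_layers(layers, {"enh1"}, key=b"demo-key")
restored = protect_layers(protected, {"enh1"}, key=b"demo-key")
```

Leaving the base layer in the clear mirrors the surveillance scenario described in the abstract, where the base layer is preserved for analysis; which layers to encrypt in other scenarios is a policy decision the abstract does not detail.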
