Abstract
Existing learning-based video compression methods still face challenges
related to inaccurate motion estimates and inadequate motion compensation
structures. These issues result in compression errors and a suboptimal
rate-distortion trade-off. To address these challenges, this work presents an
end-to-end video compression method that incorporates several key operations.
Specifically, we propose an autoencoder-type network with a residual skip
connection to efficiently compress motion information. Additionally, we design
motion vector and residual frame filtering networks to mitigate compression
errors in the video compression system. To improve the effectiveness of the
motion compensation network, we deepen its architecture and use powerful
nonlinear transforms, such as the Parametric Rectified Linear Unit (PReLU).
Furthermore, a buffer is introduced to fine-tune the
previous reference frames, thereby enhancing the reconstructed frame quality.
These modules are combined with a carefully designed loss function that
accounts for the rate-distortion trade-off and improves the quality of the
decoded video. Experimental results show that our method performs competitively
on various datasets, including HEVC (sequences B, C, and D), UVG, VTL, and
MCL-JCV. These results indicate that addressing accurate motion estimation and
motion compensation yields performance competitive with existing methods.
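
As a minimal illustration of two ingredients named above, the sketch below
writes out the PReLU nonlinearity and a generic rate-distortion objective of
the form R + lambda * D. The abstract does not state the exact loss used in
this work, so the function names, the default slope of 0.25, and the choice of
mean squared error as the distortion term are assumptions made only for
illustration.

```python
# Illustrative sketch only: names, defaults, and the MSE distortion are
# assumptions, not the formulation used in the paper.

def prelu(x, slope=0.25):
    """PReLU: identity for non-negative inputs, a (normally learned) slope
    applied to negative inputs. Here the slope is a fixed argument."""
    return x if x >= 0 else slope * x


def mse(original, reconstructed):
    """Mean squared error between two equal-length sequences of samples."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def rate_distortion_loss(rate_bpp, original, reconstructed, lam):
    """Generic rate-distortion objective: rate + lam * distortion.

    rate_bpp is the estimated bit rate (e.g. bits per pixel) and lam sets
    how strongly distortion is penalized relative to rate.
    """
    return rate_bpp + lam * mse(original, reconstructed)
```

A larger `lam` favors reconstruction quality at the cost of more bits, while a
smaller `lam` favors a lower bit rate.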