Consistent Video Inpainting Using Axial Attention-Based Style Transformer
Journal article   Peer reviewed

Masum Shah Junayed and Md Baharul Islam
IEEE transactions on multimedia, Vol.25, pp.7494-7504
01-01-2023

Abstract

Subjects: Computer Science, Information Systems; Computer Science, Software Engineering; Science & Technology; Computer Science; Technology; Telecommunications
Maintaining spatial and temporal consistency in the inpainted region of a video is a challenging problem. Recent research focuses on flow information for synthesizing temporally smooth pixels while neglecting semantic structural coherence across video frames. As a result, such methods suffer from over-smoothing and shadowy outlines that significantly degrade the quality of the inpainted video. To overcome this problem, we propose an end-to-end consistent video inpainting model that substantially improves the inpainted video region. The model employs a deep encoder (DE), an axial attention block (AAB), a style transformer, and a decoder to enhance video inpainting with realistic structure. The DE encodes features effectively, while the AAB recreates all retrieved attributes by merging recoverable multi-scale characteristics with local spatial structures. Then, a novel style transformer with a style manipulation block (SMB) fills the missing area with rich visual elements and temporal coherence. We use two publicly available benchmark datasets to assess the model's performance. Experimental results demonstrate that our method outperforms state-of-the-art methods by a large margin. In addition, an extensive ablation study validates the model's performance.
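
The abstract does not spell out how the axial attention block is built, so the following is only a minimal sketch of the general axial attention idea, not the authors' AAB. It assumes a single H x W x C feature map stored as nested lists and uses the input vectors directly as queries, keys, and values (no learned projections, no multi-scale merging). The names softmax, self_attention_1d, and axial_attention are hypothetical. Attention runs along each row and then along each column, so every position attends to H + W positions rather than all H * W.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_1d(seq):
    # Scaled dot-product self-attention over a list of feature vectors.
    # Assumption: queries, keys and values are the inputs themselves.
    dim = len(seq[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in seq:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq))
                    for c in range(dim)])
    return out


def axial_attention(fmap):
    # fmap is an H x W x C nested list.
    # Step 1: attend along the width axis, one row at a time.
    h, w = len(fmap), len(fmap[0])
    rows = [self_attention_1d(row) for row in fmap]
    # Step 2: attend along the height axis, one column at a time.
    out = [[None] * w for _ in range(h)]
    for x in range(w):
        column = self_attention_1d([rows[y][x] for y in range(h)])
        for y in range(h):
            out[y][x] = column[y]
    return out
```

Calling axial_attention on a 2 x 3 x 2 feature map returns a map of the same shape. Because each output vector is a weighted average of input vectors, a constant input map is returned unchanged.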