A Hybrid Approach for retinal image super-resolution
Journal article   Open access   Peer reviewed


Alnur Alimanov, Md Baharul Islam and Nirase Fathima Abubacker
Biomedical Engineering Advances, Vol. 6, p. 100099, November 2023

Abstract

Keywords: Adaptive patch embedding layer; Convolutional neural network; Locality self-attention; Retinal images; Single image super-resolution; Vision transformer
Highlights

• Propose a hybrid deep learning-based approach for retinal image super-resolution.
• Design an adaptive patch embedding layer to keep the model architecture stable.
• Combine CNN and ViT to improve performance over each standalone module.
• Utilize a structural loss that surpasses the performance of the adversarial loss.
• Conduct an extensive ablation study to validate the proposed method's performance.

Experts require large, high-resolution retinal images to detect tiny abnormalities, such as microaneurysms or defects in vascular branches. However, these images often suffer from low quality (e.g., low resolution) due to poor imaging-device configuration and operator error. Many prior works used Convolutional Neural Network-based (CNN) methods for image super-resolution, making the models increasingly complex by adding layers and various blocks; this incurs additional computational expense and hinders application in real-life scenarios. This paper therefore proposes a novel, lightweight, deep-learning super-resolution method for retinal images, comprising a Vision Transformer (ViT) encoder and a convolutional neural network decoder. To the best of our knowledge, this is the first attempt to use a transformer-based network for accurate retinal image super-resolution. A progressively growing super-resolution training technique is applied to increase the resolution of images by factors of 2, 4, and 8. The core architecture remains constant thanks to the adaptive patch embedding layer, so increased up-scaling factors incur no additional computational expense. This patch embedding layer uses a 2-dimensional convolution whose kernel size and stride depend on the input shape, removing the need to append additional super-resolution blocks to the model. The proposed method has been evaluated through quantitative and qualitative measures.
The qualitative analysis also includes vessel segmentation of super-resolved and ground truth images. Experimental results indicate that the proposed method outperforms the current state-of-the-art methods.
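The adaptive patch embedding idea described in the abstract can be sketched as follows. This is a minimal illustration in PyTorch, not the authors' code: it assumes the layer targets a fixed token grid and derives the convolution's kernel size and stride from the input shape (kernel == stride == input_size // grid_size), so the number of tokens fed to the ViT encoder stays constant across the 2x, 4x, and 8x training stages. All names (`AdaptivePatchEmbedding`, `grid_size`, `embed_dim`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AdaptivePatchEmbedding(nn.Module):
    """Illustrative sketch: map any square input to a fixed number of patch
    tokens by deriving the conv kernel size and stride from the input shape.
    Assumption: patch size = input_size // grid_size, kernel == stride."""

    def __init__(self, in_channels=3, embed_dim=256, grid_size=8):
        super().__init__()
        self.embed_dim = embed_dim
        self.grid_size = grid_size
        # One projection per patch size actually seen, created lazily,
        # since each up-scaling stage implies a different patch size.
        self.projs = nn.ModuleDict()

    def forward(self, x):
        b, c, h, w = x.shape
        patch = h // self.grid_size  # kernel/stride depend on input shape
        key = str(patch)
        if key not in self.projs:
            self.projs[key] = nn.Conv2d(
                c, self.embed_dim, kernel_size=patch, stride=patch
            )
        tokens = self.projs[key](x)               # (B, D, grid, grid)
        return tokens.flatten(2).transpose(1, 2)  # (B, grid*grid, D)


emb = AdaptivePatchEmbedding()
# Inputs of different sizes (e.g., stages for x2, x4, x8) all yield the
# same token-sequence shape, so the ViT encoder never has to change.
for size in (64, 128, 256):
    out = emb(torch.randn(1, 3, size, size))
    print(out.shape)
```

Because the token count is fixed at `grid_size * grid_size`, the transformer's attention cost does not grow with the up-scaling factor, which is consistent with the abstract's claim that larger factors add no computational expense to the main architecture.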
