BiSign-Net: Fine-grained Static Sign Language Recognition based on Bilinear CNN
Conference proceeding

Arezoo Sadeghzadeh and Md Baharul Islam
2022 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp.1-4
2022 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) (Penang, Malaysia, 2022–2023)
11-22-2022

Abstract

Keywords: Assistive technologies; bilinear CNN; Communication systems; Convolutional neural networks; Feature extraction; fine-grained classification; Gesture recognition; normalization; outer product; Robustness; sign language recognition; Signal Processing

Sign language (SL) is a form of communication used by deaf and hard-of-hearing people. The wide variety of SLs and the general public's lack of knowledge for interpreting them make automatic sign language recognition (SLR) systems necessary for breaking down communication barriers. Although numerous existing approaches achieve satisfactory performance, they still struggle with large intra-class and slight inter-class variations, which makes them infeasible for real-world applications. To address this issue, a novel end-to-end fine-grained static SLR (SSLR) system, namely BiSign-Net, is proposed. It is based on a Bilinear Convolutional Neural Network (Bi-CNN) and efficiently models the variations in both the location and the appearance of the hands in the images, enhancing accuracy, speed, and robustness against translation. To this end, fine-grained orderless bilinear features are generated by the pooled outer product of the features extracted by two identical novel CNN-based feature extractors. The bilinear features then pass through a normalization module consisting of the signed square root and ℓ2 normalization, which further improves the accuracy of the model. A dropout layer is deployed in the classification module to help the model deal with small-scale datasets by preventing overfitting. The number of layers, the hyper-parameters, and the optimization technique of the proposed CNN are adjusted to achieve high performance and faster convergence with a small number of parameters. Experimental results on four datasets (Static ASL, NUS I, Massey, and ArASL) from two SLs (American and Arabic), with accuracies of 100%, 100%, 99.20%, and 99.35%, respectively, demonstrate that the proposed model surpasses the existing approaches with high robustness and generalization ability.
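
The bilinear feature step described in the abstract (pooled outer product, then signed square root and ℓ2 normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of sum pooling over spatial locations, the flat list-of-lists feature layout, and the small `eps` constant are all assumptions made for illustration, and plain Python lists stand in for real CNN feature maps.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Sum-pool the outer product of two feature maps over spatial locations.

    feats_a: list of L locations, each a list of C_a channel values.
    feats_b: list of L locations, each a list of C_b channel values.
    Returns a flat list of length C_a * C_b.
    Sum pooling is an assumption; the abstract only says "pooled".
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("both feature maps must cover the same, non-empty set of locations")
    c_a, c_b = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (c_a * c_b)
    for fa, fb in zip(feats_a, feats_b):
        # Outer product at one location, accumulated into the pooled vector.
        for i, a in enumerate(fa):
            for j, b in enumerate(fb):
                pooled[i * c_b + j] += a * b
    return pooled


def signed_sqrt(vec):
    """Element-wise sign(v) * sqrt(|v|)."""
    return [math.copysign(math.sqrt(abs(v)), v) for v in vec]


def l2_normalize(vec, eps=1e-12):
    """Scale the vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def bilinear_descriptor(feats_a, feats_b):
    """Pooled outer product followed by signed square root and l2 normalization."""
    return l2_normalize(signed_sqrt(bilinear_pool(feats_a, feats_b)))


# Toy example: 2 spatial locations, 2 channels, same features for both streams.
feats = [[1.0, 2.0], [3.0, -1.0]]
descriptor = bilinear_descriptor(feats, feats)
```

Because the outer products are summed over locations, reordering the locations leaves the pooled vector unchanged, which is the sense in which the abstract calls the features orderless.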

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#10 Reduced Inequalities

Source: SDGs in the Output