Triplet Loss-based Convolutional Neural Network for Static Sign Language Recognition
Conference proceeding

Arezoo Sadeghzadeh and Md Baharul Islam
2022 Innovations in Intelligent Systems and Applications Conference (ASYU), pp.1-6
IEEE Conference on the Innovations in Intelligent Systems and Applications (Antalya, Turkey, 2022)
09-07-2022

Abstract

Keywords: CNN; Degradation; feature embedding; Gesture recognition; Image resolution; semi-hard triplet loss; static sign language recognition; Support vector machines; SVM; Technological innovation; Training; Visualization
Sign language (SL) is a non-verbal visual language used as a primary communication tool by the deaf and hearing-impaired community. Because a large number of SLs exist, with wide variety among them, it is not feasible for the general public to master interpreting them. Despite recent advances in automatic sign language recognition (SLR) systems, their performance degrades severely on low-resolution images with large intra-class and slight inter-class variations. To address these issues, a novel end-to-end Convolutional Neural Network (CNN) is proposed to extract features from low-resolution input images. This feature extractor is trained with the semi-hard triplet loss function so that images belonging to the same class are placed close to one another in a lower-dimensional embedding space, while the distance between samples from different classes is maximized. In addition to the efficient loss function, proper selection of the filter and kernel sizes, activation functions, and regularization methods in the proposed CNN yields effective feature vectors from small-sized images while reducing the number of parameters. The embedded features, which have a small fixed vector length, are used to train a Support Vector Machine (SVM) classifier for final recognition. Experimental results on two datasets from two SLs, American (MNIST) and Arabic (ArSL2018), with accuracies of 100% and 97.54%, respectively, demonstrate that the proposed model outperforms existing approaches without any need to enlarge the dataset through augmentation, which shows its feasibility.
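
The semi-hard triplet objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the margin, the distance metric, or the mining procedure, so the Euclidean distance, the margin of 1.0, the choice of the closest semi-hard negative, and the function names below are all assumptions made for illustration. For an anchor a and a positive p of the same class, a negative n is treated as semi-hard when d(a, p) < d(a, n) < d(a, p) + margin, and each such triplet contributes d(a, p) - d(a, n) + margin to the loss.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def semi_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean triplet loss over the semi-hard triplets found in one batch.

    Illustrative assumption: for each (anchor, positive) pair, the closest
    semi-hard negative is used, and pairs without one are skipped.
    """
    losses = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # p must be a different sample of the same class
            d_ap = euclidean(embeddings[a], embeddings[p])
            # Distances from the anchor to every sample of another class.
            negatives = [
                euclidean(embeddings[a], embeddings[k])
                for k in range(n)
                if labels[k] != labels[a]
            ]
            # Semi-hard: farther than the positive, but still inside the margin.
            semi_hard = [d for d in negatives if d_ap < d < d_ap + margin]
            if not semi_hard:
                continue
            d_an = min(semi_hard)
            losses.append(d_ap - d_an + margin)
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two classes laid out on a line (hypothetical data).
batch = [[0.0, 0.0], [1.0, 0.0], [1.5, 0.0], [5.0, 0.0]]
classes = [0, 0, 1, 1]
loss = semi_hard_triplet_loss(batch, classes, margin=1.0)
```

In a pipeline like the one the abstract describes, a loss of this kind would be minimized over the CNN's output embeddings, and the resulting fixed-length vectors would then be passed to a separate SVM classifier.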

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#10 Reduced Inequalities

Source: SDGs in the Output
