Abstract
We develop a real-time sign language recognition and translation (SLRT) system to address the communication barriers faced by the Deaf and Hard-of-Hearing (DHH) community in the workplace. The system combines deep learning models, including CNNs and RNNs, with tools such as MediaPipe, gTTS, MarianMT, and FastText to improve translation efficiency and SLRT accuracy. Our dataset comprises two ASL image datasets with diverse backgrounds, ensuring that the resulting model operates effectively in a variety of environments. The ResNet-LSTM model implemented in our SLRT system achieves an accuracy of 99.95%, demonstrating its robustness for SLRT. Beyond quantitative assessment of the deep learning model, we also evaluate it with explainable AI (XAI) techniques, including LIME, t-SNE, and saliency maps. The proposed SLRT system handles real-time translation with minimal computational resources, giving it high practicality and scalability. With the integration of NLP features and a text-to-speech function, our SLRT makes daily interaction efficient and easy to use.