Scholarship list
Book chapter
Assistive Visual Tool: Enhancing Safe Navigation with Video Remapping in AR Headsets
Published 05-12-2025
Computer Vision – ECCV 2024 Workshops, 15634, 356 - 371
Visual Field Loss (VFL) is characterized by blind spots, or scotomas, that have a detrimental impact on individuals' fundamental movement activities. To address the challenges faced by existing Extended Reality (XR) systems as vision aids (e.g., low video quality, content loss, high levels of contradiction, and limited mobility assessment), we introduce a method that enriches real-time navigation using Augmented Reality (AR) glasses. Our novel vision aid employs advanced video processing techniques to enhance visual perception in individuals with moderate to severe VFL, bridging the gap to healthy vision. A unique optimal video remapping function, tailored to the characteristics of our selected AR glasses, dynamically maps live video content to the largest intact region of the Visual Field (VF) map. Our method preserves video quality, minimizing blurriness and distortion. In a comprehensive empirical user study involving 29 subjects with artificially induced scotomas, statistical analyses of object counting and multi-tasking walking track tests demonstrate the promising performance of our method in enhancing visual awareness and navigation capability in real time.
Book chapter
Forecasting Wearing-Off in Parkinson's Disease: An Ensemble Learning Approach Using Wearable Data
Published 2025
Activity, Behavior, and Healthcare Computing, 324 - 334
Parkinson's disease (PD) is a neurodegenerative disorder that affects both non-motor and motor functions as the disease progresses. Patients with PD experience a phenomenon known as "wearing-off," where symptoms re-emerge before the next scheduled medicine intake, leading to discomfort. Consequently, it is crucial for PD patients and clinicians to closely monitor and document changes in symptoms to ensure appropriate treatment. In this study, we propose an ensemble learning approach that uses wearable data to predict wearing-off in PD patients, so that medical practitioners can devise tailored treatment strategies to effectively manage Parkinson's disease and its associated symptoms. Our experiments involved a combination of ensemble machine learning models (Random Forest, Support Vector Machine, and XGBClassifier) along with two deep learning-based models (convolutional neural network and long short-term memory), achieving an accuracy of approximately 93.2%. Code is available at https://github.com/mdhosen/Parkinson-detection.
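As a rough illustration of how label predictions from several classifiers can be combined, the sketch below applies simple majority voting. The abstract does not state how the five models were combined, so the voting rule, the function name `majority_vote`, and the toy predictions are assumptions made for illustration and not the chapter's actual method.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote.

    predictions: list of equal-length lists, one list of labels per model.
    Ties go to the label seen first among the models (hypothetical tie rule).
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the (label, count) pair with the highest count
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of five models for four time windows (1 = wearing-off)
model_outputs = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
]
ensemble = majority_vote(model_outputs)
```

Here `ensemble` holds one combined label per time window; probability averaging (soft voting) would be an equally plausible alternative.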
Book chapter
Published 2025
Activity, Behavior, and Healthcare Computing, 208 - 219
The rise in global temperatures has become a significant concern, leading to an increase in heatstroke incidents, which pose severe health consequences, including mortality. The comfort levels of indoor environments fluctuate depending on various activities performed in different situations. Physiological data, encompassing heart rate, body temperature, and blood pressure, provide valuable insights into the identification of patterns and trends that may signify an elevated risk of heatstroke. However, manual analysis of such data proves impractical due to its complexity and volume. In this chapter, we present an energy-efficient machine learning-based approach to forecast individual thermal comfort sensations, enabling the early identification of individuals at risk of heatstroke before symptom manifestation. We conducted experiments using four distinct machine learning models along with one deep learning-based model, achieving an accuracy of approximately 99% on the test set.
Book chapter
Static Sign Language Recognition Using Segmented Images and HOG on Cluttered Backgrounds
Published 2024
Human Activity and Behavior Analysis, 23 - 45
Sign language (SL) is of great importance to the deaf and hearing-impaired community as its primary means of communication. Large variations among the SLs used around the world create an inevitable need for automatic SL interpretation systems to reduce the communication barrier between the deaf and the general public. Despite the existence of numerous innovative studies in this domain, providing an efficient, highly accurate system for real-world applications is still challenging, especially in the presence of complex backgrounds, low inter-class and large intra-class variations, and changes in illumination conditions. To address these issues, a novel Convolutional Neural Network (CNN)-based static sign language recognition (SLR) system is proposed that gains the maximum benefit from segmented hand images and Histogram of Oriented Gradients (HOG) handcrafted features. To this end, a U-Net architecture is trained on a small-scale annotated SL dataset for hand segmentation and is then successfully applied to the other non-annotated datasets to mitigate the detrimental effects of complex backgrounds. The robustness of the system against environmental and user-dependent variations is further improved by taking advantage of HOG handcrafted features extracted from the segmented images in the form of 2D images. These generated images are fed into our proposed CNN model, whose number of layers and filters, kernel sizes, activation functions, optimization method, learning rate, and regularization techniques are selected so that accuracy is maximized. Extensive experiments conducted on three different American Sign Language (ASL) datasets with variations in background and lighting, i.e., MU HandImages ASL (Massey), NUSII, and Static Hand Gesture ASL, with accuracies of 99.71%, 99.50%, and 100%, respectively, demonstrate the robustness, superiority, and high capabilities of our proposed system over the existing approaches.
Book chapter
T-SignSys: An Efficient CNN-Based Turkish Sign Language Recognition System
Published 12-23-2023
Advanced Engineering, Technology and Applications, 226 - 241
Sign language (SL) is a communication tool playing a crucial role in facilitating the daily life of deaf or hearing-impaired people. The large variety of existing SLs and the lack of interpretation knowledge in the general public lead to a communication barrier between the deaf and hearing communities. This issue has been addressed by automated sign language recognition (SLR) systems, mostly proposed for American Sign Language (ASL), with a limited number of research studies on the other SLs. Consequently, this paper focuses on static Turkish Sign Language (TSL) recognition for its alphabets and digits by proposing an efficient novel Convolutional Neural Network (CNN) model. Our proposed CNN model comprises 9 layers, of which 6 layers are employed for feature extraction, and the remaining 3 layers are adopted for classification. The model is prevented from overfitting while dealing with small-scale datasets by benefiting from two regularization techniques: 1) ignoring a specified portion of neurons during training by applying a dropout layer, and 2) applying penalties during loss function optimization by employing an L2 kernel regularizer in the convolution layers. The arrangement of the layers, learning rate, optimization technique, model hyper-parameters, and dropout layers are carefully adjusted so that the proposed CNN model can recognize both TSL alphabets and digits fast and accurately. The feasibility of our proposed T-SignSys is investigated through a comprehensive ablation study. Our model is evaluated on two datasets of TSL alphabets and digits with an accuracy of 97.85% and 99.52%, respectively, demonstrating its competitive performance despite straightforward implementation.
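The two regularization techniques named in this abstract can be sketched in a few lines. This is a minimal, framework-free illustration under stated assumptions: the function names, the penalty coefficient `lam`, the dropout rate, and all numeric values are hypothetical, since the abstract does not report them, and the code is not the T-SignSys implementation.

```python
import random

def l2_penalty(weights, lam):
    """L2 (weight-decay) term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate, rng):
    """Inverted dropout for training: zero each unit with probability `rate`
    and rescale the survivors so the expected activation is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * keep_scale for a in activations]

# Hypothetical values for illustration only
kernel_weights = [1.0, -2.0]
data_loss = 0.30
total_loss = data_loss + l2_penalty(kernel_weights, lam=0.01)

rng = random.Random(0)  # fixed seed so the dropout mask is reproducible
dropped = dropout([1.0] * 8, rate=0.5, rng=rng)
```

The penalty discourages large kernel weights, while dropout randomly silences units during training only; at inference time the activations would be passed through unchanged.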
Book chapter
Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module
Published 01-01-2022
DIGITIZING PRODUCTION SYSTEMS, ISPR2021, 39 - 50
Deep learning techniques are used for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown great performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but the disparity information, which is very important, has not been considered well. Most recently proposed deep learning-based methods use cost volume methods to produce the stereo correspondence for large disparities. Because the disparities can differ considerably for stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture the stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model using a base 3D CNN-based network and a modified PAM-based left and right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while considering the disparity information, and, using some fully connected layers, we calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove some cubes that confuse our model from the training dataset. Two standard stereoscopic video quality assessment benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that our proposed model is very competitive with the state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset, and it is the state-of-the-art method on the LFOVIAS3DPh2 dataset.
Book chapter
Published 10-26-2018
Advances in Computing and Data Sciences, 269 - 278
This paper addresses employee attrition, predicting the reasons behind employee turnover with the help of a machine learning approach. As employee turnover has become a vital issue these days due to heavy work pressure, low salary, low work satisfaction, and poor working environments, it is high time to find a better solution to this problem. Therefore, we propose a prediction model based on a machine learning approach in which each feature's Random Forest importance weight is used while merging threshold-selected correlated features into single combined variables. We also scale specific features to obtain the correlation matrix of the features by defining a threshold. This newly developed technique achieved good results for some algorithms compared to Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) on the same dataset.
Book chapter
Semantics-Preserving Warping for Stereoscopic Image Retargeting
Published 01-01-2016
IMAGE AND VIDEO TECHNOLOGY, PSIVT 2015, 9431, 257 - 268
Due to the availability and popularity of stereoscopic displays in recent years, research into stereo image retargeting is receiving considerable attention. In this paper, we extend the tearable image warping method to stereo image retargeting. Our method retargets both the left and right images of the stereo image pair simultaneously to preserve scene consistency, and minimizes distortion using a global optimization algorithm. It is also able to preserve the stereoscopic properties of the resulting stereo image. Experimental results show that our approach can preserve the global image context better than stereoscopic cropping, preserve structural details better than stereoscopic seam carving, and protect objects better than traditional stereoscopic warping. In addition, compared to scene warping, our approach can guarantee semantic connectedness.