Enhancement of Recurrent Neural Networks (RNN) applied in hand gesture recognition for American Sign Language (ASL) alphabet recognition
Jenny R. Jimenez & Augustin Brain C. Sabordio
Abstract
This study enhanced a Recurrent Neural Network (RNN) for American Sign Language (ASL) alphabet recognition by improving hand segmentation, feature extraction, and temporal modeling, with the goal of higher accuracy and robustness in real-time recognition. An experimental research design was employed using 260 video samples representing all 26 ASL letters under both ideal and challenging environmental conditions. The enhanced model integrates MediaPipe-based hand detection with adaptive preprocessing, multimodal feature extraction combining 3D landmarks and engineered articulation features (99-dimensional vectors), and adaptive temporal modeling using extended sequence buffering and prediction smoothing. A dual-stream neural architecture processes the visual data and the numerical landmark data, then classifies them through LSTM layers and a softmax output. The improved system achieved an overall accuracy of 97.70% and a mean confidence of 91.45%, which is 38.85 percentage points higher than the baseline model. Accuracy in challenging conditions also improved substantially, with a degradation rate of only 1.60% compared to 8.50% in the baseline. Recognition of visually similar letters reached 98% accuracy, while the dynamic letters J and Z achieved relative improvements of 160% and 126%, respectively. The study is limited to recognition of the ASL alphabet (A–Z) under controlled experimental conditions. Future research may extend the system to full-word recognition and real-world deployment scenarios.
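The abstract mentions sequence buffering and prediction smoothing without giving details. A minimal sketch of one common way to do this follows, assuming a fixed-length buffer of per-frame class probabilities that are averaged before a label is emitted. The class name `PredictionSmoother`, the window size, and the confidence threshold are all hypothetical and are not taken from the study.

```python
from collections import deque
from typing import Deque, List, Optional, Sequence


class PredictionSmoother:
    """Average the most recent per-frame probability vectors (hypothetical helper)."""

    def __init__(self, window: int = 5, min_confidence: float = 0.6) -> None:
        # Both defaults are illustrative assumptions, not values from the study.
        self.buffer: Deque[List[float]] = deque(maxlen=window)
        self.min_confidence = min_confidence

    def update(self, probs: Sequence[float]) -> Optional[int]:
        """Add one frame's class probabilities; return a class index or None."""
        self.buffer.append(list(probs))
        n_classes = len(probs)
        # Element-wise mean over the frames currently held in the buffer.
        mean = [
            sum(frame[c] for frame in self.buffer) / len(self.buffer)
            for c in range(n_classes)
        ]
        best = max(range(n_classes), key=lambda c: mean[c])
        # Withhold a decision while the averaged confidence is too low.
        return best if mean[best] >= self.min_confidence else None


# Two-class toy input: the middle frame briefly flips to class 1.
smoother = PredictionSmoother(window=3, min_confidence=0.6)
frames = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]]
print([smoother.update(p) for p in frames])  # [0, None, 0]
```

In this toy run the single-frame flip to class 1 is never emitted: the averaged confidence drops below the threshold, so the smoother returns `None` for that frame instead of switching labels. A real system would use 26 classes and the softmax output of the classifier as `probs`.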
Keywords
3D landmark feature extraction, temporal modeling, human–computer interaction, MediaPipe
Author information & Contribution
Jenny R. Jimenez. Corresponding author. Bachelor of Science in Computer Science student, Pamantasan ng Lungsod ng Maynila. Email: jimenezjenny599@gmail.com
Augustin Brain C. Sabordio. Bachelor of Science in Computer Science student, Pamantasan ng Lungsod ng Maynila. Email: austinbrain25@gmail.com
All authors contributed equally to the conception, design, preparation, analysis, and writing of the manuscript. All authors read and approved the final manuscript.
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
This work was not supported by any funding.
AI Declaration
The authors declare the use of Artificial Intelligence (AI) tools in the preparation of this paper. Specifically, the authors used Grammarly for grammar and spell checking, Scribbr for citation and reference formatting verification, and QuillBot for paraphrasing and language refinement. The authors take full responsibility for the review and editing of all content generated or refined using these AI tools.
Notes
This paper was presented at the 3rd International Student Research Congress (ISRC).
Acknowledgement
The researcher would like to express sincere gratitude to the individuals who contributed their time, guidance, and expertise to the successful completion of this thesis.
Foremost appreciation is extended to Prof. Richard C. Regala, Thesis Adviser, for his invaluable guidance, constant encouragement, and insightful recommendations throughout the development of this study. His expertise and dedication greatly helped in shaping the direction and quality of this research.
Grateful acknowledgment is given to the Panel of Examiners, Prof. Mark Anthony S. Mercado and Prof. Marilou B. Mangrobang, for their time, effort, and meaningful suggestions during the oral examination. Their comments and recommendations significantly improved the clarity and substance of this study.
The researcher would also like to acknowledge Prof. Raymund M. Dioses, Chairperson of the Computer Science Department, for his support and leadership, as well as Dr. Khatalyn E. Mata, Dean of the College of Information Systems and Technology Management and Thesis Coordinator, for her guidance, encouragement, and constructive feedback, which ensured that this work aligned with academic standards, and for her commitment to academic excellence.
Finally, heartfelt appreciation is given to the researcher’s family, friends, and everyone who offered support, motivation, and encouragement throughout the completion of this academic endeavor.
Cite this article:
Jimenez, J.R. & Sabordio, A.B.C. (2026). Enhancement of Recurrent Neural Networks (RNN) applied in hand gesture recognition for American Sign Language (ASL) alphabet recognition. International Student Research Review, 3(1), 22-40. https://doi.org/10.53378/isrr.212
License:
This work is licensed under a Creative Commons Attribution (CC BY 4.0) International License.