Machine vs. deep learning comparison for developing an international sign language translator


Eryilmaz M., Balkaya E., Uçan E., Turan G., Oral S. G.

Journal of Experimental and Theoretical Artificial Intelligence, vol.36, no.6, pp.975-984, 2024 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 36 Issue: 6
  • Publication Date: 2024
  • DOI Number: 10.1080/0952813x.2022.2115560
  • Journal Name: Journal of Experimental and Theoretical Artificial Intelligence
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Compendex, Computer & Applied Sciences, INSPEC, Psycinfo, zbMATH
  • Page Numbers: pp.975-984
  • Keywords: Sign language, hearing loss, video classification, machine learning, deep learning
  • TED University Affiliated: No

Abstract

© 2022 Informa UK Limited, trading as Taylor & Francis Group. This study aims to enable deaf and hard-of-hearing people to communicate with other individuals, whether or not those individuals know sign language. A mobile application for video classification was developed in the study using the MediaPipe library. Considering the problems that deaf and hard-of-hearing individuals face in Turkey and abroad, the modelling and training stages were carried out with the English language option. The real-time translation feature added to the study provides individuals with instant communication; in this way, the communication problems experienced by hearing-impaired individuals will be greatly reduced. Machine learning and deep learning concepts were investigated in the study. Model creation and training stages were carried out using the VGG16, OpenCV, Pandas, Keras, and os libraries. Due to the low success rate of the model created using VGG16, the MediaPipe library was used in the creation and training stages of the model. The reason is that, thanks to the solutions available in the MediaPipe library, it can normalise the coordinates in 3D by marking the regions to be detected on the human body. Being able to extract the coordinates independently of the background and body type in the videos of the dataset increases the success rate of the model during the creation and training stages. As a result of the experiments, the accuracy rate of the deep learning model is 85%, and the application can easily be integrated with different languages. It is concluded that the deep learning model is more accurate than the machine learning one, and that the communication problem faced by hearing-impaired individuals in many countries can be readily reduced.
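The normalisation idea the abstract attributes to MediaPipe can be sketched as follows. This is a minimal, hypothetical illustration (the function name and reference-point choice are assumptions, not the authors' code): given (x, y, z) landmark coordinates such as those produced by MediaPipe's pose or hand solutions, translating them to a chosen reference landmark and rescaling by overall extent makes them independent of the subject's position in the frame and of body size.

```python
# Hypothetical sketch of 3D landmark normalisation, not the paper's actual code.
# landmarks: list of (x, y, z) tuples, e.g. as extracted by a MediaPipe solution.

def normalise_landmarks(landmarks, ref_idx=0):
    """Translate landmarks so that landmarks[ref_idx] becomes the origin,
    then scale by the maximum distance from that origin.

    The result is invariant to where the body appears in the frame
    (translation) and to body size (uniform scale)."""
    rx, ry, rz = landmarks[ref_idx]
    # Centre every point on the reference landmark.
    centred = [(x - rx, y - ry, z - rz) for (x, y, z) in landmarks]
    # Scale by the largest distance from the reference point (avoid /0).
    scale = max((x * x + y * y + z * z) ** 0.5 for (x, y, z) in centred) or 1.0
    return [(x / scale, y / scale, z / scale) for (x, y, z) in centred]
```

Two copies of the same gesture that differ only by a shift and a uniform scale map to identical normalised coordinates, which is the property that lets a classifier ignore background position and body type.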