[Journal article][Full-length article]


Development of a hybrid word recognition system and dataset for the Azerbaijani Sign Language dactyl alphabet

Authors:
Jamaladdin Hasanov; Nigar Alishzade; Aykhan Nazimzade; Samir Dadashzade; Toghrul Tahirov

Year of publication: 2023

Pages: 102960 - 102960
Publisher: Elsevier BV


Abstract:

The paper introduces a real-time fingerspelling-to-text translation system for Azerbaijani Sign Language (AzSL), intended to clarify words whose signs are unavailable or ambiguous. The system consists of both statistical and probabilistic models, used in the sign recognition and sequence generation phases. Linguistic, technical, and human–computer interaction-related challenges, which are usually not considered in publicly available sign-based recognition application programming interfaces and tools, are addressed in this study. The specifics of AzSL are reviewed, feature selection strategies are evaluated, and a robust model for the translation of hand signs is proposed. The two-stage recognition model exhibits high accuracy during real-time inference. Given the lack of a publicly available benchmark dataset, a new, comprehensive AzSL dataset consisting of 13,444 samples collected by 221 volunteers is described and made publicly available to the sign language recognition community. To extend the dataset and make the model robust to changes, augmentation methods and their effect on performance are analyzed. A lexicon-based validation method, used for the probabilistic analysis and candidate word selection, enhances the probability of the recognized phrases. Experiments delivered 94% accuracy on the test dataset, which was close to the real-time user experience. The dataset and implemented software are shared in a public repository for review and further research (CeDAR, 2021; Alishzade et al., 2022). The work was presented at TeknoFest 2022 and ranked first in the category of social-oriented technologies.
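
The abstract does not specify how the lexicon-based validation is implemented. As a rough illustration of the general idea only, the sketch below scores each word in a lexicon against per-position letter probabilities and picks the best match. The function names, the probability floor, the toy lexicon, and the letter distributions are all hypothetical and are not taken from the paper or its repository.

```python
import math

def score_word(word, letter_probs, floor=1e-6):
    """Log-probability of `word` under per-position letter distributions.

    `letter_probs` is a list of dicts mapping a letter to its probability
    (an assumed stand-in for the output of a sign classifier).
    Words whose length differs from the sequence length score -inf.
    """
    if len(word) != len(letter_probs):
        return float("-inf")
    # Unseen letters get a small floor probability so log() stays finite.
    return sum(
        math.log(max(dist.get(ch, 0.0), floor))
        for ch, dist in zip(word, letter_probs)
    )

def select_word(letter_probs, lexicon):
    """Return the highest-scoring lexicon word, or None if nothing fits."""
    best = max(lexicon, key=lambda w: score_word(w, letter_probs), default=None)
    if best is None or score_word(best, letter_probs) == float("-inf"):
        return None
    return best

# Hypothetical classifier output for a three-letter fingerspelled sequence.
letter_probs = [
    {"s": 0.6, "a": 0.4},
    {"a": 0.5, "o": 0.5},
    {"l": 0.3, "t": 0.7},
]
# Toy lexicon of placeholder strings, not a real AzSL vocabulary.
lexicon = ["sal", "sot", "ata"]

print(select_word(letter_probs, lexicon))
```

In this toy case, taking the most likely letter at each position independently could produce a string that is not in the lexicon, whereas restricting the choice to lexicon entries always yields a valid word. A real system would also need to handle repeated frames, insertions, and deletions, which this sketch ignores.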



Keywords:

Sign language; Sign language recognition; Impaired hearing; Computer vision; Feature extraction; Feature evaluation; MediaPipe; Dataset


Journal:
Speech Communication
ISSN: 0167-6393
Source: Elsevier BV