KD-MSLRT: Lightweight Sign Language Recognition Model Based on Mediapipe and 3D to 1D Knowledge Distillation

Yulong Li, Bolin Ren, Ke Hu, Changyuan Liu, Zhengyong Jiang, Kang Dang*, Jionglong Su*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review

Abstract

Artificial intelligence has achieved notable results in sign language recognition and translation. However, relatively few efforts have been made to significantly improve the quality of life for the 72 million hearing-impaired people worldwide. Sign language translation models, which rely on video inputs, involve large parameter sizes, making them time-consuming and computationally intensive to deploy. This directly contributes to the scarcity of human-centered technology in this field. Additionally, the lack of datasets in sign language translation hampers research progress in this area. To address these issues, we first propose a cross-modal multi-knowledge distillation technique from 3D to 1D and a novel end-to-end pre-training text correction framework. Compared to other pre-trained models, our framework achieves significant advancements in correcting text output errors. Our model achieves a decrease in Word Error Rate (WER) of at least 1.4% on the PHOENIX14 and PHOENIX14T datasets compared to the state-of-the-art CorrNet. Additionally, the TensorFlow Lite (TFLite) quantized model size is reduced to 12.93 MB, making it the smallest, fastest, and most accurate model to date. We have also collected and released extensive Chinese sign language datasets and developed a specialized training vocabulary. To address the lack of research on data augmentation for landmark data, we have designed comparative experiments on various augmentation methods. Moreover, we performed a simulated deployment and prediction of our model on Intel platform CPUs and assessed the feasibility of deploying the model on other platforms.
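
For readers unfamiliar with the Word Error Rate metric cited in the abstract, the following is a minimal sketch of its standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the authors' actual evaluation script may differ, for instance in tokenization or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Illustrative sketch only; not taken from the paper's code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.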
Original language: English
Title of host publication: The 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Publisher: AAAI Press
Publication status: Accepted/In press - 2025
