AfriHuBERT: A self-supervised speech representation model for African languages

  • Jesujoba O. Alabi
  • Xuechen Liu
  • Dietrich Klakow
  • Junichi Yamagishi
  • Saarland University
  • Research Organization of Information and Systems, National Institute of Informatics

Research output: Contribution to journal, Conference article, peer-review

1 Citation (Scopus)

Abstract

In this work, we present AfriHuBERT, an extension of mHuBERT-147, a compact self-supervised learning (SSL) model pretrained on 147 languages. While mHuBERT-147 covered 16 African languages, we expand this to 1,226 through continued pretraining on 10K+ hours of speech data from diverse sources, benefiting an African population of over 600M. We evaluate AfriHuBERT on two key speech tasks, Spoken Language Identification (SLID) and Automatic Speech Recognition (ASR), using the FLEURS benchmark. Our results show a 3.6% improvement in F1 score for SLID and a 2.1% reduction in average Word Error Rate (WER) for ASR over mHuBERT-147, and demonstrate competitiveness with larger SSL models such as MMS and XEUS. Further analysis shows that ASR models trained on AfriHuBERT exhibit improved cross-corpus generalization and are competitive in extremely low-resource ASR scenarios.

Original language: English
Pages (from-to): 4023-4027
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2025
Externally published: Yes
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 to 21 Aug 2025

Keywords

  • African languages
  • Multilingual speech representation
  • Self-supervised learning
  • Speech processing
