🎙 Qwen3-ASR Arabic UAE

Fine-tuned Qwen3-ASR-1.7B for UAE Emirati Arabic dialect speech recognition. 26% relative WER reduction over the base model (13.53% to 9.98%), with Arabic text normalization and dialect-aware transcription.

ASR
Arabic
Emirati
Qwen3
Speech Recognition
Fine-tuned

Fine-tuned Qwen3-ASR-1.7B (2B params) for the UAE Emirati Arabic dialect, achieving a 26% relative WER reduction over the zero-shot baseline.

Evaluation results

Metric   Zero-shot (base)   Fine-tuned   Improvement
WER      13.53%             9.98%        -26%
CER      3.33%              2.55%        -23%

Evaluated on 2,497 UAE Arabic validation samples.
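
The improvement column is a relative change, not a difference in percentage points. The sketch below reproduces that arithmetic and shows one common way to compute WER and CER using word-level and character-level edit distance. It is an illustration only: the function names are hypothetical, and pooling errors over the whole evaluation set (rather than averaging per sample) is an assumption, since the page does not say which scoring tool or aggregation was used.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Corpus-level error rate: total edits / total reference units.

    unit="word" gives WER, unit="char" gives CER.
    Assumption: errors are pooled over all samples before dividing.
    """
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(r, h)
        total += len(r)
    return errors / total if total else 0.0


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Relative error reduction: (baseline - finetuned) / baseline."""
    return (baseline - finetuned) / baseline


# Figures taken from the table above.
wer_gain = relative_reduction(13.53, 9.98)  # about 0.262, reported as 26%
cer_gain = relative_reduction(3.33, 2.55)   # about 0.234, reported as 23%
```

In absolute terms the WER drop is 3.55 percentage points (13.53% to 9.98%) and the CER drop is 0.78 points.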

Training approach

  • Audio encoder frozen, LLM decoder fine-tuned (84.4% of parameters)
  • Trained on ~22,500 UAE Emirati Arabic dialect samples from the UAE Arabic English Bilingual Dataset (40K)
  • Arabic text normalization: diacritics removal, alef/teh marbuta normalization, punctuation stripping
  • 3 epochs, bfloat16 precision, learning rate 2e-5
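
The text normalization step in the list above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual preprocessing code: `normalize_arabic` is a hypothetical name, and the direction of each mapping (alef variants to bare alef, teh marbuta to heh) is an assumption based on common Arabic ASR practice, since the page does not state it.

```python
import re
import unicodedata

# Tanwin, harakat, shadda, sukun (U+064B-U+0652) and superscript alef (U+0670).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Alef with madda, hamza above, hamza below, and alef wasla.
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625\u0671]")


def normalize_arabic(text: str) -> str:
    """Normalize Arabic text before scoring or training (assumed mappings)."""
    text = _DIACRITICS.sub("", text)           # diacritics removal
    text = _ALEF_VARIANTS.sub("\u0627", text)  # alef variants -> bare alef
    text = text.replace("\u0629", "\u0647")    # teh marbuta -> heh (assumption)
    # Replace every Unicode punctuation character (Latin and Arabic) with a space.
    text = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(text.split())              # collapse whitespace


if __name__ == "__main__":
    print(normalize_arabic("أَهْلاً، مَرْحَبًا!"))
```

Applying the same function to both references and model output keeps WER and CER from penalizing differences in diacritics, spelling variants, or punctuation.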

What improved

  • Matches informal Emirati dialect style naturally
  • Removes spurious punctuation the base model hallucinated
  • Better handling of dialect-specific words and expressions

Live demo

Try the model yourself: Arabic ASR UAE Demo

Python · PyTorch · Transformers · Apache 2.0