Fine-tuned Qwen3-ASR-1.7B (2B params) for the UAE Emirati Arabic dialect, achieving a 26% relative WER reduction over the zero-shot baseline.
Evaluation results
| Metric | Zero-shot (base) | Fine-tuned | Relative change |
|---|---|---|---|
| WER | 13.53% | 9.98% | -26% |
| CER | 3.33% | 2.55% | -23% |
Evaluated on 2,497 UAE Arabic validation samples.
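As a rough illustration of how the figures in the table relate to each other, the sketch below computes a corpus-level error rate from edit distance and then the relative change between two scores. The function names (`edit_distance`, `error_rate`, `relative_change`) are hypothetical, and the choices to pool errors over the whole set and to count spaces as characters for CER are assumptions, not necessarily how the reported numbers were produced.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Pooled error rate: total edits divided by total reference units.

    unit="word" gives WER; unit="char" gives CER (spaces counted, by assumption).
    """
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(r, h)
        total += len(r)
    return errors / total


def relative_change(baseline: float, tuned: float) -> float:
    """Signed relative change of the tuned score with respect to the baseline."""
    return (tuned - baseline) / baseline


# Figures taken from the table above.
wer_change = relative_change(13.53, 9.98)
cer_change = relative_change(3.33, 2.55)
print(f"WER: {wer_change:.1%}, CER: {cer_change:.1%}")
```

The last column of the table is this relative change rounded to the nearest percent, not the absolute difference in percentage points.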
Training approach
- Audio encoder frozen; LLM decoder fine-tuned (84.4% of all parameters trainable)
- Trained on ~22,500 UAE Emirati Arabic dialect samples from the UAE Arabic English Bilingual Dataset (40K)
- Arabic text normalization: diacritics removal, alef/teh marbuta normalization, punctuation stripping
- 3 epochs, bfloat16 precision, learning rate 2e-5
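The text normalization step in the list above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `normalize_arabic` is hypothetical, and the exact mappings (folding hamza and madda alef forms to bare alef, mapping teh marbuta to heh, and treating every Unicode punctuation category as strippable) are assumptions, since the source does not specify them.

```python
import re
import unicodedata

# Arabic short-vowel and tanween marks (U+064B..U+0652) plus superscript alef (U+0670).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Alef with madda, hamza above, and hamza below; folded to bare alef (assumed mapping).
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")


def normalize_arabic(text: str) -> str:
    """Normalize Arabic text before scoring or training (illustrative only)."""
    text = _DIACRITICS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    # Teh marbuta -> heh; the direction of this mapping is an assumption.
    text = text.replace("\u0629", "\u0647")
    # Drop characters in any Unicode punctuation category,
    # which covers the Arabic comma, semicolon, and question mark as well as ASCII marks.
    text = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Collapse whitespace left behind by the removed characters.
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize_arabic("مَرْحَبًا، أهلاً!"))
```

Applying the same function to both references and model outputs keeps WER and CER from penalizing differences that are purely orthographic.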
What improved
- Transcriptions follow the informal Emirati dialect style more closely
- Removes spurious punctuation the base model hallucinated
- Better handling of dialect-specific words and expressions
Live demo
Try the model yourself: Arabic ASR UAE Demo
Python · PyTorch · Transformers · Apache 2.0