🗣  Emirati Arabic TTS

Production-grade TTS models for the Emirati Arabic dialect: a bilingual FastPitch/HiFi-GAN pipeline, an end-to-end VITS model, and a Qwen3.5-based TTS model, all with extended text normalization pipelines.

TTS
Arabic
NeMo
FastPitch
VITS
Qwen3.5

Overview

The Emirati Arabic TTS project focuses on building high-quality speech synthesis for the Emirati Arabic dialect, one of the most underrepresented Arabic varieties in speech AI. The project spans multiple model generations and architectures, from classical two-stage pipelines (acoustic model plus vocoder) to modern end-to-end and LLM-based TTS systems.

Models

emirati-fastpitch-bilingual-v1.0

Bilingual (Arabic + English) FastPitch acoustic model trained on Emirati dialect speech data. It uses the NVIDIA NeMo framework with custom text normalization for Arabic numerals, abbreviations, and mixed-language input. It is paired with the emirati-hifigan-bilingual-v1.0 vocoder for waveform generation.

  • Framework: NVIDIA NeMo
  • Architecture: FastPitch (non-autoregressive transformer)
  • Language: Emirati Arabic / English bilingual
  • Features: Extended TTS frontend, G2P, number verbalization
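The two-stage flow described above (text to mel spectrogram with FastPitch, then spectrogram to waveform with HiFi-GAN) can be sketched with NeMo's standard TTS model classes. This is a minimal sketch, not the project's actual inference code: the checkpoint paths are hypothetical, and it assumes the released checkpoints load with stock NeMo and that the custom text normalization is applied before or inside `parse`.

```python
def load_models(fastpitch_path, hifigan_path):
    """Load both stages from local .nemo checkpoints (paths are hypothetical)."""
    # Imported lazily so the rest of this sketch does not require NeMo.
    from nemo.collections.tts.models import FastPitchModel, HifiGanModel

    spec_generator = FastPitchModel.restore_from(fastpitch_path).eval()
    vocoder = HifiGanModel.restore_from(hifigan_path).eval()
    return spec_generator, vocoder


def synthesize(text, spec_generator, vocoder):
    """Run the two-stage pipeline: text -> tokens -> mel spectrogram -> audio."""
    # Stage 0: the acoustic model's own frontend turns text into token IDs.
    tokens = spec_generator.parse(text)
    # Stage 1: FastPitch predicts a mel spectrogram from the tokens.
    spectrogram = spec_generator.generate_spectrogram(tokens=tokens)
    # Stage 2: HiFi-GAN converts the spectrogram into a waveform.
    # With real models, callers would typically wrap this in torch.no_grad().
    return vocoder.convert_spectrogram_to_audio(spec=spectrogram)
```

With real checkpoints, usage would look like `synthesize("مرحبا", *load_models("fastpitch.nemo", "hifigan.nemo"))`, where both file names are placeholders. Passing the models in as arguments keeps loading (slow, done once) separate from synthesis (done per request).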

emirati-vits-male-1.0

End-to-end VITS model for a male Emirati voice. VITS combines acoustic modeling and vocoding in a single network, which can reduce latency and yield more natural prosody than two-stage pipelines.

  • Architecture: VITS (end-to-end)
  • Voice: Male Emirati speaker
  • Training: Custom Emirati dialect dataset

qwen3.5-TTS-Emirati

Latest-generation Emirati TTS model, based on the Qwen3.5 large language model architecture and fine-tuned for speech synthesis. It enables more natural intonation, better handling of dialectal features, and improved Arabic/English code-switching.

  • Base model: Qwen3.5 (fine-tuned for TTS)
  • Language: Emirati Arabic with code-switching support
  • 78+ downloads on Hugging Face

Text Normalization Pipeline

All Emirati TTS models are backed by a production-grade TTS frontend pipeline covering:

  • Arabic numeral verbalization (cardinal, ordinal, currency, dates)
  • Abbreviation and acronym expansion
  • Grapheme-to-phoneme (G2P) conversion for the Arabic phoneme inventory
  • Unicode normalization and script detection
  • Mixed Arabic/English text handling
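To make the Unicode normalization, script detection, and mixed-text steps concrete, here is a small standard-library sketch. It is illustrative only and not the project's frontend: the function names, the choice of NFKC, the tatweel removal, and the whitespace-based tokenization are all assumptions made for the example.

```python
import unicodedata

# Map Arabic-Indic (U+0660-0669) and Extended Arabic-Indic (U+06F0-06F9)
# digits to ASCII so a later number verbalizer sees one digit form.
_DIGIT_MAP = {0x0660 + i: str(i) for i in range(10)}
_DIGIT_MAP.update({0x06F0 + i: str(i) for i in range(10)})
TATWEEL = "\u0640"  # purely typographic elongation character


def normalize_text(text):
    """NFKC-normalize, drop tatweel, and unify digit forms."""
    text = unicodedata.normalize("NFKC", text)  # folds presentation forms
    text = text.replace(TATWEEL, "")
    return text.translate(_DIGIT_MAP)


def char_script(ch):
    """Classify one character as 'ar', 'en', or 'other'."""
    in_arabic_block = "\u0600" <= ch <= "\u06FF" or "\u0750" <= ch <= "\u077F"
    # Only letters and marks count, so Arabic punctuation is 'other'.
    if in_arabic_block and unicodedata.category(ch)[0] in ("L", "M"):
        return "ar"
    if ch.isascii() and ch.isalpha():
        return "en"
    return "other"


def token_script(token):
    """Label a token by its first Arabic or Latin letter."""
    for ch in token:
        script = char_script(ch)
        if script != "other":
            return script
    return "other"


def split_by_script(text):
    """Group whitespace-separated tokens into consecutive same-script runs."""
    runs = []
    for token in normalize_text(text).split():
        script = token_script(token)
        if runs and runs[-1][0] == script:
            runs[-1] = (script, runs[-1][1] + " " + token)
        else:
            runs.append((script, token))
    return runs
```

Each run returned by `split_by_script` could then be routed to a language-specific step, for example Arabic G2P for `'ar'` runs, English G2P for `'en'` runs, and number verbalization for digit-only `'other'` runs.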