
🗣  NeMo — Emirati Arabic G2P

A fork of NVIDIA NeMo that adds Emirati Arabic dialect support for VITS TTS. Its custom EmiratiG2P module applies phonological rules to produce IPA transcriptions that reflect Gulf Arabic pronunciation.

G2P
Arabic
Emirati
IPA
VITS
NeMo
Phonology

The core addition is the EmiratiG2P module, a dialect-specific grapheme-to-phoneme (G2P) converter for Emirati Arabic. It converts written text into IPA transcriptions for use in TTS synthesis.

EmiratiG2P module

~1,127 lines of phonological transformation rules extending NeMo’s IpaG2p base class, covering Gulf Arabic-specific phonology:

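  • Qaf realization: the letter ق (Standard Arabic /q/) is rendered with its dialectal pronunciation, /g/ or /dʒ/
  • Diphthong monophthongization: Gulf Arabic vowel shifts, such as /aj/ to /eː/ and /aw/ to /oː/
  • Sun letter assimilation: the /l/ of the definite article ال assimilates to a following sun letter
  • Code-switching: handling of mixed Arabic/English bilingual text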
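
To make the sun letter rule concrete, here is a minimal Python sketch of how the definite article might be transcribed. It is not taken from the EmiratiG2P source: the function name definite_article_ipa, the letter-to-IPA table, and the choice of /ɪ/ as the article vowel are all assumptions made for illustration, and the input is assumed to be undiacritized text.

```python
from typing import Optional

# Hypothetical, simplified table of the 14 sun letters and one IPA value each.
# The real module's symbol inventory may differ.
SUN_LETTERS = {
    "ت": "t", "ث": "θ", "د": "d", "ذ": "ð", "ر": "r", "ز": "z",
    "س": "s", "ش": "ʃ", "ص": "sˤ", "ض": "dˤ", "ط": "tˤ", "ظ": "ðˤ",
    "ل": "l", "ن": "n",
}

ARTICLE = "ال"  # alif + lam, the written definite article


def definite_article_ipa(word: str) -> Optional[str]:
    """Return an IPA rendering of a leading definite article, or None.

    Assumes undiacritized input and treats every word-initial 'ال' as the
    article, which is a simplification.
    """
    if not word.startswith(ARTICLE) or len(word) <= len(ARTICLE):
        return None
    following = word[len(ARTICLE)]
    if following in SUN_LETTERS:
        # The /l/ assimilates fully: the article surfaces as the vowel plus
        # a copy of the following consonant (which is then geminated).
        return "ɪ" + SUN_LETTERS[following]
    # Moon letter: the /l/ is pronounced as written.
    return "ɪl"
```

With this sketch, الشمس ("the sun") yields "ɪʃ" for the article, while القمر ("the moon") keeps the /l/ and yields "ɪl".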

Configuration

YAML-based pipeline configuration with 20+ tunable parameters for phonological rule control. Includes 19 unit tests covering dialect edge cases.
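
As an illustration of what per-rule switches could look like, the sketch below uses a Python dataclass as a stand-in for the YAML configuration. The parameter names qaf_as_g and monophthongize, and the apply_rules helper, are hypothetical; the project's actual parameter names are not listed here. The rules operate on a naive IPA string and ignore phonological context, so this is a simplification rather than the module's real behavior.

```python
from dataclasses import dataclass


@dataclass
class RuleConfig:
    # Hypothetical switches, named for illustration only.
    qaf_as_g: bool = True        # realize /q/ as /g/
    monophthongize: bool = True  # /aj/ -> /eː/, /aw/ -> /oː/


def apply_rules(ipa: str, cfg: RuleConfig) -> str:
    """Apply the enabled dialect rules to a broad IPA string.

    Uses plain substring replacement, so it does not check syllable
    structure (for example, it would also rewrite 'aw' in 'awa').
    """
    if cfg.qaf_as_g:
        ipa = ipa.replace("q", "g")
    if cfg.monophthongize:
        ipa = ipa.replace("aj", "eː").replace("aw", "oː")
    return ipa


print(apply_rules("bajt", RuleConfig()))                  # beːt
print(apply_rules("qahwa", RuleConfig()))                 # gahwa
print(apply_rules("qahwa", RuleConfig(qaf_as_g=False)))   # qahwa
```

Keeping each rule behind its own flag makes it possible to test a rule in isolation, which is the kind of edge-case coverage a dialect unit test suite would need.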

Part of the Emirati TTS pipeline

Used with emirati-fastpitch-bilingual-v1.0, emirati-hifigan-bilingual-v1.0, and emirati-vits-male-1.0 for end-to-end Emirati Arabic speech synthesis.

Python · PyTorch · IPA · Apache 2.0