Qwen3-TTS Streaming Arabic

Fork of Qwen3-TTS adding real-time streaming inference with full Arabic language support, achieving approximately 6× inference speedup over the baseline.

Key additions

OpenAI-compatible /v1/audio/speech endpoint with Server-Sent Events (SSE) streaming
Arabic language support via warm-start embedding initialization
Language auto-detection through Unicode scanning (Arabic vs Latin)
Optimized for NVIDIA DGX Spark (ARM64 / Grace Blackwell)
Docker support for ARM64 deployment
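
As a rough illustration of how a client might consume the streaming endpoint, the sketch below builds a request body in the general shape of OpenAI's /v1/audio/speech API and decodes audio chunks from a Server-Sent Events stream. Everything specific here is an assumption rather than something documented by this fork: the model id, the voice name, the stream_format field, and the event names speech.audio.delta and speech.audio.done (borrowed from OpenAI's SSE speech events). Check the server's actual responses before relying on any of them.

```python
import base64
import json
from typing import Iterable, Iterator


def build_speech_request(text: str, voice: str = "default") -> dict:
    """Build a JSON body shaped like an OpenAI /v1/audio/speech request."""
    return {
        "model": "qwen3-tts",        # hypothetical model id
        "input": text,
        "voice": voice,              # hypothetical voice name
        "response_format": "pcm",
        "stream_format": "sse",      # assumed to mirror OpenAI's parameter
    }


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    An event ends at a blank line. Fields other than "data:" are ignored,
    and an unterminated trailing event is dropped.
    """
    buf = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buf:
                yield "\n".join(buf)
                buf = []
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            buf.append(value[1:] if value.startswith(" ") else value)


def collect_audio(lines: Iterable[str]) -> bytes:
    """Concatenate base64 audio chunks until a done event arrives."""
    audio = bytearray()
    for payload in iter_sse_data(lines):
        event = json.loads(payload)
        kind = event.get("type")
        if kind == "speech.audio.delta":      # assumed event name
            audio.extend(base64.b64decode(event["audio"]))
        elif kind == "speech.audio.done":     # assumed event name
            break
    return bytes(audio)
```

In practice the lines would come from an HTTP response read line by line, so each audio chunk can be played or buffered as soon as it arrives instead of waiting for the full utterance.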

Arabic support approach

Arabic text is detected automatically by scanning the input for characters in the Arabic Unicode ranges. Instead of training Arabic embeddings from scratch, the model warm-starts them by copying weights from existing Arabic phoneme embeddings, which enables high-quality Gulf Arabic synthesis with significantly less data.
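
The detection step can be pictured with a minimal sketch like the one below. It is not the fork's actual code. It assumes the detector counts characters in the standard Arabic Unicode blocks against Latin letters and picks the majority. The function names, the "ar" and "en" return codes, the tie-breaking rule, and the default for text with no letters are all hypothetical choices made for illustration.

```python
# Standard Arabic Unicode blocks (inclusive code point ranges).
ARABIC_RANGES = (
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0x08A0, 0x08FF),  # Arabic Extended-A
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFC),  # Arabic Presentation Forms-B (stops before U+FEFF, the BOM)
)


def is_arabic(ch: str) -> bool:
    """True if the character falls in one of the Arabic blocks above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ARABIC_RANGES)


def is_latin(ch: str) -> bool:
    """True for alphabetic characters up to the end of Latin Extended-B."""
    return ch.isalpha() and ord(ch) <= 0x024F


def detect_language(text: str, default: str = "en") -> str:
    """Return "ar" or "en" based on which script has more characters.

    Ties go to Arabic; text with no Arabic or Latin characters falls
    back to `default`. Both rules are assumptions of this sketch.
    """
    arabic = sum(1 for ch in text if is_arabic(ch))
    latin = sum(1 for ch in text if is_latin(ch))
    if arabic == 0 and latin == 0:
        return default
    return "ar" if arabic >= latin else "en"
```

A majority vote keeps mixed input, such as an Arabic sentence containing an English place name, routed to the Arabic path. A real implementation might instead split the text into per-script segments.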

Related models on HuggingFace

qwen3.5-TTS-Emirati: Emirati Arabic fine-tune
qwen3-TTS-KSA: Saudi Arabic fine-tune

Python · PyTorch · Docker · Apache 2.0