🔊  Qwen3-TTS Streaming Arabic

Real-time PCM streaming TTS with ~6x inference speedup and Arabic language support built on Qwen3-TTS. OpenAI-compatible API endpoint with Server-Sent Events streaming.

TTS
Arabic
Streaming
Qwen3
Docker
OpenAI API

Fork of Qwen3-TTS adding real-time streaming inference with full Arabic language support, achieving approximately 6× inference speedup over the baseline.

Key additions

  • OpenAI-compatible /v1/audio/speech endpoint with Server-Sent Events (SSE) streaming
  • Arabic language support via warm-start embedding initialization
  • Language auto-detection through Unicode scanning (Arabic vs Latin)
  • Optimized for NVIDIA DGX Spark (ARM64 / Grace Blackwell)
  • Docker support for ARM64 deployment
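A client consuming the streaming endpoint has to split the SSE response into events and decode each audio chunk. The following is a minimal sketch of that parsing step, not the project's actual client. The function name `iter_pcm_chunks`, the JSON payload with a base64-encoded `audio` field, and the `[DONE]` sentinel are all assumptions made for illustration; check the server's real event format before relying on them.

```python
import base64
import json
from typing import Iterable, Iterator


def iter_pcm_chunks(lines: Iterable[str]) -> Iterator[bytes]:
    """Yield raw PCM bytes from the text lines of an SSE response body.

    Assumed (hypothetical) payload: each event's data is a JSON object with
    a base64-encoded "audio" field, and the stream may end with "[DONE]".
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format drops one leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
            continue
        if line == "" and data_parts:
            # A blank line ends the event; multi-line data is joined by "\n".
            payload = "\n".join(data_parts)
            data_parts = []
            if payload == "[DONE]":
                return
            event = json.loads(payload)
            if "audio" in event:
                yield base64.b64decode(event["audio"])


# Usage with a hand-written stream; "AAEC" is base64 for bytes 0, 1, 2.
sample = ['data: {"audio": "AAEC"}\n', "\n", ": keep-alive\n", "data: [DONE]\n", "\n"]
pcm = b"".join(iter_pcm_chunks(sample))  # b"\x00\x01\x02"
```

Because the parser takes any iterable of lines, it can be fed from an HTTP client's line iterator, and the yielded chunks can be written to an audio device as they arrive instead of after the full response.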

Arabic support approach

Arabic text is detected automatically via Unicode range scanning. The model uses warm-start embedding initialization, copying weights from existing Arabic phoneme embeddings rather than training from scratch, which enables high-quality Gulf Arabic synthesis with significantly less data.
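Unicode range scanning can be sketched as counting letters that fall in the Arabic blocks versus the Latin blocks. This is an illustrative version only: the function name `detect_script`, the majority-vote rule, and the fallback to `"latin"` for text with no letters are assumptions, and the project's own detector may differ.

```python
# Unicode blocks that contain Arabic-script characters.
ARABIC_RANGES = (
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0x08A0, 0x08FF),  # Arabic Extended-A
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
)
# Basic Latin letters through Latin Extended-B.
LATIN_RANGE = (0x0041, 0x024F)


def detect_script(text: str) -> str:
    """Return "arabic" or "latin" by majority vote over the letters in text."""
    arabic = latin = 0
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, whitespace
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in ARABIC_RANGES):
            arabic += 1
        elif LATIN_RANGE[0] <= cp <= LATIN_RANGE[1]:
            latin += 1
    # Assumption: ties and letter-free input fall back to "latin".
    return "arabic" if arabic > latin else "latin"


print(detect_script("مرحبا بالعالم"))
print(detect_script("Hello world"))
```

Counting letters rather than stopping at the first Arabic character keeps a mostly Arabic sentence that contains a Latin brand name, such as "Qwen3", classified as Arabic.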

  • qwen3.5-TTS-Emirati: Emirati Arabic fine-tune
  • qwen3-TTS-KSA: Saudi Arabic fine-tune
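The warm-start idea described above, initializing new embedding rows by copying existing ones instead of starting from random values, can be sketched without any framework. The function `warm_start_rows` and the choice of which source row each new row copies are hypothetical; the source does not specify the mapping the project uses. In PyTorch the same copy would be applied to the rows of an embedding weight tensor.

```python
from typing import List, Sequence


def warm_start_rows(
    table: Sequence[Sequence[float]], source_ids: Sequence[int]
) -> List[List[float]]:
    """Return a new embedding table with one appended row per source id.

    Each appended row starts as a copy of the existing row it is
    warm-started from, so fine-tuning begins near a trained embedding.
    """
    extended = [list(row) for row in table]  # copy; the input is untouched
    for src in source_ids:
        if not 0 <= src < len(table):
            raise IndexError(f"source id {src} is outside the original table")
        extended.append(list(table[src]))
    return extended


# Toy table with two trained rows; add two new rows copied from rows 1 and 0.
trained = [[0.1, 0.2], [0.3, 0.4]]
extended = warm_start_rows(trained, source_ids=[1, 0])
print(len(extended))
```

The new rows are independent copies, so updating them during fine-tuning does not change the rows they were copied from.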

Python · PyTorch · Docker · Apache 2.0