LocalAI - Models

vllm-omni-qwen3-tts-custom-voice

Qwen3-TTS-12Hz-1.7B-CustomVoice via vLLM-Omni: a text-to-speech model from the Alibaba Qwen team with custom voice cloning capabilities. Generates natural-sounding speech with voice personalization.

Links

https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

qwen3-tts-cpp

Qwen3-TTS 0.6B Base (C++ / GGML, qwentts.cpp). Native C++ text-to-speech with streaming output and zero-shot voice cloning (set `voice` to a 24kHz reference .wav). 24kHz mono, 11 languages with Mandarin dialects. Q8_0 (~0.95 GB talker).
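A minimal sketch of preparing a voice-cloning request for this entry, assuming the `voice` field accepts a path to the reference file and that the request body follows the OpenAI speech shape (`model`, `input`, `voice`). The file name `reference.wav` and the helper names are hypothetical; only the 24 kHz requirement and the `voice` field come from the description above.

```python
import io
import wave


def wav_sample_rate(data: bytes) -> int:
    """Return the sample rate (Hz) recorded in a WAV file's header."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getframerate()


def make_silent_wav(rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit silent WAV in memory (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    return buf.getvalue()


def cloning_payload(text: str, reference_path: str, reference_bytes: bytes) -> dict:
    """Build a request body after checking the reference clip is 24 kHz.

    Assumption: `voice` carries the reference file path; the field layout
    mirrors the OpenAI speech request and is not confirmed by the listing.
    """
    rate = wav_sample_rate(reference_bytes)
    if rate != 24000:
        raise ValueError(f"reference clip is {rate} Hz, expected 24000 Hz")
    return {"model": "qwen3-tts-cpp", "input": text, "voice": reference_path}


if __name__ == "__main__":
    clip = make_silent_wav(24000)
    print(cloning_payload("Hello there.", "reference.wav", clip))
```

Checking the sample rate locally gives a clear error before any request is sent; a clip at another rate would need resampling first.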

qwen3-tts-cpp-0.6b-base-q4

Qwen3-TTS 0.6B Base (C++ / GGML, qwentts.cpp), Q4_K_M (~0.6 GB talker). Streaming + voice cloning, 24kHz mono, 11 languages.

qwen3-tts-cpp-1.7b-base

Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q8_0 (~2.0 GB talker). Higher-quality streaming + voice cloning, 24kHz mono, 11 languages.

qwen3-tts-cpp-1.7b-base-q4

Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q4_K_M (~1.2 GB talker). Streaming + voice cloning, 24kHz mono, 11 languages.

qwen3-tts-cpp-customvoice

Qwen3-TTS 0.6B CustomVoice (C++ / GGML, qwentts.cpp), Q8_0. Named speakers selected via the `voice` field: serena, vivian, uncle_fu, ryan, aiden, ono_anna, sohee, eric (Sichuan dialect), dylan (Beijing dialect). Streaming, 24kHz mono, 11 languages.
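As a rough illustration of selecting a named speaker, the sketch below validates the `voice` value against the speakers listed above before building a request body. The `model` and `input` field names assume an OpenAI-style speech request, and the helper name `speaker_payload` is hypothetical.

```python
# Speaker names copied from the model description above.
SPEAKERS = frozenset({
    "serena", "vivian", "uncle_fu", "ryan", "aiden",
    "ono_anna", "sohee", "eric", "dylan",
})


def speaker_payload(text: str, voice: str,
                    model: str = "qwen3-tts-cpp-customvoice") -> dict:
    """Build a request body for a CustomVoice model.

    Assumption: the body uses OpenAI-style `model` / `input` / `voice`
    fields; only the `voice` field is named in the listing itself.
    """
    if voice not in SPEAKERS:
        known = ", ".join(sorted(SPEAKERS))
        raise ValueError(f"unknown speaker {voice!r}; expected one of: {known}")
    return {"model": model, "input": text, "voice": voice}


if __name__ == "__main__":
    print(speaker_payload("Good morning.", "serena"))
```

The same helper should work for the Q4 and 1.7B CustomVoice entries by passing a different `model` name, assuming they ship the same speaker set.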

qwen3-tts-cpp-customvoice-q4

Qwen3-TTS 0.6B CustomVoice (C++ / GGML, qwentts.cpp), Q4_K_M. Named speakers via the `voice` field (serena, vivian, ryan, aiden, eric, dylan, ...). Streaming, 24kHz mono, 11 languages.

qwen3-tts-cpp-1.7b-customvoice

Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q8_0. Named speakers via the `voice` field (serena, vivian, ryan, aiden, eric, dylan, ...). Streaming, 24kHz mono, 11 languages.

qwen3-tts-cpp-1.7b-customvoice-q4

Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q4_K_M. Named speakers via the `voice` field. Streaming, 24kHz mono, 11 languages.

qwen3-tts-cpp-1.7b-voicedesign

Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q8_0. Synthesises a speaker from a free-text attribute instruction. Requires the OpenAI `instructions` field (e.g. "male, young adult, moderate pitch"); requests without it are rejected. Streaming, 24kHz mono, 11 languages.
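Because VoiceDesign requests are rejected without `instructions`, it can help to enforce that on the client side. The sketch below is one way to do so; the base URL `http://localhost:8080`, the `/v1/audio/speech` path, and the helper names are assumptions about a local OpenAI-compatible deployment, not details taken from this listing.

```python
import json
import urllib.request


def voicedesign_payload(text: str, instructions: str,
                        model: str = "qwen3-tts-cpp-1.7b-voicedesign") -> dict:
    """Build a VoiceDesign request body, refusing an empty instruction."""
    if not instructions or not instructions.strip():
        raise ValueError("VoiceDesign models require a non-empty `instructions` field")
    return {"model": model, "input": text, "instructions": instructions.strip()}


def post_speech(payload: dict,
                base_url: str = "http://localhost:8080") -> bytes:
    """POST the payload and return the raw audio bytes.

    Assumption: the server exposes an OpenAI-compatible speech endpoint
    at /v1/audio/speech on the given base URL.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


if __name__ == "__main__":
    body = voicedesign_payload("Welcome back.", "male, young adult, moderate pitch")
    print(json.dumps(body))  # pass `body` to post_speech() against a running server
```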

qwen3-tts-cpp-1.7b-voicedesign-q4

Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q4_K_M. Synthesises a speaker from a free-text attribute instruction. Requires the `instructions` field. Streaming, 24kHz mono, 11 languages.

qwen3-tts-1.7b-custom-voice

Qwen3-TTS is a high-quality text-to-speech model supporting custom voice, voice design, and voice cloning.

Links

https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

qwen3-tts-0.6b-custom-voice

Qwen3-TTS is a high-quality text-to-speech model supporting custom voice, voice design, and voice cloning.

Links

https://huggingface.co/Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice

qwen3-tts-customvoice-crispasr

Qwen3-TTS CustomVoice 0.6B (12 Hz) text-to-speech, synthesized through the CrispASR backend. A fixed-speaker fine-tune driven via an explicit backend selector plus a tokenizer codec companion. Ships with baked-in speakers (vivian, aiden, dylan, eric, ono_anna, ryan, serena, sohee, uncle_fu); the default config selects vivian. Runs end-to-end on CPU and produces 24 kHz mono audio. Default GGUF sizes: ~968 MB (talker) + ~358 MB (tokenizer).

Links

https://huggingface.co/cstr/qwen3-tts-0.6b-customvoice-GGUF
