LocalAI - Models

vibevoice-cpp

VibeVoice Realtime 0.5B (C++ / GGML, Q8_0) - a native C++ port of Microsoft VibeVoice, run via the vibevoice-cpp backend. Generates 24 kHz mono speech from a selectable precomputed voice prompt (default: en-Carter_man). This realtime variant does not accept raw Voice Library reference WAVs.
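As a rough illustration of how a client might request speech from this model, the sketch below builds an OpenAI-style speech request. It assumes a LocalAI instance at http://localhost:8080, that the model is reachable through the /v1/audio/speech route, and that the `voice` field selects the precomputed voice prompt; none of these details are stated on this page, so check them against your installation. The helper names `build_tts_request` and `synthesize` are hypothetical.

```python
import json
import urllib.request


def build_tts_request(base_url, model, text, voice=None):
    """Build (but do not send) an OpenAI-style speech request.

    Assumption: the server exposes /v1/audio/speech and accepts
    "model", "input" and an optional "voice" field.
    """
    payload = {"model": model, "input": text}
    if voice is not None:
        payload["voice"] = voice
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(request, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request) as resp, open(out_path, "wb") as f:
        f.write(resp.read())


# Building the request does not touch the network.
req = build_tts_request(
    "http://localhost:8080",   # assumed local address
    "vibevoice-cpp",           # model name from this listing
    "Hello from LocalAI.",
    voice="en-Carter_man",     # default voice prompt named above
)
```

Calling `synthesize(req, "hello.wav")` would then perform the actual request against a running server.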

vibevoice-cpp-asr

VibeVoice ASR 7B (C++ / GGML, Q4_K) - long-form speech-to-text with speaker diarization. Returns per-speaker JSON segments with start/end timestamps. English-only. ~10 GB download.
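To make the per-speaker output more concrete, here is a minimal sketch that totals speaking time per speaker from a list of diarized segments. The field names (`speaker`, `start`, `end`, `text`) and the sample data are hypothetical; the listing only says that segments are JSON with start/end timestamps, so adjust the keys to whatever the backend actually returns.

```python
import json
from collections import defaultdict

# Hypothetical response shape; the real field names may differ.
SAMPLE_RESPONSE = """
[
  {"speaker": "SPEAKER_0", "start": 0.0, "end": 2.5, "text": "Hello there."},
  {"speaker": "SPEAKER_1", "start": 2.5, "end": 4.0, "text": "Hi."},
  {"speaker": "SPEAKER_0", "start": 4.0, "end": 7.0, "text": "Shall we begin?"}
]
"""


def speaking_time(segments):
    """Return total seconds spoken per speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


segments = json.loads(SAMPLE_RESPONSE)
totals = speaking_time(segments)
```

The same loop could just as easily group the `text` fields by speaker to produce a per-speaker transcript.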

vibevoice

Links

https://github.com/microsoft/VibeVoice

vibevoice-crispasr

VibeVoice ASR speech recognition model, run through the CrispASR backend. Default GGUF size ~4.5 GB.

Links

https://huggingface.co/cstr/vibevoice-asr-GGUF

vibevoice-tts-crispasr

VibeVoice Realtime 0.5B text-to-speech (TTS) model, run through the CrispASR backend. Produces 24 kHz mono audio and runs end-to-end on CPU with a built-in default voice. Default GGUF size ~636 MB.
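One way to sanity-check the stated output format is to inspect the returned audio with Python's standard `wave` module. The sketch below assumes the audio arrives as a WAV container, which this page does not confirm; `describe_wav` and `make_silence` are hypothetical helpers, and the silent clip stands in for real model output.

```python
import io
import wave


def describe_wav(data):
    """Return sample rate, channel count and duration of WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as w:
        rate = w.getframerate()
        return {
            "sample_rate": rate,
            "channels": w.getnchannels(),
            "seconds": w.getnframes() / rate,
        }


def make_silence(seconds=1.0, sample_rate=24000):
    """Create a silent 16-bit mono WAV as a stand-in for model output."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


info = describe_wav(make_silence(0.5))
is_expected_format = info["sample_rate"] == 24000 and info["channels"] == 1
```

With real output, pass the response bytes to `describe_wav` in place of the generated silence.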

Links

https://huggingface.co/cstr/vibevoice-realtime-0.5b-GGUF
