Model Gallery

5 models from 1 repository

streaming-zipformer-en-sherpa
Streaming English ASR: sherpa-onnx zipformer transducer (int8, chunk-16 left-128). Low-latency real-time transcription with endpoint detection via sherpa-onnx's online recognizer. English-only; for multilingual offline ASR see omnilingual-0.3b-ctc-q8-sherpa.

Repository: localai
License: apache-2.0

qwen3-omni-30b-a3b-instruct
Qwen3-Omni is a natively end-to-end, multilingual, omni-modal foundation model. It processes text, images, audio, and video, and delivers real-time streaming responses in both text and natural speech. This GGUF build runs on llama.cpp with the bundled mmproj for multimodal inputs.

Repository: localai
License: apache-2.0

kokoros
Kokoros is a pure-Rust TTS backend that uses the Kokoro v1.0 ONNX model (82M parameters). It provides fast, high-quality streaming TTS in American English with the af_heart voice.

Repository: localai
License: apache-2.0

parakeet-cpp-realtime_eou_120m-v1
Cache-aware streaming RNNT FastConformer (120M) with end-of-utterance (EOU) detection, intended for streaming transcription. F16 GGUF for the parakeet-cpp backend (a C++/ggml port of NVIDIA NeMo Parakeet), byte-identical to NeMo at WER 0. Faster than NeMo on both CPU and GPU.

Repository: localai
License: cc-by-4.0

moonshine-streaming-crispasr
Moonshine Streaming Tiny speech recognition. Runs via the CrispASR backend with an explicit backend selector and a companion tokenizer. Default GGUF size ~31 MB.

Repository: localai