Kitten TTS V0.8 Released: State-of-the-Art Super-Tiny Text-to-Speech Model Under 25MB
Kitten ML has announced the release of three new tiny text-to-speech models optimized for on-device deployment. The models come in three sizes (80M, 40M, and 14M parameters), with the smallest under 25MB, which suits edge inference scenarios where storage and memory are severely constrained. All weights and code are available under the permissive Apache 2.0 license, allowing both commercial and research use.
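As a rough sanity check on how parameter counts translate into download size, the sketch below estimates weight-file size from the number of parameters and the bits stored per weight. The precisions listed are assumptions for illustration only; the announcement does not say how the weights are stored, and real files carry some extra metadata.

```python
def estimated_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight-file size in MB (10^6 bytes), ignoring metadata overhead."""
    return num_params * bits_per_weight / 8 / 1_000_000


# Parameter counts from the announcement.
PARAM_COUNTS = {"80M": 80_000_000, "40M": 40_000_000, "14M": 14_000_000}

# Assumed storage precisions (not stated in the announcement).
PRECISIONS = {"fp32": 32, "fp16": 16, "int8": 8}

for name, n_params in PARAM_COUNTS.items():
    sizes = ", ".join(
        f"{label}: {estimated_size_mb(n_params, bits):.0f} MB"
        for label, bits in PRECISIONS.items()
    )
    print(f"{name} -> {sizes}")
```

Under these assumptions, a 14M-parameter model stored at 8 bits per weight comes to about 14 MB, comfortably below 25MB, while the larger models would need more aggressive compression to reach the same size.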
This release matters for local LLM practitioners building complete AI systems that need audio output. Previously, good-quality TTS required either a cloud API or substantial local compute. According to the announcement, the Kitten models keep speech expressive despite their tiny footprint, which makes end-to-end local inference pipelines feasible on consumer hardware, from edge devices to Mac minis. The models are available on GitHub, and the project runs a Discord server for community collaboration.
With local inference frameworks such as Ollama and llama.cpp already covering the language side, and some audio support emerging in that ecosystem, these ultra-lightweight TTS models fill a gap for developers building offline-first applications that need both language understanding and speech synthesis.
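A minimal sketch of how such an offline pipeline could be wired together is shown here. The `fake_llm` and `fake_tts` functions are hypothetical placeholders, not the Kitten TTS, Ollama, or llama.cpp APIs; in practice they would be replaced by calls into a local language model and a local TTS model. The 24 kHz sample rate is likewise an assumption. Only the Python standard library is used.

```python
import io
import struct
import wave
from typing import Callable, List, Sequence


def run_pipeline(
    prompt: str,
    generate_text: Callable[[str], str],
    synthesize: Callable[[str], Sequence[float]],
    sample_rate: int = 24_000,  # assumed output rate; check the TTS model's docs
) -> bytes:
    """Prompt -> LLM reply -> float samples in [-1, 1] -> 16-bit mono WAV bytes."""
    reply = generate_text(prompt)
    samples = synthesize(reply)
    # Clamp each sample and convert it to signed 16-bit little-endian PCM.
    pcm = struct.pack(
        f"<{len(samples)}h",
        *(int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a local language model."""
    return f"You said: {prompt}"


def fake_tts(text: str) -> List[float]:
    """Hypothetical stand-in for a TTS model: 10 ms of silence per character."""
    return [0.0] * (240 * len(text))


wav_bytes = run_pipeline("hello", fake_llm, fake_tts)
print(f"Generated {len(wav_bytes)} bytes of WAV audio")
```

Passing the two stages in as callables keeps the pipeline independent of any particular model runtime, so either stage can be swapped without touching the WAV-encoding step.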
Source: r/LocalLLaMA · Relevance: 9/10