Designed for real-time use, it combines multilingual speech, voice cloning from
seconds of audio, and conversational-level prosody in a single system

San Francisco, CA | March 27, 2026: Smallest.ai, the research-first Voice AI
company building proprietary speech models and production-grade voice agents,
today announced the launch of Lightning V3, its most advanced text-to-speech
(TTS) model for real-time, conversational AI.

In conversational evaluations, Lightning V3 achieves a mean opinion score (MOS)
of 3.89, outperforming leading models from OpenAI, Cartesia, and ElevenLabs,
while also leading on intonation (3.33) and prosody (3.07), two of the most
critical factors for natural, human-like speech. The model combines this
performance with multilingual support, instant voice cloning, and streaming
generation designed for real-world interactions.
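
For readers unfamiliar with the metric, a mean opinion score is conventionally
the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below
shows that calculation on made-up ratings. The ratings and the function name
`mean_opinion_score` are hypothetical; they are not Smallest.ai's evaluation
data or procedure.

```python
from statistics import mean


def mean_opinion_score(ratings: list[int]) -> float:
    """Average listener ratings on the conventional 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return float(mean(ratings))


# Hypothetical ratings from five listeners for one audio sample.
example_ratings = [5, 4, 3, 4, 4]
print(f"MOS: {mean_opinion_score(example_ratings):.2f}")
```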

Most TTS models today are still evaluated on complete sentences generated in
isolation. That setup is easier to optimize for, but it doesn’t reflect how
voice systems actually behave in production, where audio is generated in
chunks, context is incomplete, and responses have to adapt as conversations
unfold.

Lightning V3 is built for those production conditions. It maintains consistency
across turns and adjusts tone and pacing mid-sentence, which is where most
systems break down.

The same design lets the model work across use cases without retraining,
including voice agents, contact centers, podcasts, audiobooks, dubbing, and
interactive applications.

It supports 15 languages with automatic detection and mid-sentence switching,
and can clone a voice from 5–15 seconds of audio. These cloned voices tend to
sound more natural than preset ones, since they retain the variations of real
speech.

The model outputs audio at 44.1 kHz, which can be downsampled to 8–24 kHz for
telephony.
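
As a rough illustration of what that downsampling involves, the sketch below
works out the rational resampling ratio from 44.1 kHz to a few telephony
rates. The specific target rates and the function names `resample_ratio` and
`output_samples` are assumptions chosen for illustration; the announcement does
not describe how Lightning V3 or its API performs the conversion.

```python
from fractions import Fraction

SOURCE_RATE_HZ = 44_100  # output rate stated in the announcement


def resample_ratio(target_rate_hz: int, source_rate_hz: int = SOURCE_RATE_HZ) -> Fraction:
    """Return the up/down ratio a rational resampler would use.

    A polyphase resampler upsamples by the numerator and decimates by the
    denominator, so 44.1 kHz -> 8 kHz becomes "up 80, down 441".
    """
    if target_rate_hz <= 0 or source_rate_hz <= 0:
        raise ValueError("sample rates must be positive")
    return Fraction(target_rate_hz, source_rate_hz)


def output_samples(input_samples: int, target_rate_hz: int) -> int:
    """Number of samples after resampling, rounded up to a whole sample."""
    ratio = resample_ratio(target_rate_hz)
    # Ceiling division using integer arithmetic only.
    return -(-input_samples * ratio.numerator // ratio.denominator)


# Hypothetical telephony targets within the 8-24 kHz range mentioned above.
for rate in (8_000, 16_000, 24_000):
    ratio = resample_ratio(rate)
    print(f"{rate} Hz: up {ratio.numerator}, down {ratio.denominator}, "
          f"{output_samples(SOURCE_RATE_HZ, rate)} samples per second of audio")
```

In practice the ratio would be handed to a resampler that also applies a
low-pass filter before decimating, so that content above half the target rate
does not alias into the telephony band.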

“Conversation is where most voice systems fall apart,” said Sudarshan Kamath,
Founder and CEO, Smallest.ai. “It’s not just about sounding clear: the voice
has to track context, timing, and emotion at the same time. If it works there,
it works everywhere.”

A shift in how voice quality is measured

The launch also challenges how voice models are evaluated. Most benchmarks rely
on static outputs, a setup that rarely reflects real usage.

Lightning V3 is instead evaluated in use-case-specific settings, measuring how
well the voice maintains coherence, responsiveness, and believability
throughout an interaction, in the context of the conversation and not just
within a single utterance.

Voices should be designed and judged in context: whether they fit the persona
they are meant to inhabit, carry the right social signal, and feel believable
in the moment they were built for.

Pricing

Lightning V3.1 is available on a pay-as-you-go model, with no upfront
commitments, seat licenses, or minimum usage requirements.

Teams can scale from early prototypes to high-volume deployments across both
voice agents and content generation, with usage-based pricing and non-expiring
credits.

About Smallest.ai

Smallest.ai is a research-first Voice AI company building proprietary speech
models and production-grade voice agents for regulated enterprises.

The company develops state-of-the-art speech-to-text, text-to-speech, and
real-time voice systems, enabling end-to-end automation of high-volume
conversations across support, collections, onboarding, and servicing, without
relying on stitched third-party APIs.

Designed for financial services and other regulated industries, Smallest.ai is
SOC 2, GDPR, HIPAA, and PCI compliant, supports on-prem and private cloud
deployments, and operates reliably in multilingual environments.

Its platform is used in production by enterprises across banking, insurance,
BPO, and telecommunications in the US and India.
