Play 3.0 mini: Built for Real-Time Interaction

Play 3.0 mini is a high-performance, low-latency voice AI model built for use cases that require fast responses and exceptional accuracy on alphanumeric sequences and acronyms.

“PlayAI’s models bring more natural, fluid sounding voices in multiple languages, and are delivered with ultra low latency. Their on-prem offering makes it a natural fit for our application, where data security is crucial.”

Keith Fearon, Head of Growth, 11x

< 150ms latency

40+ languages

Accurate voice cloning

On-prem deployments supported

Hear Play 3.0 mini in action.

Create engaging AI dialogs, podcasts and conversations using our proprietary Contextual Tone Prediction technology that lets the model understand each turn in a conversation and generate speech with the right prosody and emotion.

AI podcast between hosts

Generate entire AI podcasts with any voices

Get Started

Conversation between characters

Create engaging contextual conversations between multiple characters

Get Started

Engaging narration

Generate rich dramatic narrative content

Get Started

Dramatic dialogs for a scene

Prompt and direct to generate dramatic deliveries

Get Started

It's Accurate

Play 3.0 mini was fine-tuned on a diverse dataset of alphanumeric phrases and supports critical use cases where information such as phone numbers, passport numbers, dates, and currencies can’t be misread.

Try playground

It's Fast and Cost-Efficient

Play 3.0 mini is heavily optimized for low latency, and its smaller footprint means it's far more cost-efficient than competing models. Host it yourself if you need it even faster.

Contact sales

It Supports 30+ Languages

Despite its small size, Play 3.0 mini supports 30+ languages, with a deep bench of out-of-the-box voices for the most common ones. Say hola or bonjour, really quickly.

Try playground

Supports High Fidelity Voice Cloning

Want an accurate voice clone for your application? Play 3.0 mini is the industry's most accurate voice cloning model, and it takes as little as 30 seconds.

Read docs

It's Easy to Code

Play 3.0 mini is easy to use and is available through our API and on platforms like Fal. It also supports WebSockets and streaming from LLMs.

Read docs
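When streaming from an LLM, text usually arrives in small fragments, and one common approach is to buffer those fragments into complete sentences before sending them for speech generation. The sketch below illustrates that buffering step only. The `createSentenceChunker` helper is hypothetical and not part of any PlayAI SDK, and its sentence-boundary rule (a `.`, `!` or `?` followed by whitespace) is a deliberately naive assumption.

```javascript
// Hypothetical helper (not a PlayAI API): groups LLM text fragments
// into complete sentences so each one can be sent to TTS as a unit.
function createSentenceChunker() {
  let buffer = '';
  return {
    // Add a fragment; returns the sentences completed by it, if any.
    push(fragment) {
      buffer += fragment;
      const sentences = [];
      // Naive rule: a sentence ends at ., ! or ? followed by whitespace.
      const boundary = /[.!?]+\s+/g;
      let start = 0;
      let match;
      while ((match = boundary.exec(buffer)) !== null) {
        const end = match.index + match[0].length;
        sentences.push(buffer.slice(start, end).trim());
        start = end;
      }
      // Keep the unfinished tail for the next fragment.
      buffer = buffer.slice(start);
      return sentences;
    },
    // Call once the LLM stream ends to get any remaining text.
    flush() {
      const rest = buffer.trim();
      buffer = '';
      return rest ? [rest] : [];
    }
  };
}

// Usage: fragments as they might arrive from an LLM stream.
const chunker = createSentenceChunker();
for (const fragment of ['Your code is A1', 'B2. Call 555', '-0100 now.']) {
  for (const sentence of chunker.push(fragment)) {
    console.log(sentence); // each complete sentence is ready for TTS
  }
}
for (const sentence of chunker.flush()) {
  console.log(sentence);
}
```

Because the rule requires whitespace after the punctuation mark, a version string such as "3.0" stays in one piece, but abbreviations like "e.g. " would still be split. A production setup would likely need a more careful boundary rule.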

Generate spoken audio from input text


// Replace the placeholders with your own credentials.
const options = {
  method: 'POST',
  headers: {
    AUTHORIZATION: '<api-key>',
    'X-USER-ID': '<user-id>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'Play3.0-mini',
    text: 'Hello! Said the realistic voice.',
    voice: 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json',
    quality: 'draft', outputFormat: 'mp3', speed: 1, sampleRate: 24000,
    seed: null, temperature: null, voiceGuidance: null, styleGuidance: null,
    textGuidance: 1, language: 'english'
  })
};

// The stream endpoint is expected to return audio bytes rather than JSON, so read the body as binary.
fetch('https://api.play.ai/api/v1/tts/stream', options)
  .then(response => response.arrayBuffer())
  .then(audio => console.log(`Received ${audio.byteLength} bytes of audio`))
  .catch(err => console.error(err));
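
For repeated calls, the request above can be wrapped in a small reusable function. The following is a minimal sketch rather than an official client: `buildTtsRequest` and `synthesize` are hypothetical names, the separate `userId` parameter assumes that `X-USER-ID` takes a user identifier distinct from the API key, and only a subset of the body fields from the example is sent. The endpoint, header names, and model name are taken from the request above.

```javascript
// Endpoint taken from the example above.
const TTS_STREAM_URL = 'https://api.play.ai/api/v1/tts/stream';

// Hypothetical helper: builds the fetch options for one TTS request.
function buildTtsRequest({ apiKey, userId, text, voice, outputFormat = 'mp3' }) {
  if (!text || !text.trim()) {
    throw new Error('text must be a non-empty string');
  }
  return {
    method: 'POST',
    headers: {
      AUTHORIZATION: apiKey,
      'X-USER-ID': userId, // assumption: a user ID, not the API key
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'Play3.0-mini',
      text,
      voice,
      outputFormat,
      language: 'english'
    })
  };
}

// Hypothetical helper: sends the request and returns the audio bytes.
// fetchImpl is injectable so the function can be exercised without a network.
async function synthesize(params, fetchImpl = fetch) {
  const response = await fetchImpl(TTS_STREAM_URL, buildTtsRequest(params));
  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

Checking `response.ok` before reading the body surfaces authentication or validation failures as errors instead of treating an error payload as audio.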

Need it on prem? No problem

PlayAI's models go where you need them, including on-prem for the highest-security applications.

Contact sales

PlayDialog is Enterprise ready

PlayDialog is GDPR, SOC 2 Type II, and ISO 27001 compliant. All models are available on request on cloud platforms or on-prem for the most demanding enterprise applications.

Contact sales