Play 3.0 mini: Built for Real-Time Interaction

Play 3.0 mini is a high-performance, low-latency voice AI model built for use cases that require fast responses and exceptional accuracy on alphanumeric sequences and acronyms.

“PlayAI’s models bring more natural, fluid sounding voices in multiple languages, and are delivered with ultra low latency. Their on-prem offering makes it a natural fit for our application, where data security is crucial.”

Keith Fearon, Head of Growth, 11x

< 150ms latency

40+ languages

Accurate voice cloning

On-prem deployments supported

Hear Play 3.0 mini in action.

Create engaging AI dialogs, podcasts and conversations using our proprietary Contextual Tone Prediction technology that lets the model understand each turn in a conversation and generate speech with the right prosody and emotion.

AI podcast between hosts

Generate entire AI podcasts with any voices

Conversation between characters

Create engaging contextual conversations between multiple characters

Engaging narration

Generate rich dramatic narrative content

Dramatic dialogs for a scene

Prompt and direct to generate dramatic deliveries

It's Accurate

Play 3.0 mini was fine-tuned on a diverse dataset of alphanumeric phrases and supports critical use cases where information such as phone numbers, passport numbers, dates, and currencies can’t be misread.

It's Fast and Cost-Efficient

Play 3.0 mini is heavily optimized for low latency, and its smaller footprint makes it far more cost-efficient than competing models. Host it yourself if you need it even faster.
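
To check latency in your own environment, one option is to time how long the first chunk of the response body takes to arrive. The sketch below is an illustration, not an official tool: `timeToFirstAudio` and its `fetchImpl` parameter are hypothetical names, it assumes a runtime with `fetch`, `performance`, and web streams available (a modern browser or Node 18+), and it measures the first response bytes, which only approximates the first audible audio.

```javascript
// Hypothetical helper: resolves with the milliseconds between sending the
// request and receiving the first non-empty chunk of the response body.
async function timeToFirstAudio(url, options, fetchImpl = fetch) {
  const start = performance.now();
  const response = await fetchImpl(url, options);
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);

  const reader = response.body.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) return null; // stream ended before any audio arrived
    if (value && value.byteLength > 0) {
      const elapsed = performance.now() - start;
      await reader.cancel(); // only the timing was needed, so stop the download
      return elapsed;
    }
  }
}
```

Results depend on network distance and on how the model is hosted, so run the measurement several times from the place your application will actually run.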

It Supports 30+ Languages

Despite its small size, Play 3.0 mini supports 30+ languages, with a deep bench of out-of-the-box voices for the most common ones. Say hola or bonjour really quickly.

Supports High Fidelity Voice Cloning

Want an accurate voice clone for your application? Play 3.0 mini is the industry's most accurate voice cloning model, and it takes as little as 30 seconds.

It's Easy to Code

Play 3.0 mini is easy to use and is available through our API and on platforms like Fal. It also supports WebSockets and streaming from LLMs.

Generate spoken audio from input text

// Request a spoken-audio stream for the given text.
const options = {
  method: 'POST',
  headers: {
    AUTHORIZATION: '<api-key>',
    'X-USER-ID': '<user-id>', // assumed to be your user ID, not the API key
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'Play3.0-mini',
    text: 'Hello! Said the realistic voice.',
    voice: 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json',
    quality: 'draft', outputFormat: 'mp3', speed: 1, sampleRate: 24000,
    seed: null, temperature: null, voiceGuidance: null, styleGuidance: null,
    textGuidance: 1, language: 'english'
  })
};

fetch('https://api.play.ai/api/v1/tts/stream', options)
  // The stream endpoint is expected to return audio bytes rather than JSON.
  .then(response => {
    if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);
    return response.arrayBuffer();
  })
  .then(audio => console.log(`Received ${audio.byteLength} bytes of audio`))
  .catch(err => console.error(err));
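
If you send many requests, it can help to wrap the request setup in a small function. The following is a minimal sketch, not part of any official SDK: `buildTtsRequest` is a hypothetical helper, its `apiKey` and `userId` parameters are assumptions about how you store your credentials, and the field names simply mirror the request above.

```javascript
// Hypothetical helper: builds the fetch arguments for a TTS stream request.
// Field names mirror the request shown above; this is not an official SDK.
function buildTtsRequest({ apiKey, userId, text, voice, ...overrides }) {
  const payload = {
    model: 'Play3.0-mini',
    text,
    voice,
    outputFormat: 'mp3',
    language: 'english',
    ...overrides // e.g. { speed: 1.2 } adds or replaces fields
  };
  return {
    url: 'https://api.play.ai/api/v1/tts/stream',
    options: {
      method: 'POST',
      headers: {
        AUTHORIZATION: apiKey,
        'X-USER-ID': userId,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(payload)
    }
  };
}
```

You would then call `const { url, options } = buildTtsRequest({ apiKey, userId, text, voice })` and pass the result to `fetch(url, options)`. Keeping credentials out of the payload makes it harder to leak them into the request body by accident.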
    
Need it on-prem? No problem.

PlayAI's models go where you need them, including on-prem for the highest-security applications.

PlayDialog is Enterprise-Ready

PlayDialog is GDPR, SOC 2 Type II, and ISO 27001 compliant. All models are available on request on cloud platforms or on-prem for the most demanding enterprise applications.

Key Features

Lifelike voices

Play's TTS voice models lead the industry in voice quality, prosody and intonation.

Low latency

Time to first audio is under 150 ms, and lower still with an on-prem deployment.

Easy to use

Voice AI generation and customization are all supported by easy-to-use APIs.

Accuracy

Play 3.0 mini is fine-tuned to ensure accurate generation of acronyms and numerical sequences (e.g. phone and credit card numbers).

Multilingual

Play 3.0 mini supports 40+ languages.

Security

All models are GDPR, ISO 27001, and SOC 2 Type II compliant. On-prem is also available.

Want to Talk to Our Team?

If you have an enterprise use case in mind, we'd love to hear from you.