Don't just take our word for it: in blind human preference testing, PlayDialog was preferred over the industry's leading model by 3:1.
Our low-latency TTS models have a TTFA (time to first audio) as low as 125ms through our API, and even lower with an on-prem deployment.
Our voice models are fine-tuned to handle complex acronyms and numerical sequences, such as credit card and phone numbers, accurately and with correct pace and intonation.
Our Play 3.0 mini model supports 30 languages, many with multiple male and female voices out of the box.
All voice AI models are easy to use through our APIs and SDKs, and support WebSockets and SIP trunking. Get your voice app up and running in hours, not weeks.
const options = {
  method: 'POST',
  headers: {
    AUTHORIZATION: '<api-key>',
    'X-USER-ID': '<api-key>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'PlayDialog',
    text: `Country Mouse: Welcome to my humble home, cousin!
Town Mouse: Thank you, cousin. It's quite... peaceful here.
Country Mouse: It is indeed. I hope you're hungry.
I've prepared a simple meal of beans, barley, and fresh roots.
Town Mouse: Well, it's... earthy. Do you eat this every day?`,
    voice: 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json',
    voice2: 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json',
    outputFormat: 'mp3',
    speed: 1,
    sampleRate: 44100,
    seed: null,
    temperature: null,
    turnPrefix: 'Country Mouse:',
    turnPrefix2: 'Town Mouse:',
    prompt: '<string>',
    prompt2: '<string>',
    voiceConditioningSeconds: 20,
    voiceConditioningSeconds2: 20,
    language: 'english',
    webHookUrl: '<string>',
  }),
};

fetch('https://api.play.ai/api/v1/tts', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));
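The request above hard-codes the dialogue text and both turn prefixes. As a minimal sketch, the same payload could be assembled from a list of speaker turns so that the text and the prefixes cannot drift apart. The helper names (buildDialogText, buildDialogRequest, submitDialog) are hypothetical and not part of any SDK. The sketch also assumes that the fields it omits (speed, sampleRate, seed and so on) are optional, and that X-USER-ID carries a user identifier distinct from the API key; the example above does not confirm either assumption.

```javascript
// Hypothetical helper: join speaker turns into the "Speaker: line" text
// layout used by the example request.
function buildDialogText(turns) {
  return turns.map(({ speaker, line }) => `${speaker}: ${line}`).join('\n');
}

// Hypothetical helper: build fetch options for a two-speaker request.
// Field names mirror the example request; credentials are placeholders.
function buildDialogRequest({ apiKey, userId, turns, voice, voice2 }) {
  // Derive the two turn prefixes from the turns, in order of first appearance.
  const speakers = [...new Set(turns.map((turn) => turn.speaker))];
  if (speakers.length !== 2) {
    throw new Error(`expected exactly 2 speakers, got ${speakers.length}`);
  }
  return {
    method: 'POST',
    headers: {
      AUTHORIZATION: apiKey,
      'X-USER-ID': userId, // assumption: a user id, not a second API key
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'PlayDialog',
      text: buildDialogText(turns),
      voice,
      voice2,
      turnPrefix: `${speakers[0]}:`,
      turnPrefix2: `${speakers[1]}:`,
      outputFormat: 'mp3',
    }),
  };
}

// Send the request and raise on HTTP errors instead of logging an error body
// as if it were a successful result.
async function submitDialog(request) {
  const response = await fetch('https://api.play.ai/api/v1/tts', request);
  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }
  return response.json();
}
```

A call such as submitDialog(buildDialogRequest({ apiKey, userId, turns, voice, voice2 })) would then replace the inline options object. Checking response.ok matters because fetch only rejects on network failures, not on 4xx or 5xx responses.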
Our platform secures data at rest and in transit, and we are ISO 27001, GDPR, and SOC 2 Type II compliant. We support on-prem deployments for the most demanding applications.
Play's TTS voice models lead the industry in voice quality, prosody and intonation.
Time to first audio as low as 125ms with Play 3.0 mini, and lower with an on-prem deployment.
Voice AI generation and customization are all supported by easy-to-use APIs.
Dialog is fine-tuned to ensure accurate generation of acronyms and numerical sequences (e.g. phone and credit card numbers).
English, Spanish, and Arabic are fully supported; 25+ languages are under development.
All models are GDPR, ISO 27001, and SOC 2 Type II compliant. On-prem deployment is also available.