Bring emotion to your game (or app)

Play's lifelike voices are fluid, humanlike (if you prefer), and emotive. Whether you want bloodcurdling screams or yippees, we've got you covered.

“With PlayAI's expressive and emotional voices, FlipaClip artists can bring original characters to life in no time. This is changing how cartoons are made, giving anyone the power to create high-quality animated films on a mobile device.”

Adri Ofman, COO Visual Blasters

Clone any character, human or not

Clone a voice accurately from as little as ten seconds of recorded speech, then use it in any supported language. Build an army of digital talent.

Talk to sales

Get results in 125ms or less

Our efficient models and lightning-fast inference give you under 125ms time to first audio, and even less if you run our models on-prem.
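If you want to check time to first audio in your own setup, here is a minimal sketch. It assumes the audio arrives as an async-iterable stream of chunks (standard Node.js readable streams are). `measureTimeToFirstChunk` is a hypothetical helper name for illustration, not part of any Play SDK.

```javascript
// Hypothetical helper (not part of the playht package).
// Measures how long it takes for the first chunk to arrive on any
// async-iterable stream, then drains the rest and counts the bytes.
// Relies on the global `performance` object (Node.js 16+ and browsers).
async function measureTimeToFirstChunk(stream, startedAt = performance.now()) {
  let firstChunkMs = null; // stays null if the stream yields nothing
  let totalBytes = 0;

  for await (const chunk of stream) {
    if (firstChunkMs === null) {
      // First chunk seen: record elapsed time since `startedAt`.
      firstChunkMs = performance.now() - startedAt;
    }
    totalBytes += chunk.length;
  }

  return { firstChunkMs, totalBytes };
}
```

To include the request itself in the measurement, capture `performance.now()` just before you start the request and pass it as the second argument. Numbers measured this way include your own network latency, so they will vary by region and connection.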

Supports 30 languages

Our TTS makes localizing speech a breeze: your custom voices carry over to any supported language.

Generate spoken audio from input text


  import * as PlayHT from 'playht';
  import fs from 'fs';

  // Authenticate before making any request.
  // The environment variable names are placeholders; use your own.
  PlayHT.init({
    apiKey: process.env.PLAYHT_API_KEY,
    userId: process.env.PLAYHT_USER_ID,
  });

  // Create a file stream
  const fileStream = fs.createWriteStream('turbo-playht.mp3');

  // Stream audio from text
  const stream = await PlayHT.stream(
    'Stream realistic voices that say what you want!',
    {
      voiceEngine: 'PlayHT2.0-turbo',
      voiceId:
        's3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json',
      outputFormat: 'mp3',
    }
  );

  // Pipe stream into file
  stream.pipe(fileStream);
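
The snippet above pipes the audio stream into a file without waiting for the write to finish or handling errors. One way to do both is sketched below, assuming the SDK stream behaves like a standard Node.js readable stream. `pipeToCompletion` is a hypothetical helper name, not part of the playht package.

```javascript
// Hypothetical helper (not part of the playht package).
// Pipes a readable stream into a writable one and returns a promise that
// resolves when the destination has flushed all data, or rejects if
// either side emits an error.
function pipeToCompletion(source, destination) {
  return new Promise((resolve, reject) => {
    source.on('error', reject);        // e.g. the network request failed
    destination.on('error', reject);   // e.g. the disk is full
    destination.on('finish', resolve); // all data written
    source.pipe(destination);
  });
}
```

With this in place, `await pipeToCompletion(stream, fileStream)` would replace the bare `stream.pipe(fileStream)` call. Node's built-in `pipeline` from `stream/promises` covers the same ground and also cleans up both streams on failure.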

Easy to integrate

Our API is rock solid and easy to use.

It's affordable

Game development is expensive. Our models are efficient, so we keep inference affordable. Call us if you have special requirements.

Key Features

Lifelike voices

Play's TTS voice models lead the industry in voice quality, prosody and intonation.

Low latency

Time to first audio as low as 125ms with Play 3.0 mini, and lower with an on-prem deployment.

Easy to use

Voice generation and customization are all supported by easy-to-use APIs.

Accuracy

Dialog is fine-tuned to ensure accurate generation of acronyms and numerical sequences (e.g. phone and credit card numbers).

Multilingual

English, Spanish, and Arabic are fully supported; 25+ languages are under development.

Security

All models are GDPR, ISO 27001, and SOC 2 Type II compliant. On-prem deployment is also available.

Talk to an expert

Contact sales