Google · Audio

Gemini 3.1 Flash

Fast Gemini text-to-speech with natural-sounding expressive voices.

Gemini 3.1 Flash is a cloud text-to-speech model from Google that converts written text into natural-sounding spoken audio, with a choice of 30 voices. It runs through Replicate using your own API key, from about $0.0003 per second of output audio.

Modality

Audio

Available on

Replicate

Model ID

google/gemini-3.1-flash-tts
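
As a rough illustration of how this model ID might be used with your own Replicate API key, the sketch below builds an HTTP request for Replicate's model predictions endpoint. The input field names "text" and "voice", and the voice value "example-voice", are hypothetical placeholders that this listing does not specify; check the provider documentation for the actual input schema before relying on it.

```python
import json
import os
import urllib.request

MODEL_ID = "google/gemini-3.1-flash-tts"  # model ID from this listing
# Assumed endpoint shape: Replicate's "create a prediction for a model" route.
API_URL = f"https://api.replicate.com/v1/models/{MODEL_ID}/predictions"


def build_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text-to-speech prediction."""
    # ASSUMPTION: "text" and "voice" are hypothetical input field names.
    payload = {"input": {"text": text, "voice": voice}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    token = os.environ.get("REPLICATE_API_TOKEN")
    if token:
        # "example-voice" is a placeholder, not a real voice name.
        request = build_request("Hello from Gemini.", "example-voice", token)
        with urllib.request.urlopen(request) as response:
            print(json.load(response).get("status"))
    else:
        print("Set REPLICATE_API_TOKEN to send the request.")
```

Building the request separately from sending it keeps the sketch easy to inspect without spending credits; nothing is sent unless an API token is present in the environment.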

Specs

Pricing: $0.0003 per second
Type: Text-to-speech
Voices: 30 to choose from
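
To make the per-second price concrete, here is a minimal cost estimate, assuming billing is a flat $0.0003 for each second of output audio. The function name and the flat-rate assumption are illustrative only; the listing says "from about", so actual charges may differ.

```python
PRICE_PER_SECOND_USD = 0.0003  # listed price per second of output audio


def estimate_cost_usd(output_seconds: float) -> float:
    """Return the approximate cost in USD for a clip of the given length."""
    if output_seconds < 0:
        raise ValueError("output_seconds must be non-negative")
    # ASSUMPTION: flat per-second rate with no minimum charge or rounding.
    return output_seconds * PRICE_PER_SECOND_USD


if __name__ == "__main__":
    for seconds in (10, 60, 3600):
        print(f"{seconds:>5} s -> ${estimate_cost_usd(seconds):.4f}")
```

Under that assumption, a one-minute clip comes to roughly $0.018 and an hour of audio to roughly $1.08.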

About the creator

Google

Google and Google DeepMind build the Gemini family of multimodal models, the Imagen and Nano Banana image models, the Lyria music models, and the Veo video models.

deepmind.google ↗
