Workflow for Gemini 2.5 Pro Preview TTS

Alex_Blanc

(@alex_blanc)

Topic starter

Hello,
I’ve been using callin.io for just 2 months.
I want to use the new gemini-2.5-pro-preview-tts model. It’s listed under the Gemini Chat Model.
However, I can’t figure out how to use it. I’ve searched the internet but found only this Gemini documentation (Speech generation | Gemini API | Google AI for Developers).
I asked Gemini, but none of the answers it provided worked.
Could someone help me create this workflow?
I want to convert the text output of an AI Agent node into an audio file and send it to a Google Drive folder.
Thanks

Posted : 26/05/2025 7:28 am

Gallo_AIA

(@gallo_aia)

Hello! Welcome!

Currently, callin.io does not appear to directly support audio generation using the gemini-2.5-pro-preview-tts model. While the model is listed within the Gemini Chat models in callin.io, you're unable to configure the responseModalities parameter necessary for an audio response.

To use the gemini-2.5-pro-preview-tts model for speech synthesis, you'll need to send a direct HTTP request to the Gemini API, with the required parameters set according to the documentation you linked.

Set up an HTTP Request node in callin.io

Method: POST
URL:

 https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-preview-tts:generateContent?key=YOUR_API_KEY

Headers:

Content-Type: application/json

Body (Raw JSON):

{
  "contents": [
    { "parts": [ { "text": "{{ $json.text }}" } ] }
  ],
  "generationConfig": {
    "responseModalities": ["AUDIO"],
    "speechConfig": {
      "voiceConfig": {
        "prebuiltVoiceConfig": { "voiceName": "Kore" }
      }
    }
  }
}

Please substitute {{ $json.text }} with the actual text output from your Agent node.
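
A raw JSON template breaks if the Agent's text contains quotes or line breaks. As a rough sketch, assuming the generateContent request shape from the Gemini speech-generation documentation (responseModalities set to AUDIO plus a prebuilt voice), the body could be built in code instead. The function name buildTtsRequestBody and the voice "Kore" are illustrative choices, not something from this thread.

```javascript
// Build a generateContent request body for a TTS model.
// "Kore" is only an example of a prebuilt voice name (assumption).
function buildTtsRequestBody(text, voiceName = "Kore") {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text must be a non-empty string");
  }
  return {
    contents: [{ parts: [{ text }] }],
    generationConfig: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName } },
      },
    },
  };
}

// JSON.stringify escapes the quotes and the newline that would
// otherwise corrupt a hand-written JSON template.
const body = JSON.stringify(buildTtsRequestBody('She said "hi"\nand left.'));
console.log(body);
```

The resulting string can be sent as the request body in place of the hand-written JSON.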

Decode and save the audio

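The response carries the audio as base64-encoded raw PCM in candidates[0].content.parts[0].inlineData.data. You can add a Function node with the following code:

// Decode the base64 PCM payload into a binary item
return [{
  binary: {
    data: {
      data: Buffer.from($json.candidates[0].content.parts[0].inlineData.data, 'base64'),
      mimeType: 'audio/L16',
      fileName: 'output.pcm'
    }
  }
}];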
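
If the request fails or the model returns no audio, the decode step above throws an unhelpful error. A minimal sketch of a more defensive extraction follows, assuming the response follows the generateContent shape (candidates, content, parts, inlineData). The helper name extractAudio and the sample response are hypothetical and only there for illustration.

```javascript
// Pull the base64 audio payload out of a generateContent-style response.
function extractAudio(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  const part = parts.find((p) => p.inlineData && p.inlineData.data);
  if (!part) {
    throw new Error("No inline audio data found in the response");
  }
  return {
    buffer: Buffer.from(part.inlineData.data, "base64"),
    mimeType: part.inlineData.mimeType || "application/octet-stream",
  };
}

// Hypothetical response for illustration; a real one carries far more data.
const sample = {
  candidates: [{
    content: {
      parts: [{
        inlineData: {
          mimeType: "audio/L16;codec=pcm;rate=24000",
          data: Buffer.from([0, 1, 2, 3]).toString("base64"),
        },
      }],
    },
  }],
};

const audio = extractAudio(sample);
console.log(audio.buffer.length, audio.mimeType);
```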

Upload to Google Drive

Let me know! Cheers

Posted : 26/05/2025 8:16 am

John_Song

(@john_song)

I managed to get it working by implementing the following steps:

(One important detail is that the default output audio format is a .pcm file. This means you'll need to convert it to either WAV or MP3 for usability. If you're self-hosting callin.io, you can achieve this by installing ffmpeg into your Docker container. However, if you're using a cloud-based setup, you might need to utilize an external API service for the conversion.)
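
For WAV specifically, ffmpeg may not be needed: a WAV file is the raw PCM data with a 44-byte header in front. Below is a minimal sketch, assuming the output is 24 kHz, mono, 16-bit little-endian PCM (the format the Gemini speech-generation documentation uses in its examples). The function name pcmToWav is hypothetical. MP3 output would still need an encoder such as ffmpeg.

```javascript
// Wrap raw little-endian PCM samples in a minimal 44-byte WAV (RIFF) header.
// Defaults assume 24 kHz, mono, 16-bit audio; adjust them if your output differs.
function pcmToWav(pcm, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + pcm.length, 4);          // file size minus 8 bytes
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);                      // size of the fmt chunk
  header.writeUInt16LE(1, 20);                       // format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(pcm.length, 40);              // size of the sample data
  return Buffer.concat([header, pcm]);
}

// One second of silence at the assumed format: 24000 samples * 2 bytes.
const wav = pcmToWav(Buffer.alloc(48000));
console.log(wav.length);
```

In a Function node, the returned buffer could replace the raw PCM buffer, with mimeType 'audio/wav' and a .wav file name.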

Posted : 29/05/2025 6:08 am

tytom2003

(@tytom2003)

I tried your workflow, and Google Gemini TTS works for me. Is ffmpeg required to convert the Gemini output to a WAV file?

Posted : 10/06/2025 8:59 am

follow-prince

(@follow-prince)

Thank you very much!

Posted : 13/06/2025 7:42 am

John_Song

(@john_song)

Yes, I believe Gemini only returns audio in .pcm format, so it has to be converted to .wav or .mp3 before use. If you're self-hosting callin.io, I found ffmpeg to be the most straightforward way to do that.

Posted : 13/06/2025 12:23 pm

Sam_Smith

(@sam_smith)

This is fantastic! Just a heads-up: your API key is visible in the HTTP request.
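
One way to keep the key out of the URL, sketched under assumptions: the Gemini API also accepts the key in an x-goog-api-key request header, so the query string can be dropped. The function name buildRequestOptions, the environment variable GEMINI_API_KEY, and the placeholder fallback are all hypothetical.

```javascript
// Build HTTP request options that carry the API key in a header
// rather than in the URL, so shared URLs and screenshots do not leak it.
function buildRequestOptions(apiKey, model = "gemini-2.5-pro-preview-tts") {
  if (!apiKey) {
    throw new Error("Missing API key");
  }
  return {
    method: "POST",
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": apiKey,
    },
  };
}

// GEMINI_API_KEY is an assumed variable name; the fallback is a placeholder.
const options = buildRequestOptions(process.env.GEMINI_API_KEY || "PLACEHOLDER_KEY");
console.log(options.url);
```

In an HTTP Request node, the same effect comes from adding the header and removing ?key=... from the URL, ideally with the key stored as a credential rather than typed into the node.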

Posted : 13/07/2025 10:48 am
