Workflow for Gemini 2.5 Pro Preview TTS

Alex_Blanc
(@alex_blanc)
Posts: 1
New Member
Topic starter
 

Hello,
I’ve been using callin.io for just 2 months.
I want to use the new gemini-2.5-pro-preview-tts model. It’s listed under the Gemini Chat Model.
However, I can’t work out how to use it. I’ve searched the internet and found nothing except this Gemini documentation (Speech generation  |  Gemini API  |  Google AI for Developers).
I asked Gemini, but none of the answers it gave worked.
Could someone help me create this workflow?
I want to convert the text output of an AI Agent node into an audio file and send it to a Google Drive folder.
Thanks

 
Posted : 26/05/2025 7:28 am
Gallo_AIA
(@gallo_aia)
Posts: 22
Eminent Member
 

Hello! Welcome!

Currently, callin.io does not appear to directly support audio generation using the gemini-2.5-pro-preview-tts model. While the model is listed within the Gemini Chat models in callin.io, you're unable to configure the responseModalities parameter necessary for an audio response.

To use the gemini-2.5-pro-preview-tts model for speech synthesis, you'll need to send a direct HTTP request to the Gemini API, with the required parameters set according to the documentation you linked.

Set up an HTTP Request node in callin.io

  • Method: POST
  • URL:
 https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-preview-tts:generateContent
  • Headers:
Content-Type: application/json
x-goog-api-key: YOUR_API_KEY
  • Body (Raw JSON):
{
  "contents": [{ "parts": [{ "text": "{{ $json.text }}" }] }],
  "generationConfig": {
    "responseModalities": ["AUDIO"],
    "speechConfig": {
      "voiceConfig": {
        "prebuiltVoiceConfig": { "voiceName": "Kore" }
      }
    }
  }
}

Please substitute {{ $json.text }} with the actual text output from your Agent node.
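
One thing to watch: dropping {{ $json.text }} straight into raw JSON can break the body if the agent's text contains quotes or line breaks. Below is a minimal sketch of building the request in code instead, assuming the generateContent request shape from the Gemini speech-generation documentation. buildTtsRequest is a hypothetical helper name, and Kore is one of the prebuilt voices listed in that documentation.

```javascript
// Hypothetical helper: builds the URL, headers and body for a Gemini TTS call.
// Field names follow the generateContent request shape in the Gemini docs.
function buildTtsRequest(text, apiKey, voiceName = 'Kore') {
  const model = 'gemini-2.5-pro-preview-tts';
  return {
    method: 'POST',
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': apiKey, // the key travels in a header, not in the URL
    },
    // JSON.stringify escapes quotes and newlines in the agent's text
    body: JSON.stringify({
      contents: [{ parts: [{ text }] }],
      generationConfig: {
        responseModalities: ['AUDIO'],
        speechConfig: {
          voiceConfig: { prebuiltVoiceConfig: { voiceName } },
        },
      },
    }),
  };
}

const req = buildTtsRequest('She said "hello"\nand left.', 'YOUR_API_KEY');
console.log(req.body);
```

Sending the key as a header rather than a query parameter also keeps it out of any URL you paste or screenshot.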

Decode and save the audio

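The response contains the audio as a base64 string at candidates[0].content.parts[0].inlineData.data. Note that it is raw PCM, not MP3. You can add a Function node with the following code to pass it on as binary data:

// The binary property expects a base64 string, so pass the payload through as-is.
const part = $json.candidates[0].content.parts[0].inlineData;
return [{
  json: {},
  binary: {
    data: {
      data: part.data,          // base64-encoded raw PCM
      mimeType: part.mimeType,  // MIME type as reported by the API
      fileName: 'output.pcm'
    }
  }
}];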
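
If the response has an unexpected shape (an error body, or a text part instead of audio), reading a nested field directly throws a confusing error. Here is a minimal sketch of a more defensive extraction, assuming the inlineData layout described in the Gemini documentation. extractAudio is a hypothetical helper, and the "rate=" parameter in the MIME type is an assumption about how the API reports the sample rate.

```javascript
// Hypothetical helper: pulls the base64 audio and its sample rate out of a
// generateContent response. Assumes the documented inlineData layout.
function extractAudio(response) {
  const part = response?.candidates?.[0]?.content?.parts?.[0]?.inlineData;
  if (!part || typeof part.data !== 'string') {
    throw new Error('No inline audio found in the response');
  }
  // Assumed MIME format, e.g. "audio/L16;codec=pcm;rate=24000"
  const match = /rate=(\d+)/.exec(part.mimeType || '');
  return {
    base64: part.data,
    mimeType: part.mimeType || 'application/octet-stream',
    sampleRate: match ? Number(match[1]) : 24000, // fall back to 24 kHz
  };
}

// Usage with a stand-in response object
const sample = {
  candidates: [{ content: { parts: [{ inlineData: {
    data: 'AAABAAIAAwA=',
    mimeType: 'audio/L16;codec=pcm;rate=24000',
  } }] } }],
};
const audio = extractAudio(sample);
console.log(audio.sampleRate, audio.mimeType);
```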

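Upload to Google Drive

Add a Google Drive node after the Function node and upload the binary property (data) to your target folder.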


Let me know! Cheers

 
Posted : 26/05/2025 8:16 am
John_Song
(@john_song)
Posts: 2
New Member
 

I managed to get it working by implementing the following steps:

(One important detail: the default output is raw PCM audio (a .pcm file), so you'll need to convert it to WAV or MP3 before most players can open it. If you're self-hosting callin.io, you can do this by installing ffmpeg in your Docker container. On a cloud-hosted setup, you might need an external API service for the conversion.)
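
For a self-hosted setup, the conversion step might look like the sketch below. It assumes ffmpeg is on the PATH and that the audio is 16-bit little-endian mono PCM at 24 kHz, as in the Gemini documentation's example. ffmpegArgs and convertPcmToWav are hypothetical names.

```javascript
const { execFile } = require('node:child_process');

// Raw PCM has no header, so ffmpeg must be told the sample format (-f s16le),
// sample rate (-ar) and channel count (-ac) before the input file.
function ffmpegArgs(input, output, sampleRate = 24000, channels = 1) {
  return ['-y', '-f', 's16le', '-ar', String(sampleRate), '-ac', String(channels), '-i', input, output];
}

// Runs ffmpeg and resolves with the output path; rejects if ffmpeg fails or is missing.
function convertPcmToWav(input, output) {
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', ffmpegArgs(input, output), (err) => (err ? reject(err) : resolve(output)));
  });
}

// Print the equivalent shell command
console.log(['ffmpeg', ...ffmpegArgs('out.pcm', 'out.wav')].join(' '));
// prints: ffmpeg -y -f s16le -ar 24000 -ac 1 -i out.pcm out.wav
```

Changing the output extension to .mp3 lets ffmpeg pick an MP3 encoder instead, provided the build in your container includes one.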

 
Posted : 29/05/2025 6:08 am
tytom2003
(@tytom2003)
Posts: 1
New Member
 

I tried your workflow and Google Gemini TTS works for me. Does Gemini's output have to be converted to a WAV file with ffmpeg?

 
Posted : 10/06/2025 8:59 am
follow-prince
(@follow-prince)
Posts: 1
New Member
 

Thank you very much!

 
Posted : 13/06/2025 7:42 am
John_Song
(@john_song)
Posts: 2
New Member
 

Yes, I believe Gemini only returns audio in raw .pcm format, so it has to be converted to .wav or .mp3 before you can use it. If you're self-hosting callin.io, I found ffmpeg to be the most straightforward way to do that.
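
For WAV output specifically, there may be a way around ffmpeg: a WAV file is the raw PCM samples preceded by a 44-byte RIFF header, so the header can be written in a Function node. A minimal sketch, assuming 16-bit mono PCM at 24 kHz (the format used in the Gemini documentation's example). pcmToWav is a hypothetical helper. MP3 would still need an encoder such as ffmpeg.

```javascript
// Hypothetical helper: wraps raw little-endian PCM in a canonical 44-byte WAV header.
function pcmToWav(pcm, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = channels * (bitsPerSample / 8);
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);          // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                      // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);                       // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // byte rate
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);              // size of the sample data
  return Buffer.concat([header, pcm]);
}

// Usage: decode the base64 payload, wrap it, and re-encode for a binary property.
const pcm = Buffer.from('AAABAAIAAwA=', 'base64'); // 8 bytes of sample data for illustration
const wavBase64 = pcmToWav(pcm).toString('base64');
console.log(pcmToWav(pcm).length); // 52
```

The resulting base64 string can go into the binary property with mimeType 'audio/wav' and a .wav file name.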

 
Posted : 13/06/2025 12:23 pm
Sam_Smith
(@sam_smith)
Posts: 1
New Member
 

This is fantastic! Just a heads-up: your API key is visible in the HTTP request.

 
Posted : 13/07/2025 10:48 am