
How to manage OpenAI API rate limits with callin.io?

4 Posts
4 Users
0 Reactions
4 Views
pachocastillosr
(@pachocastillosr)
Posts: 1
New Member
Topic starter
 

Hi,

OpenAI’s API has rate limits.

For chat completions during the free trial, the API rate limits are currently:

  1. Requests per minute (RPM): 3.
  2. Tokens per minute (TPM): 40,000.
    The number of tokens consumed is reported in each API response (see the image below and the sketch after it).

[Image: OpenAI API response showing the token usage fields]
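For reference, this is roughly how that token usage comes back when calling the API directly from Python (a minimal sketch with the pre-1.0 openai package; the product prompt is made up):

```python
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short description for: red ceramic mug"}],
)

# Every response carries a usage object; these counts draw down the TPM budget
usage = response["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```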

My use case involves iterating through a list of products. For each product, I send it to the OpenAI API to get a description from GPT.

Each product requires 4 separate requests to OpenAI (4 requests per loop iteration). At 3 requests per minute, a single product already exceeds a minute's request budget, so I keep hitting rate-limit errors.

I've looked into how this would be handled in Bubble and Xano, and both seem to require a rather complex queuing system, making it feel like I'm constantly battling the API.


I'm curious to know how this would be managed in callin.io, to see if I can avoid these API conflicts.

How can one configure callin.io to respect both the requests per minute and tokens per minute rate limits?

Any assistance would be greatly appreciated.

Note: The OpenAI API may also be called by other callin.io workflows besides the product-description one, so they all consume requests and tokens from the same API key's rate limit. Ideally, calls could run in parallel (not one by one) whenever there's spare capacity within the limit (roughly the behaviour sketched below).
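To make the parallel-calls point concrete, this is roughly the behaviour I'm after (a hypothetical Python sketch, not callin.io functionality; it only budgets requests per minute, and the token budget would need the same treatment):

```python
import asyncio

RPM = 3  # free-trial requests-per-minute limit from above

async def describe_product(product, budget):
    # Take one of the RPM slots and hold it for the rest of the minute,
    # so no more than RPM calls ever start within any ~60-second window.
    async with budget:
        print(f"calling OpenAI for {product}")  # the real API call would go here
        await asyncio.sleep(60)

async def main(products):
    budget = asyncio.Semaphore(RPM)
    # All products run in parallel, but the semaphore caps them at the
    # request budget instead of forcing strictly serial calls.
    await asyncio.gather(*(describe_product(p, budget) for p in products))

asyncio.run(main(["red mug", "blue lamp", "desk fan", "oak shelf"]))
```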

Any guidance would be helpful!

 
Posted : 20/04/2023 3:29 pm
sirdavidoff
(@sirdavidoff)
Posts: 13
Active Member
 

Hi, welcome to the community!

I’m afraid callin.io doesn’t currently have any functionality to track request rates across different workflows. You could build this yourself by storing a counter of the requests/tokens you’ve used in the past minute and checking it before making requests, but you’d have to keep that counter in a database that every workflow can reach (see the sketch below).
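For example, a rough sketch of that counter using Redis as the shared store (the key names and token estimate are illustrative, not anything built into callin.io):

```python
import time
import redis

r = redis.Redis()  # a store reachable from all workflows

def try_acquire(max_rpm=3, max_tpm=40_000, estimated_tokens=1_000):
    """Return True if the current minute window still has capacity."""
    window = int(time.time() // 60)  # minute bucket shared by every workflow
    req_key, tok_key = f"oai:req:{window}", f"oai:tok:{window}"
    pipe = r.pipeline()
    pipe.incr(req_key)
    pipe.incrby(tok_key, estimated_tokens)
    pipe.expire(req_key, 120)
    pipe.expire(tok_key, 120)
    requests, tokens = pipe.execute()[:2]
    if requests > max_rpm or tokens > max_tpm:
        # Over budget: give the capacity back and tell the caller to wait
        r.decr(req_key)
        r.decrby(tok_key, estimated_tokens)
        return False
    return True

# Before each OpenAI call:
while not try_acquire(estimated_tokens=1_200):
    time.sleep(5)  # back off until the next minute window opens up
```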

 
Posted : 21/04/2023 10:21 am
mxeise
(@mxeise)
Posts: 1
New Member
 

Hi, any guidance on how to apply that to an AI Writer Node? I’ve set the retry timeout to 5000 ms (which seems to be the maximum), but I still hit the rate limit.

As I have a "Delegate to Writers" node before this one, I don’t think I can build in the logic you’ve described, or is that possible?

Thanks for your help 🙂

 
Posted : 15/10/2024 3:30 pm
pemontto
(@pemontto)
Posts: 1
New Member
 

Ideally, the AI nodes themselves would manage rate limiting and backoff. For OpenAI, that means inspecting the response headers, as detailed here: https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers

For example, before each request you’d check that both x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens are greater than 0; when either is exhausted, the x-ratelimit-reset-requests and x-ratelimit-reset-tokens headers tell you how long to wait until that budget replenishes.
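As a rough illustration, here’s what that header-driven backoff could look like in plain Python with the requests library (the endpoint and header names come from the OpenAI docs linked above; the 5-second fallback wait is an arbitrary choice):

```python
import re
import time
import requests

RESET_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}

def reset_seconds(header_value):
    """Parse reset headers like '6m12s' or '250ms' into seconds (best effort)."""
    return sum(float(n) * UNIT_SECONDS[u] for n, u in RESET_RE.findall(header_value or ""))

def chat_with_backoff(payload, api_key):
    while True:
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload,
        )
        if resp.status_code != 429:
            return resp.json()
        # 429: wait out whichever budget (requests or tokens) is exhausted
        wait = max(
            reset_seconds(resp.headers.get("x-ratelimit-reset-requests")),
            reset_seconds(resp.headers.get("x-ratelimit-reset-tokens")),
        )
        time.sleep(wait or 5)  # fall back to 5 s if the headers are missing
```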

 
Posted : 16/10/2024 6:07 pm