
How to manage OpenAI API rate limits with callin.io?

4 Posts
4 Users
0 Reactions
4 Views
pachocastillosr
(@pachocastillosr)
Posts: 1
New Member
Topic starter
 

Hi,

OpenAI’s API has rate limits.

For chat completions during the free trial, the API rate limits are currently:

  1. Requests per minute (RPM): 3.
  2. Tokens per minute (TPM): 40,000.
    The number of tokens consumed is reported in each API response (see the image below and the sketch after it).

[Image: OpenAI API response showing the token usage fields]
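For reference, this is roughly how that token usage comes back when calling the API directly from Python (a minimal sketch with the pre-1.0 openai package; the product prompt is made up):

```python
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short description for: red ceramic mug"}],
)

# Every response carries a usage object; these counts draw down the TPM budget
usage = response["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```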

My use case involves iterating through a list of products. For each product, I send it to the OpenAI API to get a description from GPT.

Each product requires 4 separate requests to OpenAI (4 requests per loop iteration). At 3 requests per minute, a single product already exceeds a minute's request budget, so I keep hitting rate-limit errors.

I've looked into how this would be handled in Bubble and Xano, and both seem to require a rather complex queuing system, making it feel like I'm constantly battling the API.


I'm curious to know how this would be managed in callin.io, to see if I can avoid these API conflicts.

How can one configure callin.io to respect both the requests per minute and tokens per minute rate limits?

Any assistance would be greatly appreciated.

Note: The OpenAI API may also be called by other callin.io workflows besides the product-description one, so they all consume requests and tokens from the same API key's rate limit. Ideally, calls could run in parallel (not one by one) whenever there's spare capacity within the limit (roughly the behaviour sketched below).
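To make the parallel-calls point concrete, this is roughly the behaviour I'm after (a hypothetical Python sketch, not callin.io functionality; it only budgets requests per minute, and the token budget would need the same treatment):

```python
import asyncio

RPM = 3  # free-trial requests-per-minute limit from above

async def describe_product(product, budget):
    # Take one of the RPM slots and hold it for the rest of the minute,
    # so no more than RPM calls ever start within any ~60-second window.
    async with budget:
        print(f"calling OpenAI for {product}")  # the real API call would go here
        await asyncio.sleep(60)

async def main(products):
    budget = asyncio.Semaphore(RPM)
    # All products run in parallel, but the semaphore caps them at the
    # request budget instead of forcing strictly serial calls.
    await asyncio.gather(*(describe_product(p, budget) for p in products))

asyncio.run(main(["red mug", "blue lamp", "desk fan", "oak shelf"]))
```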

Any guidance would be helpful!

 
Posted : 20/04/2023 3:29 pm
sirdavidoff
(@sirdavidoff)
Posts: 13
Active Member
 

Hi, welcome to the community!

I’m afraid callin.io doesn’t currently have any functionality to track request rates across different workflows. You could build this yourself by storing a counter of the requests/tokens you’ve used in the past minute and checking it before making requests, but you’d have to keep that counter in a database that every workflow can reach (see the sketch below).
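For example, a rough sketch of that counter using Redis as the shared store (the key names and token estimate are illustrative, not anything built into callin.io):

```python
import time
import redis

r = redis.Redis()  # a store reachable from all workflows

def try_acquire(max_rpm=3, max_tpm=40_000, estimated_tokens=1_000):
    """Return True if the current minute window still has capacity."""
    window = int(time.time() // 60)  # minute bucket shared by every workflow
    req_key, tok_key = f"oai:req:{window}", f"oai:tok:{window}"
    pipe = r.pipeline()
    pipe.incr(req_key)
    pipe.incrby(tok_key, estimated_tokens)
    pipe.expire(req_key, 120)
    pipe.expire(tok_key, 120)
    requests, tokens = pipe.execute()[:2]
    if requests > max_rpm or tokens > max_tpm:
        # Over budget: give the capacity back and tell the caller to wait
        r.decr(req_key)
        r.decrby(tok_key, estimated_tokens)
        return False
    return True

# Before each OpenAI call:
while not try_acquire(estimated_tokens=1_200):
    time.sleep(5)  # back off until the next minute window opens up
```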

 
Posted : 21/04/2023 10:21 am
mxeise
(@mxeise)
Posts: 1
New Member
 

Hi, any guidance on how to apply that to an AI Writer Node? I’ve set the retry timeout to 5000 ms (which seems to be the maximum), but I still hit the rate limit.

As I have a "Delegate to Writers" node before this one, I don’t think I can build in the logic you’ve described, or is that possible?

Thanks for your help 🙂

 
Posted : 15/10/2024 3:30 pm
pemontto
(@pemontto)
Posts: 1
New Member
 

Ideally, the AI nodes themselves would manage rate limiting and backoff. For OpenAI, that means inspecting the response headers, as detailed here: https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers

For example, before each request you’d check that both x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens are greater than 0; when either is exhausted, the x-ratelimit-reset-requests and x-ratelimit-reset-tokens headers tell you how long to wait until that budget replenishes.
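As a rough illustration, here’s what that header-driven backoff could look like in plain Python with the requests library (the endpoint and header names come from the OpenAI docs linked above; the 5-second fallback wait is an arbitrary choice):

```python
import re
import time
import requests

RESET_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}

def reset_seconds(header_value):
    """Parse reset headers like '6m12s' or '250ms' into seconds (best effort)."""
    return sum(float(n) * UNIT_SECONDS[u] for n, u in RESET_RE.findall(header_value or ""))

def chat_with_backoff(payload, api_key):
    while True:
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload,
        )
        if resp.status_code != 429:
            return resp.json()
        # 429: wait out whichever budget (requests or tokens) is exhausted
        wait = max(
            reset_seconds(resp.headers.get("x-ratelimit-reset-requests")),
            reset_seconds(resp.headers.get("x-ratelimit-reset-tokens")),
        )
        time.sleep(wait or 5)  # fall back to 5 s if the headers are missing
```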

 
Posted : 16/10/2024 6:07 pm