I have an Airtable database that receives data through a form. When a new record is added via the form, an automation is triggered and sends data to a webhook in callin.io. The callin.io scenario incorporates a DALL-E module to generate an image.
The challenge stems from the OpenAI API's rate limits (my current limit is 7 images per minute).
When I experience a surge of inputs, I encounter an error with DALL-E, causing the scenario to shut down. I've addressed this using a "resume" error handler with a placeholder image.
However, now that the scenario continues to run even when an error occurs, I'm getting numerous placeholder images.
How can I manage the queue so that the scenario doesn't execute more than 7 times per minute? Is there a limiter or waitlist feature that restricts how often the scenario runs?
Yes, there is – it’s called the sleep module. You can insert artificial pauses into your scenario to control the execution speed. This module can be positioned almost anywhere, including just before the resume error handler.
Thanks! But if I receive 200 parallel (or nearly parallel) webhook executions, wouldn't the sleep module simply pause all of them for the same duration, leading to the exact same problem but just delaying it slightly?
Hello,
As mentioned, I don't think a Sleep module on its own will fully resolve the issue. If you get a surge of calls, every run pauses for the same duration and they all finish sleeping at around the same time, so the burst still reaches DALL-E together.
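To make that concrete, here is a minimal Python sketch of the effect. The burst size, arrival spacing, and 90-second sleep are made-up numbers for illustration, not anything measured from callin.io.

```python
from collections import Counter

def calls_per_minute(arrival_times, sleep_seconds):
    """Count DALL-E calls per one-minute window when every run
    sleeps for the same fixed duration before calling DALL-E."""
    call_times = [t + sleep_seconds for t in arrival_times]
    return Counter(int(t // 60) for t in call_times)

# Hypothetical burst: 200 webhook requests arriving within two seconds.
burst = [i * 0.01 for i in range(200)]

no_sleep = calls_per_minute(burst, 0)
with_sleep = calls_per_minute(burst, 90)

# The busiest minute is just as busy either way; the sleep only moves it.
print(max(no_sleep.values()), max(with_sleep.values()))
```

The fixed sleep shifts the whole burst into a later minute without spreading it out, which is why the options below focus on queueing or retrying instead.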
callin.io does not offer a way to configure webhook queues. However, here are a few options you can consider:
1. Scheduling the Called Webhook Scenario
By default, the scenario executes immediately upon receiving a new message from the Webhook.
If you schedule your scenario, the associated webhook will still receive the requests, but they will be processed at regular intervals.
And with the Scenario setting “Max number of cycles”, you can define, to some extent, “how many messages” you process at once. If you set it to a maximum acceptable for Dall-E (e.g., 6 or 7), you should be fine.
BUT, you have to be mindful of two things:
- A scheduled scenario consumes operations. If you set the scheduling to 1 minute, it will consume at least one operation per minute, which can add up significantly over a month. You can fine-tune the scheduling to avoid this during certain hours. I would suggest using a longer interval.
- The webhook queue will accumulate all requests. This is limited (based on your callin.io plan), so be cautious. If you don't process messages as quickly as they arrive, the Webhook queue might become full. You can check this here:
Therefore, you might need to adjust the scheduling and the “Max number of cycles” value.
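As a rough way to pick those two values, the arithmetic can be sketched in Python. This assumes each cycle handles one queued request and that every scheduled run processes a full batch; the function name and the numbers are only illustrative.

```python
import math

def drain_minutes(queued_requests, max_cycles, interval_minutes):
    """Approximate time to empty the webhook queue when each scheduled
    run processes at most `max_cycles` requests (assumed: one per cycle)."""
    runs_needed = math.ceil(queued_requests / max_cycles)
    return runs_needed * interval_minutes

# A burst of 200 requests, 7 cycles per run:
every_minute = drain_minutes(200, max_cycles=7, interval_minutes=1)
every_five = drain_minutes(200, max_cycles=7, interval_minutes=5)
print(every_minute, every_five)
```

A longer interval saves operations but stretches the drain time, so the queue limit of your plan decides how far you can push it.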
2. Set Your Scenario to Run Sequentially and Not in Parallel
This can be configured here:
This option ensures that the scenario's runs execute one after another, rather than in parallel. The benefit, when combined with a Sleep module as suggested, is that you stay within the limits of Dall-E as long as the sleep is long enough (at 7 images per minute, that means roughly 9 seconds between calls). HOWEVER, you must be careful, as a high volume of requests could lead to your Webhook queue becoming full.
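For the sleep duration, a small Python sketch of the calculation. It assumes runs are strictly sequential and rounds the delay up to whole seconds to stay on the safe side; the time DALL-E itself takes is ignored, so the real throughput would be somewhat lower.

```python
import math

def min_sleep_seconds(limit_per_minute):
    """Smallest whole-second pause that keeps sequential runs
    at or below `limit_per_minute` calls in any 60-second window."""
    return math.ceil(60 / limit_per_minute)

def backlog_minutes(requests, limit_per_minute):
    """Time spent sleeping to work through a backlog, in minutes."""
    return requests * min_sleep_seconds(limit_per_minute) / 60

sleep_s = min_sleep_seconds(7)
backlog = backlog_minutes(200, 7)
print(sleep_s, backlog)
```

With these numbers a burst of 200 requests takes about half an hour to clear, which is the period during which the Webhook queue has to hold the rest.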
3. Utilize an Error Handler with Sleep, Dall-E, and Resume
This involves allowing Dall-E to fail occasionally and then attempting to retry after a delay.
Here is an illustration:
The second Dall-E is a duplicate of the first, ensuring it performs the same actions. The resume function then captures the result, allowing subsequent steps to proceed as if the initial module had not failed. This can help mitigate errors, but you might want to configure the error handler to trigger only for quota-related issues.
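Outside of callin.io, the same pattern looks roughly like the Python sketch below. `RateLimitError`, `generate_with_retry`, and `fake_generate` are hypothetical names used only for this illustration; they are not part of callin.io or the OpenAI client.

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for a quota or rate-limit failure."""

def generate_with_retry(generate, prompt, delay_seconds=60, sleep=time.sleep):
    """Call generate(prompt); on a rate-limit error, wait and try once more.

    Any other exception propagates untouched, mirroring an error handler
    that is filtered to quota-related issues only.
    """
    try:
        return generate(prompt)
    except RateLimitError:
        sleep(delay_seconds)      # plays the role of the Sleep module
        return generate(prompt)   # plays the role of the duplicate DALL-E module

# Demo with a fake generator that fails once, then succeeds.
attempts = []

def fake_generate(prompt):
    attempts.append(prompt)
    if len(attempts) == 1:
        raise RateLimitError("rate limit reached")
    return f"image for {prompt}"

waits = []  # record the requested delay instead of really sleeping
result = generate_with_retry(fake_generate, "a red fox", sleep=waits.append)
print(result, waits)
```

The caller only sees the final result, which is what the resume step achieves in the scenario. Note that a single retry can still fail if the second attempt also lands in a saturated minute.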
4. Use a Break Error Handler
A Break error handler can store the current bundle execution in a queue, allowing it to resume after a specified number of minutes.
You first need to enable “Allow Storing of Incomplete executions”.
Here, I've configured it to attempt retries three times, with a 2-minute interval between each retry.
This is what the monitoring looks like (when a failure occurs):
When it fails, it moves to “Incomplete executions” and will automatically retry after the specified number of minutes. If you have set “Automatically Complete Execution” to NO in the Break directive, you will need to resume execution manually.
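To get a feel for how far three retries go against a large burst, here is a worst-case Python model. It assumes every failed run retries at the same moment and that only 7 calls succeed per round; real retry timing may be more spread out, so treat the numbers as a pessimistic estimate.

```python
def simulate_break_retries(burst_size, limit_per_minute=7, attempts=3):
    """Worst-case model: all failed runs retry together, and each round
    (the first try plus each retry) lets `limit_per_minute` runs succeed."""
    remaining = burst_size
    succeeded_per_round = []
    for _ in range(1 + attempts):
        ok = min(remaining, limit_per_minute)
        succeeded_per_round.append(ok)
        remaining -= ok
    return succeeded_per_round, remaining

small_rounds, small_left = simulate_break_retries(20)
big_rounds, big_left = simulate_break_retries(200)
print(small_rounds, small_left)
print(big_rounds, big_left)
```

Under this model a moderate burst clears within the three retries, while a very large one leaves runs in “Incomplete executions”, so the retry count and interval may need to be raised for bigger surges.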
This covers several possibilities, and I believe at least one of them should be helpful.
Benjamin
Thank you! I believe we can handle the queue effectively with at least two of your suggestions. I'll experiment with stacking DALL-E modules alongside error handlers. This approach seems like the most suitable way to manage a sudden influx of webhooks that require prompt processing without causing interruptions.
Very helpful, thank you.