How to Build an Automated Web Scraping Workflow with callin.io and Scrapeless

Scrapeless (@scrapeless), topic starter

We've recently introduced an official integration on callin.io, now available as a public application. This guide will walk you through building a robust automated process that leverages our Google Search API alongside WebUnlocker to extract data from search results, process it using Claude AI, and then send it to a webhook.

What We’ll Build

In this tutorial, we will construct a workflow that:

  1. Initiates automatically each day via integrated scheduling

  2. Performs Google searches for specified queries using the Scrapeless Google Search API

  3. Processes each URL individually using the Iterator module

  4. Scrapes each URL with Scrapeless WebUnlocker to retrieve content

  5. Analyzes the content using Anthropic Claude AI

  6. Dispatches the processed data to a webhook (e.g., Discord, Slack, a database)
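The six steps above can be sketched as plain code. This is a hypothetical outline only: the real pipeline is built visually in callin.io, and every function below is a placeholder standing in for a module, not a Scrapeless or Anthropic API call.

```python
# Hypothetical sketch of the workflow logic; each function is a
# placeholder for the corresponding callin.io module.

def google_search(query):
    # Stands in for the Scrapeless "Search Google" module.
    return [{"link": "https://example.com/a"}, {"link": "https://example.com/b"}]

def scrape(url):
    # Stands in for Scrapeless WebUnlocker.
    return f"<html>content of {url}</html>"

def analyze(text):
    # Stands in for the Anthropic Claude module.
    return {"summary": text[:40]}

def send_webhook(payload):
    # Stands in for the HTTP webhook dispatch.
    return 200

def run_workflow(query):
    results = []
    for item in google_search(query):   # Iterator: one result at a time
        content = scrape(item["link"])  # WebUnlocker scrapes the URL
        summary = analyze(content)      # Claude processes the content
        send_webhook(summary)           # result is pushed to the webhook
        results.append(summary)
    return results
```

The key design point this sketch captures is that scraping, analysis, and dispatch all happen inside the per-result loop, so one failing URL does not abort the whole run.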

Prerequisites

  • A callin.io account

  • A Scrapeless API key (obtain one at scrapeless.com)

  • An Anthropic Claude API key

  • A webhook endpoint (e.g., Discord webhook, callin.io, database endpoint)

  • Basic familiarity with callin.io workflows

Complete Workflow Overview

Your final workflow will appear as follows:

Scrapeless Google Search (with integrated scheduling) → Iterator → Scrapeless WebUnlocker → Anthropic Claude → HTTP Webhook

Step 1: Adding Scrapeless Google Search with Integrated Scheduling

We will begin by adding the Scrapeless Google Search module, which includes built-in scheduling capabilities.

  1. Create a new scenario in callin.io

  2. Click the + button to add the initial module

  3. Search for “Scrapeless” in the module library

  4. Select Scrapeless and choose the Search Google action

Configuring Google Search with Scheduling

Connection Setup:

  1. Create a connection by entering your Scrapeless API key

  2. Click “Add” and follow the connection setup steps

Search Parameters:

  • Search Query: Input your target query (e.g., “artificial intelligence news”)

  • Language: en (English)

  • Country: US (United States)

Scheduling Setup:

  1. Click the clock icon on the module to access scheduling options

  2. Run scenario: Choose “At regular intervals”

  3. Minutes: Set to 1440 (for daily execution) or your desired frequency

  4. Advanced scheduling: Utilize “Add item” to specify particular times or days if necessary
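The interval field is expressed in minutes, so the daily value is just hours times minutes. A quick sanity check on that arithmetic:

```python
# The scheduler interval is given in minutes:
# 24 hours * 60 minutes = 1440 -> one run per day.
minutes_per_day = 24 * 60   # 1440, the value used in this tutorial

# Other common intervals for the same field:
hourly = 60
every_12_hours = 12 * 60    # 720
```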

Step 2: Processing Results with Iterator

The Google Search module returns multiple URLs in an array format. We will employ the Iterator module to handle each result individually.

  1. Add an Iterator module following the Google Search module

  2. Configure the Array field to process the search results

Iterator Configuration:

  • Array: {{1.result.organic_results}}

This sets up a loop that processes each search result independently, which improves error handling: a failure on one URL does not stop the remaining results from being scraped and analyzed.
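To make the mapping concrete, here is what the Iterator does with that array, sketched in Python. Only `organic_results` and `link` come from the tutorial's mappings; the rest of the sample data is illustrative.

```python
# Hypothetical shape of the "Search Google" module's output; only the
# "organic_results" array and the "link" field are taken from the
# tutorial's mappings, the sample values are made up.
search_output = {
    "result": {
        "organic_results": [
            {"title": "Result A", "link": "https://example.com/a"},
            {"title": "Result B", "link": "https://example.com/b"},
        ],
    },
}

# The Iterator is the visual equivalent of this loop: it emits one
# bundle per array element, so every downstream module runs once
# per search result.
links = []
for bundle in search_output["result"]["organic_results"]:
    links.append(bundle["link"])   # exposed downstream as {{2.link}}
```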

Step 3: Adding Scrapeless WebUnlocker

Next, we will integrate the WebUnlocker module to scrape content from each URL.

  1. Add another Scrapeless module

  2. Select the Scrape URL (WebUnlocker) action

  3. Utilize the same Scrapeless connection established previously

WebUnlocker Configuration:

  • Connection: Use your existing Scrapeless connection

  • Target URL: {{2.link}} (mapped from the Iterator output)

  • Js Render: Yes

  • Headless: Yes

  • Country: World Wide

  • Js Instructions: `` [{
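For reference, the settings above can be written out as a plain payload. This is a hedged sketch only: the key names mirror the form labels in the module, not the literal Scrapeless API schema, and the URL is a stand-in for the mapped `{{2.link}}` value.

```python
import json

# Illustrative payload; keys mirror the module's form labels and are
# NOT the literal Scrapeless API schema.
webunlocker_config = {
    "url": "https://example.com/a",  # mapped from {{2.link}}
    "js_render": True,               # Js Render: Yes
    "headless": True,                # Headless: Yes
    "country": "worldwide",          # Country: World Wide
}

payload = json.dumps(webunlocker_config)
```

Enabling JS rendering with a headless browser is what lets WebUnlocker return content from pages that build their markup client-side.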

 
Posted : 25/06/2025 10:54 am
system
(@system)
 

This thread was automatically closed 30 days following the last response. New replies are no longer permitted.

 
Posted : 25/07/2025 10:55 am