How to Build an Automated Web Scraping Workflow with callin.io and Scrapeless

Scrapeless (@scrapeless), topic starter

We've recently introduced an official integration on callin.io, now available as a public application. This guide will walk you through building a robust automated process that leverages our Google Search API alongside WebUnlocker to extract data from search results, process it using Claude AI, and then send it to a webhook.

What We’ll Build

In this tutorial, we will construct a workflow that:

  1. Initiates automatically each day via integrated scheduling

  2. Performs Google searches for specified queries using the Scrapeless Google Search API

  3. Processes each URL individually using the Iterator module

  4. Scrapes each URL with Scrapeless WebUnlocker to retrieve content

  5. Analyzes the content using Anthropic Claude AI

  6. Dispatches the processed data to a webhook (e.g., Discord, Slack, a database)
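The six steps above can be sketched as plain code. This is a hypothetical outline only: the real pipeline is built visually in callin.io, and every function below is a placeholder standing in for a module, not a Scrapeless or Anthropic API call.

```python
# Hypothetical sketch of the workflow logic; each function is a
# placeholder for the corresponding callin.io module.

def google_search(query):
    # Stands in for the Scrapeless "Search Google" module.
    return [{"link": "https://example.com/a"}, {"link": "https://example.com/b"}]

def scrape(url):
    # Stands in for Scrapeless WebUnlocker.
    return f"<html>content of {url}</html>"

def analyze(text):
    # Stands in for the Anthropic Claude module.
    return {"summary": text[:40]}

def send_webhook(payload):
    # Stands in for the HTTP webhook dispatch.
    return 200

def run_workflow(query):
    results = []
    for item in google_search(query):   # Iterator: one result at a time
        content = scrape(item["link"])  # WebUnlocker scrapes the URL
        summary = analyze(content)      # Claude processes the content
        send_webhook(summary)           # result is pushed to the webhook
        results.append(summary)
    return results
```

The key design point this sketch captures is that scraping, analysis, and dispatch all happen inside the per-result loop, so one failing URL does not abort the whole run.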

Prerequisites

  • A callin.io account

  • A Scrapeless API key (obtain one at scrapeless.com)

  • An Anthropic Claude API key

  • A webhook endpoint (e.g., Discord webhook, callin.io, database endpoint)

  • Basic familiarity with callin.io workflows

Complete Workflow Overview

Your final workflow will appear as follows:

Scrapeless Google Search (with integrated scheduling) → Iterator → Scrapeless WebUnlocker → Anthropic Claude → HTTP Webhook

Step 1: Adding Scrapeless Google Search with Integrated Scheduling

We will begin by adding the Scrapeless Google Search module, which includes built-in scheduling capabilities.

  1. Create a new scenario in callin.io

  2. Click the + button to add the initial module

  3. Search for “Scrapeless” in the module library

  4. Select Scrapeless and choose the Search Google action

Configuring Google Search with Scheduling

Connection Setup:

  1. Create a connection by entering your Scrapeless API key

  2. Click “Add” and follow the connection setup steps

Search Parameters:

  • Search Query: Input your target query (e.g., “artificial intelligence news”)

  • Language: en (English)

  • Country: US (United States)

Scheduling Setup:

  1. Click the clock icon on the module to access scheduling options

  2. Run scenario: Choose “At regular intervals”

  3. Minutes: Set to 1440 (for daily execution) or your desired frequency

  4. Advanced scheduling: Utilize “Add item” to specify particular times or days if necessary
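The interval field is expressed in minutes, so the daily value is just hours times minutes. A quick sanity check on that arithmetic:

```python
# The scheduler interval is given in minutes:
# 24 hours * 60 minutes = 1440 -> one run per day.
minutes_per_day = 24 * 60   # 1440, the value used in this tutorial

# Other common intervals for the same field:
hourly = 60
every_12_hours = 12 * 60    # 720
```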

Step 2: Processing Results with Iterator

The Google Search module returns multiple URLs in an array format. We will employ the Iterator module to handle each result individually.

  1. Add an Iterator module following the Google Search module

  2. Configure the Array field to process the search results

Iterator Configuration:

  • Array: {{1.result.organic_results}}

This sets up a loop that processes each search result independently, which improves error handling: a failure on one URL does not stop the remaining results from being scraped and analyzed.
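To make the mapping concrete, here is what the Iterator does with that array, sketched in Python. Only `organic_results` and `link` come from the tutorial's mappings; the rest of the sample data is illustrative.

```python
# Hypothetical shape of the "Search Google" module's output; only the
# "organic_results" array and the "link" field are taken from the
# tutorial's mappings, the sample values are made up.
search_output = {
    "result": {
        "organic_results": [
            {"title": "Result A", "link": "https://example.com/a"},
            {"title": "Result B", "link": "https://example.com/b"},
        ],
    },
}

# The Iterator is the visual equivalent of this loop: it emits one
# bundle per array element, so every downstream module runs once
# per search result.
links = []
for bundle in search_output["result"]["organic_results"]:
    links.append(bundle["link"])   # exposed downstream as {{2.link}}
```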

Step 3: Adding Scrapeless WebUnlocker

Next, we will integrate the WebUnlocker module to scrape content from each URL.

  1. Add another Scrapeless module

  2. Select the Scrape URL (WebUnlocker) action

  3. Utilize the same Scrapeless connection established previously

WebUnlocker Configuration:

  • Connection: Use your existing Scrapeless connection

  • Target URL: {{2.link}} (mapped from the Iterator output)

  • Js Render: Yes

  • Headless: Yes

  • Country: World Wide

  • Js Instructions: `` [{
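For reference, the settings above can be written out as a plain payload. This is a hedged sketch only: the key names mirror the form labels in the module, not the literal Scrapeless API schema, and the URL is a stand-in for the mapped `{{2.link}}` value.

```python
import json

# Illustrative payload; keys mirror the module's form labels and are
# NOT the literal Scrapeless API schema.
webunlocker_config = {
    "url": "https://example.com/a",  # mapped from {{2.link}}
    "js_render": True,               # Js Render: Yes
    "headless": True,                # Headless: Yes
    "country": "worldwide",          # Country: World Wide
}

payload = json.dumps(webunlocker_config)
```

Enabling JS rendering with a headless browser is what lets WebUnlocker return content from pages that build their markup client-side.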

 
Posted : 25/06/2025 10:54 am
system
(@system)
 

This thread was automatically closed 30 days following the last response. New replies are no longer permitted.

 
Posted : 25/07/2025 10:55 am