OpenClaw skill for scraping any URL using the Decodo Web Scraping API.

Decodo/decodo-openclaw-skill

Decodo Scraper OpenClaw Skill

Overview

This OpenClaw skill integrates Decodo's Web Scraping API into any OpenClaw-compatible AI agent or LLM pipeline. It exposes seven tools that agents can call directly:

  • google_search – query Google Search and receive structured JSON (organic results, AI overviews, paid, related questions, and more)
  • universal – fetch and parse any public webpage, returning clean Markdown
  • amazon – fetch parsed Amazon product-page data (e.g. ads, product info) by product URL
  • amazon_search – search Amazon by query; get parsed results (e.g. results list, delivery_postcode)
  • youtube_subtitles – fetch subtitles/transcript for a YouTube video (by video ID)
  • reddit_post – fetch a Reddit post’s content (by post URL)
  • reddit_subreddit – fetch a Reddit subreddit listing (by subreddit URL)

Backed by Decodo's residential and datacenter proxy infrastructure, the skill handles JavaScript rendering, bot detection bypass, and geo-targeting out of the box.

Features

  • Real-time Google Search results scraping
  • Universal URL scraping
  • Amazon product page parsing (by URL)
  • Amazon search (by query)
  • YouTube subtitles/transcript by video ID
  • Reddit post content by URL
  • Reddit subreddit listing by URL
  • Structured JSON or Markdown results
  • Simple CLI interface compatible with any OpenClaw agent runtime
  • Minimal dependencies — just Python with Requests
  • Authentication via a single Base64 token from the Decodo dashboard

Prerequisites

Setup

  1. Clone this repo.
git clone https://github.com/Decodo/decodo-openclaw-skill.git
  2. Install dependencies.
pip install -r requirements.txt
  3. Set your Decodo auth token as an environment variable (or create a .env file in the project root):
# Terminal
export DECODO_AUTH_TOKEN="your_base64_token"
# .env file
DECODO_AUTH_TOKEN=your_base64_token
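
If you call the Decodo API from your own Python code, the same token lookup can be sketched as follows. This is a minimal illustration, not the logic used by tools/scrape.py (which is not shown here): the function name `load_auth_token` and its parameters are hypothetical, and the parser only handles simple `KEY=value` lines like the one above.

```python
import os
from pathlib import Path


def load_auth_token(env_path=".env", environ=None):
    """Return DECODO_AUTH_TOKEN from the environment, else from a .env file.

    Hypothetical helper for illustration; not part of the skill itself.
    """
    environ = os.environ if environ is None else environ
    token = environ.get("DECODO_AUTH_TOKEN")
    if token:
        return token

    path = Path(env_path)
    if path.is_file():
        for raw_line in path.read_text(encoding="utf-8").splitlines():
            line = raw_line.strip()
            # Skip blank lines, comments such as "# .env file", and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "DECODO_AUTH_TOKEN":
                # Drop optional surrounding quotes.
                return value.strip().strip("\"'")

    raise RuntimeError("DECODO_AUTH_TOKEN is not set and was not found in .env")
```

The environment variable takes precedence here, so a shell `export` can override a token stored in the .env file. That ordering is a design choice of this sketch.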

OpenClaw agent integration

This skill ships with a SKILL.md file that defines all tools in the OpenClaw skill format. OpenClaw-compatible agents automatically discover and invoke the tools from this file without additional configuration.

To register the skill with your OpenClaw agent, point it at the repo root — the agent will read SKILL.md and expose google_search, universal, amazon, amazon_search, youtube_subtitles, reddit_post, and reddit_subreddit as callable tools.

Usage

Google Search

Search Google and receive structured JSON. Results are grouped by type: organic (main results), ai_overviews (AI-generated summaries), paid (ads), related_questions, related_searches, discussions_and_forums, and others depending on the query.

python3 tools/scrape.py --target google_search --query "your query"

Scrape a URL

Fetch any webpage and convert it to clean Markdown:

python3 tools/scrape.py --target universal --url "https://example.com/article"

Amazon product page

Fetch parsed data from an Amazon product page (e.g. ads, product details). Use the product URL:

python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW"

Amazon search

Search Amazon and get parsed results (e.g. results list, delivery_postcode):

python3 tools/scrape.py --target amazon_search --query "laptop"

YouTube subtitles

Fetch subtitles/transcript for a YouTube video (use the video ID, e.g. from ?v=VIDEO_ID):

python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg"
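
Because this tool takes a video ID rather than a URL, a caller that starts from a full link needs to extract the ID first. A minimal sketch using only the standard library is below. The helper name `youtube_video_id` is hypothetical, and it assumes only the common `watch?v=` and `youtu.be/` link shapes.

```python
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url_or_id):
    """Return the video ID from a YouTube URL, or the input if it has no URL scheme.

    Hypothetical helper; handles only watch?v=... and youtu.be/... links.
    """
    parsed = urlparse(url_or_id)
    if not parsed.scheme:
        # No scheme: assume the caller already passed a bare video ID.
        return url_or_id

    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""

    if not video_id:
        raise ValueError(f"could not find a video ID in {url_or_id!r}")
    return video_id
```

The returned string can then be passed as the `--query` value shown above.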

Reddit post

Fetch a Reddit post’s content (use the full post URL):

python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/serious_next_day_thread_postgame_discussion/"

Reddit subreddit

Fetch a Reddit subreddit listing (use the subreddit URL):

python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/"
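
All of the commands above share one shape: a `--target` plus either `--query` or `--url`. For callers that want to drive the CLI from Python, that pattern can be sketched as below. The function names `build_command` and `run_tool` are hypothetical. The sketch also assumes that the script prints its result to standard output and exits with a non-zero status on failure, which this README does not confirm.

```python
import subprocess
import sys

# Flag each target takes, as shown in the usage examples in this README.
TARGET_FLAGS = {
    "google_search": "--query",
    "universal": "--url",
    "amazon": "--url",
    "amazon_search": "--query",
    "youtube_subtitles": "--query",
    "reddit_post": "--url",
    "reddit_subreddit": "--url",
}


def build_command(target, value, script="tools/scrape.py"):
    """Build the argument list for one CLI call (hypothetical helper)."""
    if target not in TARGET_FLAGS:
        raise ValueError(f"unknown target: {target!r}")
    return [sys.executable, script, "--target", target, TARGET_FLAGS[target], value]


def run_tool(target, value):
    """Run the CLI from the repo root and return whatever it wrote to stdout.

    Requires DECODO_AUTH_TOKEN and network access. Assumes output goes to
    stdout and that failures produce a non-zero exit status.
    """
    result = subprocess.run(
        build_command(target, value),
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `run_tool("universal", "https://example.com/article")` would run the same command as the "Scrape a URL" example. Passing the arguments as a list avoids shell quoting problems with URLs and multi-word queries.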

Related resources

Decodo Web Scraping API documentation

OpenClaw documentation

ClawHub – OpenClaw skill registry

License

All code is released under the MIT License.
