web-scraping

Here are 3,343 public repositories matching this topic...

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

python crawler framework scraping crawling web-scraping hacktoberfest web-scraping-python

Updated Jul 10, 2025
Python

dgtlmoon / changedetection.io

The best and simplest tool for website change detection, web page monitoring, and website change alerts. Well suited to tracking content changes, price drops, restock alerts, and website defacement monitoring. Use it for free or on the SaaS plan.

Updated Jul 15, 2025
Python

ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI

markdown crawler ai html-to-markdown web-crawler scraping web-scraping rag automated-scraper scraping-python web-crawlers llm ai-scraping

Updated Jul 3, 2025
Python

Evil0ctal / Douyin_TikTok_Download_API

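🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.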

Updated Mar 23, 2025
Python

seleniumbase / SeleniumBase

Python APIs for web automation, testing, and bypassing bot-detection.

Updated Jul 17, 2025
Python

mherrmann / helium

Lighter web automation with Python

python firefox chrome webdriver selenium python3 web-scraping helium web-automation selenium-python

Updated Apr 28, 2025
Python

alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

python crawler machine-learning scraper automation ai scraping artificial-intelligence web-scraping scrape webscraping webautomation

Updated Jun 9, 2025
Python

D4Vinci / Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

Updated Jul 10, 2025
Python

apify / crawlee-python

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP, in both headful and headless modes, with proxy rotation.

python crawler scraper automation web-crawler headless scraping crawling pip web-scraping beautifulsoup web-crawling hacktoberfest headless-chrome apify playwright

Updated Jul 17, 2025
Python

adbar / trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Updated May 30, 2025
Python

lexiforest / curl_cffi

Python binding for a curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.

http curl https http-client web-scraping fingerprinting ja3 tls-fingerprint curl-impersonate ja3-fingerprint http2-fingerprint akamai-fingerprint

Updated Jul 17, 2025
Python

snooppr / snoop

Snoop: a reconnaissance tool based on open data (OSINT world)

Updated Jul 13, 2025
Python

lorien / grab

Web Scraping Framework

python crawler framework spider asynchronous network python-library scraping crawling http-client python3 web-scraping pycurl urllib3

Updated Mar 12, 2024
Python

oxylabs / amazon-scraper

Free-trial Amazon Scraper API for extracting search, product, offer listing, reviews, questions and answers, best sellers, and sellers data.

Updated Jun 26, 2025
Python

vprusso / youtube_tutorials

Collection of scripts corresponding to LucidProgramming YouTube tutorials

python python3 web-scraping youtube-tutorial python-tutorial ctci-solutions lucidprogramming python3-tutorial technical-interview

Updated Oct 26, 2022
Python

tinyfish-io / agentql

AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes REST API, Python and JavaScript SDKs, browser debugger.

javascript python agent automation web ai scraping web-scraping web-scrapping rpa playwright web-scraping-python web-scraping-javascript aiagent webagent web-scraping-colabs

Updated Jun 25, 2025
Python

kaliiiiiiiiii / Selenium-Driverless

A stealthy browser automation framework

python testing automation webdriver reverse-engineering python3 web-scraping vulnerability-research scraping-python detection-evasion driverless-chrome

Updated Apr 25, 2025
Python

je-suis-tm / web-scraping

Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist

Updated Feb 1, 2022
Python

Kaliiiiiiiiii-Vinyzu / patchright-python

Undetected Python version of the Playwright testing and automation library.

bot chrome automation webdriver browser bots chromium cloudflare web-scraping chromedriver stealth webscraping botting web-automation webautomation undetected undetectable playwright cloudflare-by

Updated Jun 6, 2025
Python

alecxe / scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent

python web-scraping scrapy

Updated Sep 18, 2023
Python
