AgentStack
Trafilatura

Free
6.0k GitHub stars
Agent Tool, Agnostic, Web Scraper

Overview

Trafilatura is a Python package and command-line tool for gathering text and metadata from the web through crawling and scraping. It suits researchers and developers who need to extract web content and export it in formats such as CSV, JSON, and HTML.
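
As a rough illustration of the Python side, the sketch below downloads a single page and returns its text and metadata as a dictionary. It assumes the trafilatura package is installed and uses its fetch_url and extract functions with JSON output; the helper names to_record and extract_page are hypothetical, chosen only for this example, and the keys present in the result may vary by page and library version.

```python
import json
from typing import Optional


def to_record(raw: Optional[str]) -> Optional[dict]:
    """Parse the JSON string produced by extraction.

    A value of None (nothing could be extracted) is passed through as None.
    """
    return json.loads(raw) if raw else None


def extract_page(url: str) -> Optional[dict]:
    """Download one page and return its text and metadata as a dict."""
    # Third-party dependency (pip install trafilatura), imported lazily so
    # the parsing helper above can be used without it.
    import trafilatura

    downloaded = trafilatura.fetch_url(url)  # None if the download fails
    if downloaded is None:
        return None

    # Ask for JSON output and include metadata alongside the text.
    raw = trafilatura.extract(downloaded, output_format="json", with_metadata=True)
    return to_record(raw)
```

Calling extract_page with a page URL should give either None or a dictionary; reading fields with record.get("title") and record.get("text") avoids errors when a field is missing.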