In this paper, we present a config-driven architecture for autonomous web scraping in which a declarative JSON object fully specifies a scraping task: its sources, rendering mode, field selectors, pagination strategy, and post-processing transforms. A large language model generates these configs from natural-language descriptions and repairs them when execution fails, closing the loop without human intervention. The system exposes its capabilities as a Model Context Protocol (MCP) server, enabling AI agents to discover, configure, and execute scraping tasks autonomously. We describe the architecture, the config schema, the LLM integration, and the MCP binding, and evaluate the repair loop on a test suite of 20 tasks across four demo sites.
Hélder Monteiro (Sat,) studied this question.
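To make the idea of a declarative scraping config concrete, the following is a minimal sketch of what such a JSON object and a validator for it might look like. The key names (`sources`, `render`, `fields`, `pagination`, `transforms`), the allowed render modes, the `strip_prefix` operation, and the example URL are all assumptions made for illustration; they are not the schema defined in this paper.

```python
import json

# Hypothetical top-level keys, mirroring the five aspects named in the abstract.
REQUIRED_KEYS = {"sources", "render", "fields", "pagination", "transforms"}

# Illustrative config only; URL, selectors, and option names are made up.
EXAMPLE_CONFIG = json.loads("""
{
  "sources": ["https://example.com/products"],
  "render": "static",
  "fields": {"title": "h2.product-title", "price": "span.price"},
  "pagination": {"strategy": "next_link", "selector": "a.next", "max_pages": 5},
  "transforms": [{"field": "price", "op": "strip_prefix", "arg": "$"}]
}
""")


def validate_config(config):
    """Return a list of human-readable problems; an empty list means valid."""
    missing = sorted(REQUIRED_KEYS - set(config))
    problems = [f"missing key: {key}" for key in missing]
    if not config.get("sources"):
        problems.append("sources must be a non-empty list")
    if config.get("render") not in ("static", "browser"):
        problems.append("render must be 'static' or 'browser'")
    return problems


def apply_transforms(record, transforms):
    """Apply post-processing transforms to one scraped record (returns a copy)."""
    out = dict(record)
    for transform in transforms:
        # Only one assumed operation is sketched here.
        if transform["op"] == "strip_prefix":
            value = out[transform["field"]]
            if value.startswith(transform["arg"]):
                out[transform["field"]] = value[len(transform["arg"]):]
    return out
```

In a repair loop of the kind described above, a problem list like the one returned by `validate_config` could plausibly be passed back to the language model as feedback, although how the actual system reports failures is not specified here.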