Crawler files
Chaotic versions of the well-known files that crawlers, scanners, and AI agents fetch by canonical path.
An ai.txt (the AI-scraping equivalent of robots.txt) that contradicts itself, references AI bots that don't exist, or contains malformed directives.
An RFC 9116 security.txt with broken fields: an Expires date in the past, dead Contact URLs, total nonsense in every field, or a Canonical that doesn't include the served URL.
An IAB ads.txt with internally contradictory authorized-seller declarations, fictitious ad networks, or malformed lines that ads.txt crawlers should reject.
A humans.txt that contradicts itself, recursively references itself for every field, or contains time-paradox dates.
An llms.txt (the proposed convention for telling LLMs about site structure) that lists pages that 404, contradicts itself, or embeds prompt-injection content. Tests whether AI agents that ingest llms.txt sanitise it before acting.
A robots.txt that contradicts itself, sets impossible crawl delays, or contains malformed directives. Strict parsers should reject it; lenient crawlers will behave unpredictably.
A sitemap that crawlers will follow into bad places: 404 URLs, future-dated lastmod values, circular sitemap-index references, or a body that claims gzip encoding but isn't gzipped.
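As an illustration of the defensive checks the list above is meant to exercise, here is a minimal Python sketch covering two of the failure modes: a security.txt with an expired Expires or a Canonical that omits the served URL, and a body that claims gzip encoding without being gzipped. The function names are hypothetical, the parsing is deliberately simplified rather than a full RFC 9116 implementation, and `now` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone


def looks_gzipped(body: bytes) -> bool:
    """True if the body starts with the two gzip magic bytes (RFC 1952)."""
    return body[:2] == b"\x1f\x8b"


def security_txt_problems(text: str, served_url: str, now: datetime) -> list:
    """Return human-readable problems found in a security.txt body.

    Simplified sketch: parses "Name: value" lines, ignores comments,
    and does not verify signatures or probe Contact URLs.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        # Field names are case-insensitive, so normalise to lower case.
        fields.setdefault(name.strip().lower(), []).append(value.strip())

    problems = []
    if not fields.get("contact"):
        problems.append("missing Contact")

    expires = fields.get("expires", [])
    if len(expires) != 1:
        problems.append("Expires must appear exactly once")
    else:
        try:
            when = datetime.fromisoformat(expires[0].replace("Z", "+00:00"))
        except ValueError:
            problems.append("Expires is not a valid date")
        else:
            if when.tzinfo is None:
                # Assumption: treat a value without an offset as UTC.
                when = when.replace(tzinfo=timezone.utc)
            if when <= now:
                problems.append("Expires is in the past")

    canonical = fields.get("canonical", [])
    if canonical and served_url not in canonical:
        problems.append("Canonical does not include the served URL")
    return problems
```

A client would call `security_txt_problems(body_text, url_it_fetched, datetime.now(timezone.utc))` and treat any non-empty result as a reason not to trust the file. Likewise, it would check `looks_gzipped(raw_body)` before handing a supposedly gzip-encoded sitemap to a decompressor.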
These files are served at their canonical well-known paths on bots.catastrophic.io.
Point your crawler, scanner, or AI agent at this host and observe how it
handles metadata that real-world tooling rarely treats as untrusted.
Each file supports a ?mode= parameter to select between several flavours
of chaos, with a sensible default per file. All responses carry an
X-Chaos-*-Mode header reflecting the selection so monitoring clients can
verify which mode they received.
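
The mode selection and header check described above might be scripted as follows. This is a sketch under stated assumptions: the https scheme, the `/robots.txt` path, and the placeholder mode value used in the usage note are illustrative and not taken from the service's documentation, and the helper names are hypothetical. Because the exact header name varies per file, the code matches any header of the form X-Chaos-*-Mode rather than guessing one.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def chaos_url(path, mode=None, host="bots.catastrophic.io"):
    """Build the URL for a file, adding ?mode= only when a mode is given."""
    query = "?" + urlencode({"mode": mode}) if mode else ""
    # Assumption: the host is reachable over https.
    return f"https://{host}{path}{query}"


def chaos_mode_headers(headers):
    """Pick out every header whose name looks like X-Chaos-*-Mode."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith("x-chaos-") and name.lower().endswith("-mode")
    }


def fetch_with_mode(path, mode=None, timeout=10.0):
    """Fetch a file and return (raw body, matching chaos-mode headers).

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses; callers
    that want to inspect those responses should catch it themselves.
    """
    with urlopen(chaos_url(path, mode), timeout=timeout) as resp:
        return resp.read(), chaos_mode_headers(resp.headers)
```

For example, `body, modes = fetch_with_mode("/robots.txt", "some-mode")` would request the file with a placeholder mode name, and a monitoring client could then compare the values in `modes` against the mode it asked for.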