Crawler files
Well-formed counterparts to the chaotic crawler files on bots.catastrophic.io. Each is served at the same canonical well-known path, so a client built against the not. host can switch its hostname to bots. and exercise the chaos versions.
Well-formed ai.txt with non-contradictory directives. Explicitly allows the major real AI crawlers, sets no impossible delays, and names no fake bots.
RFC 9116 compliant security.txt. Expires field is in the future, Contact URL resolves, Canonical matches the served URL. Validators should accept without warnings.
Minimal valid IAB ads.txt. Single authorized-seller line with a real-format domain, publisher account ID, and DIRECT relationship. Not operationally meaningful (catastrophic.io doesn't sell ads) but parses without complaint.
Standard humanstxt.org format. /* TEAM */ and /* SITE */ sections, real-looking field values, no recursion, no time paradoxes.
answer.ai-format llms.txt with a single H1 heading, a brief description, and only resolvable links. Lightweight version of the hub's llms.txt: it points at this control surface, not the whole catalog.
Standards-compliant robots.txt with a clean Allow-all, named explicit allows for major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider), and a sitemap reference. Mirrors the hub's operational robots.txt.
Valid XML sitemap with sane recent lastmod dates and URLs that all resolve. Minimal — lists this host's well-known control endpoints, not an exhaustive crawl.
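The security.txt entry above promises an Expires field in the future. As a rough illustration of what a client-side check for that could look like, here is a minimal sketch. The function name and the sample file contents are hypothetical, and it only inspects the Expires field rather than validating the whole file against RFC 9116.

```python
from datetime import datetime, timezone
from typing import Optional


def expires_in_future(security_txt: str, now: Optional[datetime] = None) -> bool:
    """Return True if the first Expires field lies after `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    for raw_line in security_txt.splitlines():
        line = raw_line.strip()
        # Field names are matched case-insensitively; comment lines start with "#".
        if line.lower().startswith("expires:"):
            value = line.split(":", 1)[1].strip()
            # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
            if value.endswith(("Z", "z")):
                value = value[:-1] + "+00:00"
            expires = datetime.fromisoformat(value)
            if expires.tzinfo is None:
                # Assumption: treat a timestamp without an offset as UTC.
                expires = expires.replace(tzinfo=timezone.utc)
            return expires > now
    # No Expires field found at all.
    return False


# Hypothetical sample, not the contents of the served file.
SAMPLE = (
    "Contact: mailto:security@example.com\n"
    "Expires: 2030-01-01T00:00:00Z\n"
)
```

Passing `now` explicitly keeps the check deterministic in tests; leaving it out compares against the current time.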
These are the well-formed counterparts to the chaotic crawler files on
bots.catastrophic.io. Every endpoint is served at the same canonical
path on both hosts — flip the hostname between not. and bots. to
swap between the parses-cleanly and parses-with-errors versions.
Where the hub itself ships an operational version of the file
(/robots.txt, /llms.txt, /sitemap.xml), the not version is
modelled on the same content. The hub’s file is the production
artefact; the not version exists so client and parser tests can hit a
predictable, stable target without scraping a docs site.
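Because both hosts serve each file at the same path, switching surfaces in a test client comes down to rewriting the hostname. The sketch below shows one way to do that. It assumes the control host's full name is not.catastrophic.io (the text only gives the not. prefix), and flip_host is a hypothetical helper, not part of any published client.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: full hostname of the control surface; the source names only the "not." prefix.
CONTROL_HOST = "not.catastrophic.io"
CHAOS_HOST = "bots.catastrophic.io"


def flip_host(url: str) -> str:
    """Swap a URL between the control host and the chaos host, keeping everything else."""
    parts = urlsplit(url)
    if parts.hostname == CONTROL_HOST:
        target = CHAOS_HOST
    elif parts.hostname == CHAOS_HOST:
        target = CONTROL_HOST
    else:
        raise ValueError(f"unexpected host: {parts.hostname!r}")
    # Preserve an explicit port if the URL carries one.
    netloc = target if parts.port is None else f"{target}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

A parser test suite could then run each case twice, once against the original URL and once against `flip_host(url)`.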
Each response carries:
X-Chaos-Origin: control
X-Chaos-Counterpart: https://bots.catastrophic.io<same-path>
so a monitoring client can confirm it received the well-formed surface rather than the chaos surface.
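A monitoring client might perform that confirmation along the following lines. This is a sketch under assumptions: is_control_surface is a hypothetical helper, and it reads <same-path> as the request path including its leading slash.

```python
from typing import Mapping


def is_control_surface(headers: Mapping[str, str], path: str) -> bool:
    """Check the two marker headers on a response fetched from `path`."""
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    origin = normalized.get("x-chaos-origin")
    counterpart = normalized.get("x-chaos-counterpart")
    # Assumption: the counterpart URL is the chaos host followed by the same request path.
    expected_counterpart = f"https://bots.catastrophic.io{path}"
    return origin == "control" and counterpart == expected_counterpart
```

The function takes a plain mapping so it works with whatever HTTP library produced the response headers.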