online / endpoints 62 / categories 10 / rate 60/min/ip

Crawler files

Well-formed counterparts to the chaotic crawler files on bots.catastrophic.io. Each is served at the same canonical well-known path, so a client built against the not. host can flip the hostname to bots. and exercise the chaos versions.

GET /.well-known/ai.txt

Well-formed ai.txt with non-contradictory directives. Explicitly allows the major real AI crawlers, sets no impossible delays, and names no fake bots.

GET /.well-known/security.txt

RFC 9116 compliant security.txt. Expires field is in the future, Contact URL resolves, Canonical matches the served URL. Validators should accept without warnings.
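
Some of the properties listed above can be approximated client-side. The following is a minimal sketch in Python, not the validator any particular tool uses: the helper names and the sample body on example.com are hypothetical, and it only checks that Contact is present, that Expires appears exactly once and lies in the future, and that Canonical (if present) includes the served URL. It does not check that the Contact URL resolves.

```python
from datetime import datetime, timezone

# Hypothetical sample body, for illustration only.
SAMPLE = """\
# security.txt sample
Contact: mailto:security@example.com
Expires: 2099-01-01T00:00:00Z
Canonical: https://example.com/.well-known/security.txt
"""

def parse_security_txt(text):
    """Collect 'Name: value' fields; field names are treated case-insensitively."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if sep:
            fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields

def check_security_txt(text, served_url, now=None):
    """Return a list of problems; an empty list means the basic checks passed."""
    now = now or datetime.now(timezone.utc)
    fields = parse_security_txt(text)
    problems = []
    if not fields.get("contact"):
        problems.append("missing Contact")
    expires = fields.get("expires", [])
    if len(expires) != 1:
        problems.append("Expires must appear exactly once")
    else:
        # Map a trailing 'Z' to an explicit offset so older Pythons can parse it.
        when = datetime.fromisoformat(expires[0].replace("Z", "+00:00"))
        if when <= now:
            problems.append("Expires is in the past")
    canonical = fields.get("canonical", [])
    if canonical and served_url not in canonical:
        problems.append("Canonical does not match served URL")
    return problems
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it defaults to the current UTC time.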

GET /ads.txt

Minimal valid IAB ads.txt. Single authorized-seller line with a real-format domain, publisher account ID, and DIRECT relationship. Not operationally meaningful (catastrophic.io doesn't sell ads) but parses without complaint.

GET /humans.txt

Standard humanstxt.org format. /* TEAM */ and /* SITE */ sections, real-looking field values, no recursion, no time paradoxes.

GET /llms.txt

answer.ai-format llms.txt with a single H1 heading, a brief description, and only resolvable links. A lightweight version of the hub's llms.txt that points at this control surface, not the whole catalog.

GET /robots.txt

Standards-compliant robots.txt with a clean Allow-all, named explicit allows for major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider), and a sitemap reference. Mirrors the hub's operational robots.txt.
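
A parser test against a file like this might look roughly as follows. This is a sketch using Python's standard `urllib.robotparser`; the sample body and the example.com URLs are hypothetical stand-ins, not the file actually served, and the helper names are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the description above.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot",
             "Google-Extended", "CCBot", "Bytespider"]

# Hypothetical sample body, for illustration only.
SAMPLE = """\
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

def load_robots(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

def blocked_agents(parser, url, agents=AI_AGENTS):
    """Return the agents that the parsed rules do not allow to fetch url."""
    return [agent for agent in agents if not parser.can_fetch(agent, url)]

robots = load_robots(SAMPLE)
blocked = blocked_agents(robots, "https://example.com/llms.txt")
sitemaps = robots.site_maps()  # available in Python 3.8+
```

For a clean Allow-all file, `blocked` should come back empty and `sitemaps` should contain the single referenced sitemap URL.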

GET /sitemap.xml

Valid XML sitemap with sane recent lastmod dates and URLs that all resolve. Minimal: it lists this host's well-known control endpoints, not an exhaustive crawl.
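
One way a client test could read such a sitemap and flag implausible lastmod values is sketched below with Python's standard `xml.etree.ElementTree`. The sample document and its example.com URLs are hypothetical, the function names are illustrative, and "sane" is interpreted here only as "not dated in the future". Whether each URL resolves is not checked.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample document, for illustration only.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/robots.txt</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://example.com/llms.txt</loc><lastmod>2024-02-01T12:00:00+00:00</lastmod></url>
</urlset>
"""

def sitemap_entries(xml_text):
    """Return (loc, lastmod) pairs; lastmod is None when the element is absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries

def future_lastmods(entries, today):
    """Return the locs whose lastmod date falls after today."""
    late = []
    for loc, lastmod in entries:
        # lastmod may be a date or a full timestamp; compare on the date part.
        if lastmod and date.fromisoformat(lastmod.strip()[:10]) > today:
            late.append(loc)
    return late
```

Calling `future_lastmods(sitemap_entries(body), date.today())` on a fetched body should return an empty list when every lastmod is in the past.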

Every endpoint is served at the same canonical path on both hosts: flip the hostname between not. and bots. to swap between the parses-cleanly and parses-with-errors versions.
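
The hostname flip can be sketched as a small URL helper. This is an illustration in Python, assuming the control host is not.catastrophic.io (inferred from the not. prefix and the bots.catastrophic.io name above); the function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# First hostname label on each surface and its counterpart.
SWAP = {"not": "bots", "bots": "not"}

def counterpart_url(url):
    """Return the same path on the other surface (not. <-> bots.)."""
    parts = urlsplit(url)
    label, _, rest = (parts.hostname or "").partition(".")
    if label not in SWAP or not rest:
        raise ValueError(f"not a not./bots. URL: {url!r}")
    host = f"{SWAP[label]}.{rest}"
    if parts.port:
        host += f":{parts.port}"  # keep an explicit port if one was given
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

Only the first hostname label changes; scheme, path, query, and fragment are carried over unchanged, so the same test can be pointed at either surface.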

Where the hub itself ships an operational version of the file (/robots.txt, /llms.txt, /sitemap.xml), the not version is modelled on the same content. The hub’s file is the production artefact; the not version exists so client and parser tests can hit a predictable, stable target without scraping a docs site.

Each response carries:

  • X-Chaos-Origin: control
  • X-Chaos-Counterpart: https://bots.catastrophic.io<same-path>

so a monitoring client can confirm it received the well-formed surface rather than the chaos surface.
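
A monitoring client could perform that confirmation roughly as follows. This is a minimal sketch in Python: the function name is hypothetical, and it assumes the X-Chaos-Counterpart value is exactly the bots.catastrophic.io origin followed by the requested path, as the pattern above suggests.

```python
def is_control_response(headers, path):
    """Return True if the headers mark a response from the well-formed surface.

    headers: mapping of response header names to values.
    path: the request path, e.g. "/robots.txt".
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    expected_counterpart = "https://bots.catastrophic.io" + path
    return (
        normalized.get("x-chaos-origin") == "control"
        and normalized.get("x-chaos-counterpart") == expected_counterpart
    )
```

The function takes a plain mapping so it works with whatever HTTP client is in use; with `urllib.request`, for example, `dict(response.headers.items())` would supply the headers.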