online / endpoints 59 / categories 14 / rate 60/min/ip /

Format chaos

Common-Internet-file-format misbehaviour: bytes that look like the right format but break in specific structural ways.

GET /activitystreams

Returns ActivityStreams 2.0 / ActivityPub objects with spec violations. Default omits the required @context. Use ?mode= to isolate other violations: a type outside the AS2 vocabulary, a non-URI actor, or an object whose shape doesn't match its activity type. Consumed by Mastodon, Misskey, and other Fediverse servers during federation.

mode context-missing (default; @context absent; ActivityPub §3.1 requires it; servers reject or fail to expand JSON-LD terms), type-not-vocabulary (`type` is "ChaosPost", not in AS2; servers that validate type drop the activity, those that forward propagate the invalid object), actor-not-uri (`actor` is a bare string; ActivityPub §4.1 requires a dereferenceable URI; key verification fetches fail), object-shape-variance (Like activity with `object` as a bare string instead of a URI or embedded object; servers that dereference get a parse error).
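
A consumer-side check for three of these modes might look like the sketch below. It is illustrative only: the function names are hypothetical, and KNOWN_TYPES is a deliberately small subset of the AS2 vocabulary, not the full list.

```python
from urllib.parse import urlparse

# Illustrative subset only; the real AS2 vocabulary is much larger.
KNOWN_TYPES = {"Create", "Like", "Follow", "Announce", "Note", "Person"}

def is_uri(value):
    """True when value is a string with both a scheme and a network location."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)

def as2_problems(obj):
    """Return labels for the violations found in one ActivityStreams object."""
    problems = []
    if "@context" not in obj:
        problems.append("context-missing")
    types = obj.get("type")
    if isinstance(types, str):
        types = [types]
    if not isinstance(types, list) or not any(
        isinstance(t, str) and t in KNOWN_TYPES for t in types
    ):
        problems.append("type-not-vocabulary")
    if "actor" in obj:
        actor = obj["actor"]
        if isinstance(actor, dict):  # embedded actor object: check its id
            actor = actor.get("id")
        if not is_uri(actor):
            problems.append("actor-not-uri")
    return problems
```

Returning labels rather than raising lets a test harness compare the result against the mode it requested.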

GET /atom

Returns Atom 1.0 feeds (RFC 4287) with spec violations. Default omits the required entry id element. Use ?mode= to isolate other violations: wrong date format, missing author, or incorrect rel=self link.

mode Which violation to send. One of: entry-id-missing (default; entry <id> absent, breaking aggregator deduplication), updated-wrong-format (<updated> uses RFC 822 instead of required RFC 3339), author-missing (no <author> on entry or feed level), link-rel-self-wrong (<link rel="self"> href points to a different host).
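
The first three modes can be detected with the standard library, as in this rough sketch. The function name is hypothetical, and datetime.fromisoformat is used as an approximation of RFC 3339 (it is somewhat more lenient than the RFC).

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"

def atom_entry_problems(feed_xml):
    """Return violation labels, one per affected entry, for an Atom feed string."""
    root = ET.fromstring(feed_xml)
    problems = []
    feed_has_author = root.find(ATOM + "author") is not None
    for entry in root.iter(ATOM + "entry"):
        if entry.find(ATOM + "id") is None:
            problems.append("entry-id-missing")
        updated = entry.findtext(ATOM + "updated", default="")
        try:
            # Approximation: accept anything fromisoformat accepts.
            datetime.fromisoformat(updated.replace("Z", "+00:00"))
        except ValueError:
            problems.append("updated-wrong-format")
        if not feed_has_author and entry.find(ATOM + "author") is None:
            problems.append("author-missing")
    return problems
```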

GET /cloudevents

Returns CloudEvents 1.0 envelopes with spec violations. Default claims specversion 0.3 instead of 1.0. Use ?mode= to isolate other violations: `type` as an integer, a datacontenttype mismatch, or both `data` and `data_base64` present.

mode specversion-lie (default; `specversion` is "0.3" instead of "1.0"; consumers that validate version before dispatching reject the envelope or apply 0.3 parsing rules), type-as-number (`type` is integer 42; CloudEvents §3.1 requires a non-empty string; consumers that substring-match on type throw a type error), datacontenttype-mismatch (`datacontenttype` says application/json but `data` is a bare string; consumers call JSON.parse on non-JSON and throw), data-base64-confusion (both `data` and `data_base64` present; CloudEvents §3.1 says mutually exclusive; consumers disagree on which field is authoritative).
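
As a sketch of how a consumer could screen an envelope before dispatching, assuming the event has already been decoded from JSON into a dict (the function name is hypothetical):

```python
import json

def cloudevent_problems(event):
    """Return labels for the violations found in one decoded CloudEvents envelope."""
    problems = []
    if event.get("specversion") != "1.0":
        problems.append("specversion-lie")
    event_type = event.get("type")
    if not isinstance(event_type, str) or not event_type:
        problems.append("type-as-number")
    if "data" in event and "data_base64" in event:
        problems.append("data-base64-confusion")
    data = event.get("data")
    if event.get("datacontenttype") == "application/json" and isinstance(data, str):
        try:
            json.loads(data)  # a string payload should itself be JSON text
        except json.JSONDecodeError:
            problems.append("datacontenttype-mismatch")
    return problems
```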

GET /csv

Returns CSV with RFC 4180 corner-case violations: unquoted commas, embedded newlines, UTF-8 BOM at start, ragged column counts. Targets data pipelines, BI tools, and ML preprocessing — anywhere CSV is the lingua franca and parsers vary in strictness.

mode unquoted-commas (default; row contains an unquoted comma in a field — appears to have 4 columns instead of 3), embedded-newlines (row contains a literal CRLF in an unquoted field — naive splitters break mid-row), bom-mismatch (UTF-8 BOM at start of stream — strict parsers include the BOM in cell[0] of the header row, breaking column lookups by name), ragged-columns (header declares 3 columns but rows have 2 and 4 — parsers either raise, pad with nulls, or silently misalign).
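
One defensive reading strategy, sketched with the standard csv module: decode with utf-8-sig so a leading BOM never reaches the header, then report rows whose width differs from the header instead of silently misaligning. The function name is hypothetical. Unquoted commas also surface here, since they produce an extra column.

```python
import csv
import io

def read_csv_defensively(raw):
    """Return (header, rows, ragged) where ragged lists 1-based line numbers of odd-width rows."""
    text = raw.decode("utf-8-sig")  # drops a UTF-8 BOM if one is present
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return [], [], []
    header, body = rows[0], rows[1:]
    ragged = [n for n, row in enumerate(body, start=2) if len(row) != len(header)]
    return header, body, ragged
```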

GET /geojson

Returns GeoJSON documents with RFC 7946 violations. Default returns a Polygon whose ring does not close. Use ?mode= to isolate other violations: lat/lon axis swap, type/coordinates mismatch, or absent `properties` member.

mode polygon-not-closed (default; Polygon ring last position differs from first; RFC 7946 §3.1.6 requires an identical closing position; strict validators reject, lenient renderers auto-close silently), coordinates-swapped (coordinates are in [latitude, longitude] order instead of GeoJSON's required [longitude, latitude]; geometry is syntactically valid so parsers accept it and place the point in the wrong hemisphere), type-coords-mismatch (`type` is "Point" but `coordinates` is an array of rings; strict validators reject, lenient renderers produce NaN coordinates), properties-missing (`properties` member absent from the Feature; RFC 7946 §3.2 requires it even if null; libraries that access feature.properties.name throw).
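
The four modes map onto simple structural checks. The sketch below assumes a decoded Feature dict; the function name is hypothetical, and the axis-swap test is only a heuristic that fires when the second value cannot be a latitude.

```python
def geojson_feature_problems(feature):
    """Return labels for the violations found in one decoded GeoJSON Feature."""
    problems = []
    if "properties" not in feature:
        problems.append("properties-missing")
    geometry = feature.get("geometry") or {}
    coords = geometry.get("coordinates")
    if geometry.get("type") == "Polygon":
        for ring in coords or []:
            if ring and ring[0] != ring[-1]:
                problems.append("polygon-not-closed")
    elif geometry.get("type") == "Point":
        is_position = (
            isinstance(coords, list)
            and len(coords) >= 2
            and all(isinstance(c, (int, float)) for c in coords)
        )
        if not is_position:
            problems.append("type-coords-mismatch")
        elif abs(coords[1]) > 90:
            # Heuristic: a "latitude" beyond 90 degrees suggests [lat, lon] order.
            problems.append("coordinates-swapped")
    return problems
```

A swap between two values that are both within 90 degrees cannot be detected from the geometry alone, which is why that mode passes most validators.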

GET /html

Returns HTML documents with deliberate structural flaws. Useful for testing browser parsers, scraper robustness, and ML preprocessing pipelines that ingest HTML. Default mode produces visible misnesting on a bare GET; other modes target charset, doctype, or unterminated-tag handling.

mode mismatched-tags (default; `<div><span></div></span>` — HTML5 parsers auto-correct, strict XML parsers reject), meta-charset-conflict (`<meta charset="utf-8">` declared but body bytes are CP1252 — strict utf-8 parsers replace or fail on the smart-quote / dash / euro bytes), doctype-mismatch (XHTML 1.0 Strict doctype with HTML5 syntax like `<br>` — strict XHTML parsers reject), script-without-end (`<script>` tag opens and never closes — browsers swallow the rest of the document into the script body).

GET /image

Image responses where the declared MIME type, magic bytes, embedded dimensions, chunk framing, or format metadata disagree with the actual content. Seven modes across PNG, JPEG, WebP, GIF, and AVIF test how image pipelines handle metadata-vs-bytes mismatches.

mode Which flaw to send. mime-mismatch (default; real PNG bytes served as image/jpeg), magic-byte-lie (PNG signature prefix on a JPEG body), wrong-dimensions (1x1 PNG with IHDR claiming 4000x4000), truncated-png (under-delivered IDAT, no IEND), webp-flag-lie (VP8X flags claim Animation but no ANIM/ANMF chunks follow), gif-lsd-lie (GIF89a LSD claims 100×100 canvas, image frame is 1×1), avif-ftyp-lie (ftyp box brands avif, no meta or mdat boxes follow).
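
A pipeline that distrusts the declared Content-Type can sniff the leading bytes itself. This is a minimal sketch covering only a few signatures; the function names are hypothetical. It also reads the width and height a PNG claims in IHDR, which is the field the wrong-dimensions mode falsifies.

```python
import struct

SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}

def sniff_mime(body):
    """Guess the MIME type from magic bytes; None when nothing matches."""
    for magic, mime in SIGNATURES.items():
        if body.startswith(magic):
            return mime
    if body[:4] == b"RIFF" and body[8:12] == b"WEBP":
        return "image/webp"
    return None

def png_declared_size(body):
    """Return (width, height) from the IHDR chunk, or None if the bytes are not a PNG header."""
    if len(body) < 24 or sniff_mime(body) != "image/png" or body[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", body[16:24])
```

Comparing sniff_mime(body) with the response header exposes mime-mismatch. A matching signature does not prove the rest of the body is that format, which is what magic-byte-lie exploits.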

GET /json

Returns syntactically invalid JSON with a Content-Type of application/json. Default bundles three flaws in one payload (missing closing brace, unquoted key, trailing comma). Use ?mode= to isolate a single failure mode for targeted parser testing.

mode Which flaw to send. One of: all (default; three flaws bundled), missing-brace, unquoted-key, trailing-comma.

GET /json-feed

Returns JSON Feed 1.1 documents with semantic violations that parse as valid JSON but break spec compliance. Default sends a bare version string instead of the required URL. Use ?mode= to isolate other violations.

mode Which violation to send. One of: version-mismatch (default; `version` is "1.1" instead of the full URL "https://jsonfeed.org/version/1.1"), items-id-not-unique (two items share the same `id`), item-url-malformed (an item `url` is a relative path instead of an absolute URL), feed-url-wrong (`feed_url` points to a different domain).

GET /jsonapi

Returns JSON:API responses with spec-level violations. Default returns `data` as an array when a single resource is expected. Use ?mode= to isolate other violations: orphaned included resources, missing `id`, or missing `type`.

mode data-shape-confusion (default; `data` is an array when a single resource is implied; clients that call response.data.attributes receive undefined), included-orphan (`included` contains people/99 not referenced by any relationship in `data`; JSON:API §5 requires all included resources to be linked), missing-id (resource object has no `id`; JSON:API §3.1 requires it for server-assigned IDs), type-missing (resource object has no `type`; JSON:API §3.1 requires it; clients that dispatch on type throw or fall through).

GET /jsonl

Returns JSON Lines (NDJSON) streams with stream-level violations. Targets ML pipelines, log aggregators, and AI chat exports — anywhere line-delimited JSON is consumed. Default mode is schema drift mid-stream; other modes cover blank lines, partial final records, and BOM start.

mode schema-drift (default; records 1–3 use {name, email}, records 4–5 use {full_name, contact.email}; schema-aware pipelines null-coerce or reject), blank-lines (empty lines between records; strict per-line parsers raise on the empty string), partial-final-line (final record truncated mid-object with no trailing newline; streaming parsers raise on EOF mid-record), bom-start (UTF-8 BOM at start of stream; strict per-line parsers see `\uFEFF{...` and raise on the first record).
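
A tolerant reader for the blank-lines, partial-final-line, and bom-start modes could be sketched as follows. The function name is hypothetical; it collects parse failures per line instead of aborting the whole stream.

```python
import json

def read_jsonl(raw):
    """Return (records, errors); errors hold (line_number, message) for unparseable lines."""
    text = raw.decode("utf-8-sig")  # drops a leading UTF-8 BOM if present
    records, errors = [], []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            errors.append((number, exc.msg))
    return records, errors
```

Schema drift is not a parse error, so it would need a separate check, for example comparing the key set of each record against the first one.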

GET /jsonrpc

Returns JSON-RPC 2.0 responses with spec-level violations. Default omits the required `jsonrpc` version field. Use ?mode= to isolate other violations: mismatched response id, both result and error present, or a reserved-but-unassigned error code.

mode version-missing (default; `jsonrpc` field absent; signals a 1.x response with a different envelope; strict clients reject), id-mismatch (response `id` is 99 but request id was 1; clients correlating by id fail to match), result-and-error (both `result` and `error` present; JSON-RPC 2.0 §5 says mutually exclusive; clients checking only one field miss the other), error-code-invalid (`error.code` is -32001, inside the reserved range but not a standard code; clients mapping codes to exception types fall through to unknown-error).
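
A client can check all four conditions before trusting a response. The sketch below assumes a decoded response dict and the id of the request it should answer; the function name is hypothetical. The error-code label follows the mode name above, although the JSON-RPC 2.0 specification sets -32000 to -32099 aside for implementation-defined server errors.

```python
STANDARD_CODES = {-32700, -32600, -32601, -32602, -32603}

def jsonrpc_response_problems(response, request_id):
    """Return labels for the violations found in one decoded JSON-RPC response."""
    problems = []
    if response.get("jsonrpc") != "2.0":
        problems.append("version-missing")
    if response.get("id") != request_id:
        problems.append("id-mismatch")
    if "result" in response and "error" in response:
        problems.append("result-and-error")
    error = response.get("error")
    if isinstance(error, dict):
        code = error.get("code")
        in_reserved_range = isinstance(code, int) and -32768 <= code <= -32000
        if in_reserved_range and code not in STANDARD_CODES:
            problems.append("error-code-invalid")
    return problems
```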

GET /jwt-payload

Returns JWT payload (claims) objects with RFC 7519 violations. Default sends `exp` as a string instead of a NumericDate. Use ?mode= to isolate other violations: malformed iss, mixed-type aud array, or a reserved claim with the wrong type. Distinct from /jwt (token-validation flaws) and /auth (challenge framing).

mode exp-as-string (default; `exp` is a string instead of a NumericDate integer per RFC 7519 §4.1.4; typed SDKs throw on deserialization), iss-malformed (`iss` is a bare word, not a URI; validators that require URI format reject), aud-mixed-types (`aud` array contains string, integer, and boolean; RFC 7519 §4.1.3 requires StringOrURI), reserved-claim-collision (`sub` is an object instead of a string principal identifier; libraries that read sub as string throw or stringify to "[object Object]").
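
Type checks on the registered claims catch three of these modes before any expiry arithmetic runs. This is a sketch over an already-decoded claims dict; the function name is hypothetical and signature verification is out of scope.

```python
def jwt_claim_problems(claims):
    """Return labels for type violations in a decoded JWT claims object."""
    problems = []
    exp = claims.get("exp")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if exp is not None and (isinstance(exp, bool) or not isinstance(exp, (int, float))):
        problems.append("exp-as-string")
    aud = claims.get("aud")
    aud_values = aud if isinstance(aud, list) else [aud]
    if aud is not None and not all(isinstance(a, str) for a in aud_values):
        problems.append("aud-mixed-types")
    sub = claims.get("sub")
    if sub is not None and not isinstance(sub, str):
        problems.append("reserved-claim-collision")
    return problems
```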

GET /multipart

Returns a multipart/form-data response with RFC 7578 parser-adversity. Default declares one boundary in the Content-Type and uses a different one in the body. Use ?mode= to isolate other violations: missing terminating CRLF, ambiguous Content-Disposition `name=` parameters, or nested multipart bodies.

mode boundary-mismatch (default; Content-Type declares one boundary but the body uses another; strict parsers find no opening delimiter and return zero parts, lenient parsers may scan and recover), trailing-crlf-confusion (closing boundary `--boundary-xyz--` has no trailing CRLF; strict parsers treat the stream as unterminated, lenient parsers accept), disposition-name-injection (Content-Disposition has two `name=` parameters; RFC 7578 does not define precedence and parsers disagree on which wins), nested-multipart (inner part uses Content-Type: multipart/mixed with its own boundary; most form-data parsers do not recurse into nested multipart and return the raw bytes or error).
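
The default mode can be detected before handing the body to a full parser. The sketch below uses the standard email package to read the declared boundary and then looks for a matching opening delimiter; the function name is hypothetical and the check is deliberately coarse.

```python
from email.message import Message

def boundary_matches(content_type, body):
    """True when the boundary declared in Content-Type opens at least one part in body."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_boundary()  # None when no boundary parameter is declared
    if boundary is None:
        return False
    delimiter = b"--" + boundary.encode("ascii")
    # A delimiter sits at the very start of the body or directly after a CRLF.
    return body.startswith(delimiter) or (b"\r\n" + delimiter) in body
```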

GET /oauth-token

Returns RFC 6749 §5.1 token responses with schema violations. Default returns `expires_in` as a string instead of an integer. Use ?mode= to isolate other violations: comma-delimited scope, nonstandard token_type, or id_token without the openid scope.

mode expires-in-type-shift (default; `expires_in` is the string "3600" instead of integer 3600; RFC 6749 §5.1 requires a number; typed SDKs error, expiry arithmetic produces NaN), scope-delimiter-wrong (`scope` uses comma separators instead of the space separators required by RFC 6749 §3.3; clients splitting on space get the full string as one token), token-type-nonstandard (`token_type` is "token" instead of "Bearer"; clients that exact-match fail to attach Authorization headers), id-token-without-scope (`id_token` present but `scope` omits "openid"; OIDC Core §3.1.3.3 requires the openid scope; OIDC clients reject the response).
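
A lenient client might normalise the response rather than reject it. The sketch below is one possible policy, not a recommendation: it coerces expires_in to an integer, splits scope on both spaces and commas, and compares token_type case-insensitively. The function name and the is_bearer key are hypothetical.

```python
def normalise_token_response(token):
    """Return a copy with expires_in as int, scope as a list, and an is_bearer flag."""
    out = dict(token)
    if "expires_in" in token:
        # Accepts 3600 and "3600"; raises ValueError on non-numeric strings.
        out["expires_in"] = int(token["expires_in"])
    raw_scope = token.get("scope", "")
    out["scope"] = raw_scope.replace(",", " ").split()
    out["is_bearer"] = str(token.get("token_type", "")).lower() == "bearer"
    return out
```

Accepting comma-separated scopes hides a server bug, so a stricter client may prefer to log or reject instead.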

GET /ooxml

Office Open XML container responses (DOCX, XLSX, PPTX) where one structural part of the package deliberately lies — wrong content type declaration, missing referenced part, dangling image relationship, or [Content_Types].xml claiming the package is multiple formats simultaneously. Macro-enabled containers, encrypted OOXML, and embedded OLE objects are deliberately not served — see the note below.

format docx (default; word/document.xml main part), xlsx (xl/workbook.xml + sheet1), pptx (full slide-master/slide-layout/theme triangle). Body content is identical hello-world text; only the container wrapping changes.
mode wrong-content-type (default; [Content_Types].xml advertises the main part as another OOXML format's content type), missing-part (root .rels references a part path that isn't in the ZIP; the part is at a typo'd path), dangling-relationship (main part's .rels declares an image relationship to media/image1.png that isn't in the package), format-confusion ([Content_Types].xml declares main parts for all three OOXML formats; only the current format's tree is present).

GET /pdf

Returns a structurally flawed PDF where one element of the object graph deliberately lies — wrong xref byte offsets, a page tree claiming 100 pages when only one exists, a trailer referencing a nonexistent encryption dictionary, or a JavaScript OpenAction that fires on open. Default mode ships wrong xref offsets, the most fundamental structural lie.

mode bad-xref (default; xref byte offsets all shifted +100 bytes; object lookup by offset fails), page-count-lie (/Count declares 100 pages; only one page object exists), encrypted-ghost (trailer /Encrypt references object 9 which is not defined), javascript-action (Catalog OpenAction executes app.alert("pdf-chaos") on open; JS-enabled readers fire the alert, JS-disabled readers open silently, AV scanners may flag any JS action).

GET /problem-details

Returns RFC 9457 Problem Details objects with spec violations. Default drops the `type` field. Use ?mode= to isolate other violations: `status` as a string, a non-URI `instance`, or a Content-Type that claims application/json instead of application/problem+json.

mode drop-required (default; `type` field omitted; RFC 9457 §3.1.1 treats an absent `type` as "about:blank", so clients that branch on type for error routing receive undefined unless they apply that default), status-type-shift (`status` is the string "404" instead of integer 404; typed SDKs that deserialize into an int field error), instance-malformed (`instance` is the bare word "not-a-uri"; clients that dereference it get a URL parse error), content-type-lie (body is valid Problem Details JSON but Content-Type is application/json; clients that dispatch on the media type miss the error envelope).

GET /rss

Returns RSS 2.0 feeds with spec violations. Default omits the required channel title. Use ?mode= to isolate other violations: wrong date format, non-URL guid marked as permalink, or enclosure MIME type lie.

mode Which violation to send. One of: missing-channel-title (default; required channel <title> absent), item-pubdate-wrong-format (<pubDate> uses ISO 8601 instead of required RFC 822), guid-not-permalink (<guid isPermaLink="true"> is a plain string not a URL), enclosure-type-lie (<enclosure type="audio/mpeg"> pointing at an HTML page).

GET /schema-org

Returns Schema.org / JSON-LD documents with vocabulary and context violations. Default points @context at a non-existent IRI. Use ?mode= to isolate other violations: unknown @type, missing required properties, or a context array that shadows a schema.org term.

mode context-lie (default; @context points at https://schema.example.invalid; JSON-LD processors that dereference get a 404, those that match by IRI treat properties as unknown terms), type-unknown (@type is "ChaosEntity", not in the vocabulary; rich-result renderers reject), missing-required (Recipe without name or recipeIngredient; search crawlers silently drop the rich result), context-array-conflict (second @context entry remaps "name" to a custom IRI, shadowing schema.org's definition; strict JSON-LD applies the last definition).

GET /sse

Returns a Server-Sent Events stream with WHATWG EventSource spec violations. Default sends payload lines with no `data:` prefix. Use ?mode= to isolate other violations: named events with no listeners, discontinuous id ordering, or mixed CRLF/LF line endings.

mode missing-data-prefix (default; payload lines have no `data:` prefix; conforming EventSource clients treat each such line as an unrecognized field name and ignore it, so the data buffer stays empty and no message event is dispatched), event-without-listener (all events use named `event:` fields; EventSource.onmessage never fires; consumers need addEventListener for each event type), id-discontinuity (event ids go backwards `5, 3, 5, 1`; Last-Event-ID reconnect logic receives an unpredictable cursor and some clients reject out-of-order ids), crlf-mix (line endings alternate CRLF and LF within the same stream; parsers that normalize only one style misparse field boundaries).
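
For testing outside a browser, the field-parsing rules of the WHATWG event stream format can be approximated in a few lines. This is a simplified sketch with a hypothetical function name; it works on a complete stream held in a string and omits details such as the `retry` field.

```python
import re

def parse_sse(text):
    """Return (event_type, data, last_event_id) tuples for each dispatched event."""
    events, data, event_type, last_id = [], [], "", ""
    # The format allows CRLF, LF, or CR as line terminators, even mixed.
    for line in re.split(r"\r\n|\n|\r", text):
        if line == "":
            if data:  # dispatch only when at least one data line arrived
                events.append((event_type or "message", "\n".join(data), last_id))
            data, event_type = [], ""
            continue
        if line.startswith(":"):
            continue  # comment line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data":
            data.append(value)
        elif field == "event":
            event_type = value
        elif field == "id":
            last_id = value
        # Any other field name is ignored.
    return events
```

Splitting on all three terminators up front is what keeps the crlf-mix mode from corrupting field boundaries.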

GET /svg

Returns SVG documents that misbehave for browser renderers, image processors, and XML-strict parsers. SVG is an XML+rendering hybrid so chaos modes span both worlds: malformed XML structure, external-resource references, recursive <use> chains, and non-conformant namespaces.

mode mismatched-tags (default; `<g>` closes before its child `<rect>` — XML parsers reject, lenient renderers recover), external-image-ref (`<image href="http://nonexistent.invalid/...">`; processors that resolve external refs hang), circular-use (`<use>` chain references itself; non-conformant renderers may recurse), bad-namespace (xmlns declares a non-SVG namespace; namespace-aware processors refuse to render).

GET /toml

Returns TOML documents with TOML 1.0 spec violations and parser ambiguity. Default mixes int, string, and float in a single array. Use ?mode= to isolate other violations: unquoted datetime, duplicate tables, illegal nesting, impossible offsets, dotted-key/table collisions, or trailing commas in inline tables.

mode mixed-array-types (default; array mixes int, string, and float; TOML 0.5 and earlier forbid mixed types in arrays while TOML 1.0 allows them; parsers targeting different versions disagree, and downstream schema validators that expect homogeneous arrays fail), unquoted-datetime (datetime value is unquoted and not in RFC 3339 format), duplicate-table (the same table is defined twice), invalid-nesting (a table is defined after it has been used as a simple key), datetime-offset-lie (offset +99:00 is structurally valid TOML datetime syntax but semantically impossible; strict parsers with timezone validation reject, others re-emit the garbage offset), dotted-key-table-collision (table [server] is defined first, then server.host is re-defined as a dotted key; TOML 1.0 forbids re-defining a table via dotted keys), trailing-comma-inline-table (inline table has a trailing comma after the last key-value pair; TOML 1.0 forbids this unlike arrays; many parsers accept it silently).

GET /web-annotation

Returns W3C Web Annotation objects with spec violations. Default omits the required `id` field. Use ?mode= to isolate other violations: wrong body type, invalid motivation vocabulary, or inconsistent target shape.

mode Which violation to send. One of: id-missing (default; required `id` field absent), body-as-string (`body` is a plain string instead of a TextualBody object or URI), motivation-invalid (`motivation` is not in the W3C vocabulary), target-shape-variance (`target` is a Specific Resource object instead of a URI string).

GET /xml

Returns XML documents with parser-attack or well-formedness flaws. Targets SOAP, B2B, RSS/sitemap, and config-file ingestion pipelines. Default mode is visible misnesting; other modes cover entity-expansion DoS, external-entity references, and declaration-vs-body encoding mismatch.

mode mismatched-tags (default; `<alpha><beta></alpha></beta>` — every conformant parser rejects), billion-laughs (3 levels of self-similar entity expansion; ~3KB expanded; hardened parsers reject), xxe-external (external entity referencing nonexistent.invalid per RFC 6761; parsers with external-entity resolution hang or fail), encoding-mismatch (declaration claims utf-8 but body bytes are CP1252; strict parsers fail, lenient produce mojibake).
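
One hardening approach is to refuse any document that carries a DTD before it reaches the parser, which sidesteps both billion-laughs and xxe-external. The sketch below is a coarse byte-level pre-scan with a hypothetical function name; it assumes an ASCII-compatible encoding and would miss a DOCTYPE in a UTF-16 document.

```python
import xml.etree.ElementTree as ET

def parse_xml_rejecting_dtd(raw):
    """Parse raw XML bytes, raising ValueError if any DTD or entity declaration is present."""
    # Assumption: the document uses an ASCII-compatible encoding.
    if b"<!DOCTYPE" in raw or b"<!ENTITY" in raw:
        raise ValueError("DTD declarations are not accepted")
    return ET.fromstring(raw)  # raises ET.ParseError on malformed input
```

The mismatched-tags mode still surfaces as ET.ParseError, so callers can tell a policy rejection from a well-formedness failure.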

GET /yaml

Returns YAML documents that exploit the YAML 1.1 vs 1.2 split and parser ambiguity. Default sends the Norway problem: bare `NO/YES/ON/OFF` scalars that YAML 1.1 parsers coerce to booleans while YAML 1.2 parsers leave as strings. Use ?mode= to isolate other violations: recursive anchors, duplicate keys, or tag lies.

mode norway-problem (default; bare NO/YES/ON/OFF scalars; YAML 1.1 parsers like Go yaml.v2 and older PyYAML coerce to booleans, YAML 1.2 parsers leave as strings; mixed-version pipelines disagree on the type of every value), anchor-cycle (recursive anchor reference `&a` referencing `*a`; parsers without cycle detection follow references until stack or heap is exhausted), duplicate-keys (key `host` appears twice; spec leaves behavior application-defined; parsers variously take first, take last, or error; schema validators almost always reject), tag-lie (explicit `!!int` tag applied to the string `three-point-five`; strict parsers raise a tag resolution error, loose parsers store the raw value and continue silently).

GET /zip

ZIP archive responses with a deliberate structural flaw — wrong CRC32, central-directory-vs-local-header disagreement, ZIP64 size lie, or truncated central directory. Tests how archive extractors, package managers, and scanners handle metadata-vs-bytes contradictions. Zip-slip and zip-bomb are deliberately not served — see the note below.

mode bad-crc (default; LFH/CDH declare CRC32 0xDEADBEEF, body's real CRC differs), central-dir-mismatch (LFH names alpha.txt, CDH names beta.txt for the same offset and body), zip64-lie (sizes use the 0xFFFFFFFF ZIP64 sentinel and the ZIP64 extra field claims 8 GiB; body is ~40 bytes), truncated (LFH + body intact, central directory cut off mid-record, no EOCD).
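
The bad-crc and truncated modes can be told apart with the standard zipfile module, as in this minimal sketch. The function name and the "<unreadable>" marker are hypothetical.

```python
import io
import zipfile

def first_corrupt_member(raw):
    """Return the first member failing its CRC check, "<unreadable>" for a broken archive, else None."""
    try:
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            # testzip() reads every member and verifies its CRC and headers.
            return archive.testzip()
    except zipfile.BadZipFile:
        # Raised when the central directory or end-of-central-directory record is missing.
        return "<unreadable>"
```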

Endpoints that return responses in real file formats — HTML, JSON, XML, CSV, SVG, NDJSON, PNG/JPEG, ZIP, OOXML — where one structural field deliberately lies. Tests how format-specific parsers, content sniffers, and ingestion pipelines handle metadata-vs-bytes contradictions, malformed structures, and ambiguous archives.

Each endpoint exposes a mode parameter that picks the specific flaw; a plain-text X-Chaos-*-Note header explains what’s wrong with the current response. Cache-Control: no-transform is set so the CDN edge does not sanitise or re-encode the broken bytes on the way out.
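
A test harness can surface the explanatory header alongside each failure. The sketch below filters a plain dict of response headers by the X-Chaos-*-Note pattern; the function name is hypothetical, and the concrete header name used in the usage note is an assumption, since only the pattern is documented here.

```python
import fnmatch

def chaos_notes(headers):
    """Return the headers whose name matches X-Chaos-*-Note, compared case-insensitively."""
    return {
        name: value
        for name, value in headers.items()
        if fnmatch.fnmatchcase(name.lower(), "x-chaos-*-note")
    }
```

For example, chaos_notes({"Content-Type": "application/json", "X-Chaos-Json-Note": "trailing comma"}) keeps only the second entry.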