TOON Input
4 lines
0 tokens
Loading editor...
Valid TOON Syntax
Your TOON is correctly formatted and can be decoded
⚙️
Quick Tips
Use type annotations (:i, :b, :f) for integers, booleans, and floats
Array syntax: arrayName[count]{ fields } :
Separate values with commas, rows with newlines
Keep lines under 200 characters for readability
Results
Valid
No issues
Your TOON document passed all checks.
📄Convert JSON to TOON
Convert TOON to JSON

The Ultimate Guide to TOON

TOON (Token-Oriented Object Notation) is a compact, deterministic, lossless representation of JSON designed for anyone working with LLM pipelines, structured data, or high-volume prompts. This document explains how TOON works, how to convert JSON to TOON and TOON to JSON, how to use a TOON validator, and how to reliably validate TOON format using strict structural guarantees.

1. What Is TOON Format?

TOON (Token-Oriented Object Notation) is a lossless alternative representation of JSON. It does not invent new data types. It does not alter semantics. It simply removes structural redundancy and uses syntax that is more efficient for LLMs to parse. TOON keeps:

  • objects
  • arrays
  • strings
  • numbers
  • booleans
  • null

The key difference: TOON expresses these elements using indentation and an optional "tabular array" notation that replaces repeated JSON field names with a schema declaration.

2. How TOON Works

TOON has two core mechanisms:

2.1 Indentation Instead of Braces

JSON:

{
  "user": {
    "id": 3,
    "name": "Ada"
  }
}

TOON:

user:
  id: 3
  name: Ada

No braces. No quotes unless required.

2.2 Tabular Arrays for Uniform Objects

When JSON contains an array of objects with identical keys:

[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob",   "role": "user" }
]

TOON compresses this:

users[12]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Key features:

  • [12] = declared array length
  • {id,name,role} = schema
  • rows = field-aligned values

3. TOON vs JSON: Why Use TOON?

3.1 Efficiency

JSON repeats field names in every row. TOON does not. Token usage typically drops by 30–60%. On time-series workloads, reductions over 59% are common.

3.2 Accuracy in LLM outputs

Retrieval benchmarks show:

  • TOON accuracy: 73.9%
  • JSON accuracy: 69.7%

Models benefit from explicit structure and fewer distractor tokens.

3.3 Built-in Validation

TOON encodes its own schema:

  • expected row count
  • expected field count
  • required field order

A TOON validator catches malformed data instantly.

3.4 Minimal Quoting

Strings only require quotes when ambiguous. Reduces token noise and helps models focus.

4. TOON Format Documentation (Concise)

4.1 Objects

Indentation defines structure:

settings:
  enabled: true
  retries: 3

4.2 Primitive Arrays

names[3]: mei,alicia,jamal

4.3 Tabular Arrays

items[4]{id,title,price}:
  1,Book,12.50
  2,Pen,1.20
  3,Notebook,4.75
  4,Map,9.00

4.4 Delimiters

Use:

  • comma
  • tab
  • pipe

Tabs tokenize best.

4.5 Quoting Rules

Quote only when:

  • the string is empty
  • contains special characters
  • starts/ends with whitespace
  • resembles numbers or booleans

5. TOON Examples

Example: Mixed Structure

config:
  version: 1
  owners[2]: chen,mina

servers[3]{id,host,port}:
  1,api,443
  2,cache,6379
  3,worker,9000

Example: Log Export

export:
  generatedAt: 2025-02-10

entries[3]{id,status,latencyMs}:
  1,ok,12
  2,ok,14
  3,fail,200

6. Validation: Using a TOON Validator

A TOON validator examines:

  • indentation correctness
  • strict delimiter usage
  • valid quoting
  • correct [N] row count
  • correct field count for each row
  • valid primitive types

Because TOON encodes its own schema, validation is deterministic.

Common failure cases caught by validators:

  • a row missing a column
  • an extra column in a row
  • missing rows (truncation)
  • stray indentation
  • malformed field names

7. JSON ↔ TOON Conversion

7.1 JSON to TOON

Use:

  • CLI converters
  • library calls
  • convert json to toon online tools

This is commonly done before inserting structured data into an LLM prompt.

7.2 TOON to JSON

Decoders auto-detect the format.

This supports:

  • reading LLM output
  • reintegrating structured data
  • downstream processing

7.3 Determinism

decode(encode(x)) returns normalized JSON.

Non-JSON values (NaN, dates, bigints) normalize as JSON-compatible types.

8. When TOON Should Be Used

Best-fit scenarios:

  • large uniform arrays
  • structured tabular data
  • datasets intended for LLM ingestion
  • reproducible evaluation benchmarks
  • applications that require strict schema adherence

Example high-value use cases:

  • time-series analytics
  • embeddings metadata
  • evaluation datasets
  • multi-item reasoning workloads
  • synthetic dataset generation

9. When TOON Isn't Ideal

Avoid TOON when:

  • the data structure is deeply nested
  • uniformity is low (mixed objects)
  • CSV is sufficient (purely flat tables)
  • CPU-bound latency is higher priority than token count

TOON's sweet spot is uniform arrays with primitive fields.

10. Real Engineering Notes

TOON is valuable because:

  • LLMs emit it more reliably than JSON
  • Validation is far more strict
  • Tabular arrays reduce hallucinations
  • It avoids "JSON fixing" scripts
  • It integrates into pipelines with minimal friction
  • It is human-readable but machine-strict
  • It acts as a drop-in compression layer over JSON without semantic loss