TOON Validator - TOON Format Converter & Validator



The Ultimate Guide to TOON

TOON (Token-Oriented Object Notation) is a compact, deterministic, lossless representation of JSON designed for anyone working with LLM pipelines, structured data, or high-volume prompts. This document explains how TOON works, how to convert JSON to TOON and TOON to JSON, how to use a TOON validator, and how to reliably validate TOON format using strict structural guarantees.

1. What Is TOON Format?

TOON (Token-Oriented Object Notation) is a lossless alternative representation of JSON. It does not invent new data types. It does not alter semantics. It simply removes structural redundancy and uses syntax that is more efficient for LLMs to parse. TOON keeps:

objects
arrays
strings
numbers
booleans
null

The key difference: TOON expresses these elements using indentation and an optional "tabular array" notation that replaces repeated JSON field names with a schema declaration.

2. How TOON Works

TOON has two core mechanisms:

2.1 Indentation Instead of Braces

JSON:

{
  "user": {
    "id": 3,
    "name": "Ada"
  }
}

TOON:

user:
  id: 3
  name: Ada

No braces. No quotes unless required.
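The indentation rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference encoder: the names encode_object and format_scalar are hypothetical, two-space indentation is assumed from the example above, and quoting is not handled.

```python
def encode_object(obj, indent=0):
    """Render a nested dict of primitives as indented 'key: value' lines."""
    lines = []
    pad = "  " * indent  # assumption: two spaces per nesting level
    for key, value in obj.items():
        if isinstance(value, dict):
            # A nested object becomes a bare 'key:' line with deeper children.
            lines.append(f"{pad}{key}:")
            lines.extend(encode_object(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {format_scalar(value)}")
    return lines


def format_scalar(value):
    """Map Python primitives to their JSON-style spellings (no quoting)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


print("\n".join(encode_object({"user": {"id": 3, "name": "Ada"}})))
```

Running it on the JSON example above prints the same three lines shown in the TOON example.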

2.2 Tabular Arrays for Uniform Objects

When JSON contains an array of objects with identical keys:

[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob",   "role": "user" }
]

TOON compresses this (with the array stored under a users key):

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Key features:

[2] = declared array length (must equal the number of rows)
{id,name,role} = schema
rows = field-aligned values
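As a rough sketch of how such a block could be produced, the following Python function builds the header from the number of rows and the keys of the first object. The function name encode_table is hypothetical, and values are written without any quoting, which a real encoder would need to handle.

```python
def encode_table(name, rows):
    """Encode a list of uniform dicts as a tabular TOON-style block."""
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("rows must share the same keys in the same order")
    # Header: name, declared length, then the field list (the schema).
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(encode_table("users", users))
```

The declared length is derived from len(rows), so the header and the body cannot disagree at encoding time.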

3. TOON vs JSON: Why Use TOON?

3.1 Efficiency

JSON repeats field names in every row; TOON declares them once per array. Token usage typically drops by 30–60%, and on time-series workloads reductions of over 59% are common.
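To get a feel for where the savings come from, the sketch below compares the size of the same data in compact JSON and in tabular form. It counts characters, not tokens, so it is only a crude proxy; real token counts depend on the tokenizer, and the sample data is made up for illustration.

```python
import json

# Hypothetical sample data: 50 uniform rows.
rows = [{"id": i, "name": f"user{i}", "role": "user"} for i in range(1, 51)]

# Compact JSON, with no extra whitespace.
as_json = json.dumps({"users": rows}, separators=(",", ":"))

# Tabular form: field names appear once, in the header.
fields = list(rows[0])
header = "users[%d]{%s}:" % (len(rows), ",".join(fields))
as_toon = "\n".join(
    [header] + ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
)

saving = 1 - len(as_toon) / len(as_json)
print(f"JSON: {len(as_json)} chars, TOON: {len(as_toon)} chars, saving: {saving:.0%}")
```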

3.2 Accuracy in LLM outputs

Retrieval benchmarks show:

TOON accuracy: 73.9%
JSON accuracy: 69.7%

Models benefit from explicit structure and fewer distractor tokens.

3.3 Built-in Validation

TOON encodes its own schema:

expected row count
expected field count
required field order

A TOON validator catches malformed data instantly.
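The schema information lives in the array header, so reading it is mostly a matter of parsing one line. A minimal sketch, assuming comma-separated field names and simple word-character keys (the regular expression and the name parse_header are illustrative, not taken from the spec):

```python
import re

# Assumption: keys are plain word characters and fields are comma-separated.
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def parse_header(line):
    """Return (name, expected_row_count, ordered_field_names)."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"not a tabular header: {line!r}")
    name, count, fields = match.groups()
    return name, int(count), fields.split(",")


print(parse_header("items[4]{id,title,price}:"))
```

The three returned values correspond to the three guarantees listed above: the row count, the field count (the length of the list), and the field order.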

3.4 Minimal Quoting

Strings require quotes only when they would otherwise be ambiguous. This reduces token noise and helps models focus on the data.

4. TOON Format Documentation (Concise)

4.1 Objects

Indentation defines structure:

settings:
  enabled: true
  retries: 3

4.2 Primitive Arrays

names[3]: mei,alicia,jamal
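Reading a primitive array back can be sketched as follows. The function name parse_primitive_array is hypothetical, values are kept as strings, and quoted values containing commas are not handled.

```python
import re

# Assumption: simple word-character key, comma delimiter.
PRIMITIVE_ARRAY = re.compile(r"^(\w+)\[(\d+)\]:\s*(.*)$")


def parse_primitive_array(line):
    """Parse 'name[N]: a,b,c' and check that N matches the value count."""
    match = PRIMITIVE_ARRAY.match(line.strip())
    if match is None:
        raise ValueError(f"not a primitive array: {line!r}")
    name, count, body = match.groups()
    values = body.split(",") if body else []
    if len(values) != int(count):
        raise ValueError(f"declared {count} values, found {len(values)}")
    return name, values


print(parse_primitive_array("names[3]: mei,alicia,jamal"))
```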

4.3 Tabular Arrays

items[4]{id,title,price}:
  1,Book,12.50
  2,Pen,1.20
  3,Notebook,4.75
  4,Map,9.00

4.4 Delimiters

Use:

comma
tab
pipe

Tabs usually tokenize most efficiently.
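Switching delimiters only changes how the values in a row are joined. The sketch below shows that step alone; how the chosen delimiter is declared in the array header is left out here, and the names DELIMITERS and join_row are illustrative.

```python
# The three delimiters listed above, keyed by name.
DELIMITERS = {"comma": ",", "tab": "\t", "pipe": "|"}


def join_row(values, delimiter="comma"):
    """Join one row of values using the named delimiter."""
    return DELIMITERS[delimiter].join(str(v) for v in values)


row = [1, "Book", 12.5]
for name in DELIMITERS:
    print(name, repr(join_row(row, name)))
```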

4.5 Quoting Rules

Quote only when:

the string is empty
contains special characters
starts/ends with whitespace
resembles numbers or booleans
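These rules translate into a small predicate. The sketch below is an approximation under stated assumptions: the exact set of special characters is a guess (the list above does not enumerate them), and treating null like a boolean literal is also an assumption.

```python
import re

NUMBER_LIKE = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
# Assumption: characters that could collide with TOON syntax or delimiters.
SPECIAL = set(',:"\\[]{}\n\t|')
# Assumption: null is treated like the boolean literals.
LITERALS = {"true", "false", "null"}


def needs_quotes(value):
    """Return True if a string would be ambiguous when written bare."""
    if value == "":
        return True                      # empty string
    if value != value.strip():
        return True                      # leading or trailing whitespace
    if value in LITERALS or NUMBER_LIKE.fullmatch(value):
        return True                      # would decode as a non-string
    return any(ch in SPECIAL for ch in value)


for sample in ["Ada", "", " Ada", "42", "true", "a,b"]:
    print(repr(sample), needs_quotes(sample))
```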

5. TOON Examples

Example: Mixed Structure

config:
  version: 1
  owners[2]: chen,mina

servers[3]{id,host,port}:
  1,api,443
  2,cache,6379
  3,worker,9000

Example: Log Export

export:
  generatedAt: 2025-02-10

entries[3]{id,status,latencyMs}:
  1,ok,12
  2,ok,14
  3,fail,200

6. Validation: Using a TOON Validator

A TOON validator examines:

indentation correctness
strict delimiter usage
valid quoting
correct [N] row count
correct field count for each row
valid primitive types

Because TOON encodes its own schema, validation is deterministic.

Common failure cases caught by validators:

a row missing a column
an extra column in a row
missing rows (truncation)
stray indentation
malformed field names
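Several of these checks can be sketched for a single tabular block. This is a simplified illustration, not how any particular validator is implemented: it assumes comma delimiters, two-space row indentation, and no quoted values, and the name validate_table is hypothetical.

```python
import re

HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def validate_table(text):
    """Return a list of human-readable issues; an empty list means valid."""
    lines = text.splitlines()
    match = HEADER.match(lines[0].strip()) if lines else None
    if match is None:
        return ["first line is not a tabular header"]
    declared = int(match.group(2))
    fields = match.group(3).split(",")
    rows = [ln for ln in lines[1:] if ln.strip()]
    issues = []
    if len(rows) != declared:
        # Catches truncation as well as a wrong [N] in the header.
        issues.append(f"declared {declared} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        if not row.startswith("  ") or row.startswith("   "):
            issues.append(f"row {number}: expected two-space indentation")
        width = len(row.strip().split(","))  # naive split, ignores quoting
        if width != len(fields):
            issues.append(
                f"row {number}: expected {len(fields)} values, found {width}"
            )
    return issues


print(validate_table("users[12]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"))
```

Because every check compares the body against what the header declares, the same input always yields the same list of issues.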

7. JSON ↔ TOON Conversion

7.1 JSON to TOON

Use:

CLI converters
library calls
online JSON-to-TOON converters

This is commonly done before inserting structured data into an LLM prompt.

7.2 TOON to JSON

Decoders auto-detect the format.

This supports:

reading LLM output
reintegrating structured data
downstream processing
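A decoder for one tabular block might look like the sketch below. It is a minimal illustration with hypothetical names (decode_table, parse_scalar), assuming comma delimiters and unquoted values; bare tokens are mapped back to JSON primitives by simple pattern matching.

```python
import json
import re

HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")
INT = re.compile(r"-?\d+")
FLOAT = re.compile(r"-?\d+\.\d+([eE][+-]?\d+)?")
LITERALS = {"true": True, "false": False, "null": None}


def parse_scalar(token):
    """Map a bare token back to a JSON primitive."""
    if token in LITERALS:
        return LITERALS[token]
    if INT.fullmatch(token):
        return int(token)
    if FLOAT.fullmatch(token):
        return float(token)
    return token  # anything else stays a string


def decode_table(text):
    """Decode one tabular block into {name: [row_dict, ...]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    match = HEADER.match(lines[0].strip())
    if match is None:
        raise ValueError("first line is not a tabular header")
    name, count = match.group(1), int(match.group(2))
    fields = match.group(3).split(",")
    rows = []
    for line in lines[1:]:
        tokens = line.strip().split(",")
        if len(tokens) != len(fields):
            raise ValueError(f"expected {len(fields)} values: {line!r}")
        rows.append(dict(zip(fields, map(parse_scalar, tokens))))
    if len(rows) != count:
        raise ValueError(f"declared {count} rows, found {len(rows)}")
    return {name: rows}


toon = "entries[3]{id,status,latencyMs}:\n  1,ok,12\n  2,ok,14\n  3,fail,200"
print(json.dumps(decode_table(toon), indent=2))
```

Rejecting a block whose row count does not match the header is what makes truncated LLM output detectable before it reaches downstream code.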

7.3 Determinism

decode(encode(x)) returns normalized JSON.

Non-JSON values (NaN, dates, bigints) are normalized to JSON-compatible types before encoding.
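What normalization could look like in Python is sketched below. The specific targets chosen here (non-finite floats become null, dates become ISO 8601 strings, tuples become arrays) are assumptions for illustration; the function name normalize is hypothetical. Python integers are arbitrary-precision already, so they are passed through unchanged.

```python
import math
from datetime import date, datetime


def normalize(value):
    """Convert a Python value into a JSON-compatible equivalent."""
    if isinstance(value, float) and not math.isfinite(value):
        return None                      # assumption: NaN and infinities -> null
    if isinstance(value, (date, datetime)):
        return value.isoformat()         # assumption: dates -> ISO 8601 strings
    if isinstance(value, dict):
        return {str(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


raw = {"generatedAt": date(2025, 2, 10), "score": float("nan"), "ids": (1, 2)}
print(normalize(raw))
```

Applying normalize a second time leaves the result unchanged, which is the property a deterministic round trip relies on.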

8. When TOON Should Be Used

Best-fit scenarios:

large uniform arrays
structured tabular data
datasets intended for LLM ingestion
reproducible evaluation benchmarks
applications that require strict schema adherence

Example high-value use cases:

time-series analytics
embeddings metadata
evaluation datasets
multi-item reasoning workloads
synthetic dataset generation

9. When TOON Isn't Ideal

Avoid TOON when:

the data structure is deeply nested
uniformity is low (mixed objects)
CSV is sufficient (purely flat tables)
CPU-bound latency is a higher priority than token count

TOON's sweet spot is uniform arrays with primitive fields.
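A pipeline can test for that sweet spot before choosing a format. The sketch below checks that every element is an object with the same keys and only primitive values; the name is_tabular_candidate is hypothetical, and comparing key sets (ignoring order) is a design choice of this sketch.

```python
PRIMITIVES = (str, int, float, bool, type(None))


def is_tabular_candidate(rows):
    """True if rows is a non-empty list of flat dicts sharing one key set."""
    if not isinstance(rows, list) or not rows:
        return False
    if not all(isinstance(row, dict) for row in rows):
        return False
    keys = set(rows[0])
    return all(
        set(row) == keys
        and all(isinstance(v, PRIMITIVES) for v in row.values())
        for row in rows
    )


print(is_tabular_candidate([{"id": 1, "host": "api"}, {"id": 2, "host": "cache"}]))
print(is_tabular_candidate([{"id": 1}, {"id": 2, "tags": ["a"]}]))
```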

10. Real Engineering Notes

TOON is valuable because:

LLMs tend to emit it more reliably than JSON
validation is stricter, because row and field counts are declared up front
tabular arrays leave less room for hallucinated or missing fields
it avoids "JSON fixing" scripts
it integrates into pipelines with minimal friction
it is human-readable but machine-strict
it acts as a drop-in compression layer over JSON without semantic loss