Token-Oriented Object Notation – Associative arrays and JSON for LLMs at half the token cost. This is a PHP port of johannschopplich/toon.

pkg:composer/abdelhamiderrahmouni/toon-php

v0.1.0 2025-10-29 03:23 UTC

This package is auto-updated.

Last update: 2025-10-29 11:23:46 UTC


Toon PHP encodes PHP associative arrays and objects into the TOON format (Token‑Oriented Object Notation). It’s built first for PHP data structures—arrays and plain objects—and also works with JSON inputs by decoding them before encoding.

It favors:

  • Inline arrays for primitives
  • Tabular rows for uniform arrays of objects
  • Safe string quoting to avoid ambiguity
  • Customizable indentation, delimiters, and an optional array length marker

This makes structured PHP data easy for humans to scan while staying unambiguous for machines (and token‑efficient for LLMs).

Installation

Install via Composer:

composer require abdelhamiderrahmouni/toon-php

Quick Start (arrays and objects)

<?php
declare(strict_types=1);

use function Toon\encode;

$input = [
  'name' => 'A',
  'age' => 30,
  'tags' => ['x', 'y', 'z'],
];

echo encode($input);

Output:

name: A
age: 30
tags:[3]: x,y,z

Works with plain objects too (stdClass, public props):

<?php
declare(strict_types=1);

use function Toon\encode;

$object = (object) [
  'name' => 'A',
  'age' => 30,
  'tags' => ['x', 'y'],
];

echo encode($object);

Options:

<?php
declare(strict_types=1);

use function Toon\encode;
use Toon\Types\EncodeOptions;

// Named arguments (PHP 8.0+) use a colon, not "="
$opts = new EncodeOptions(
    indent: 4,         // default 2
    delimiter: "\t",   // default ","
    lengthMarker: '#', // default false
);

echo encode(['tags' => [1, 2, 3]], $opts);

Output:

tags:[#3\t]: 1\t2\t3

Working with JSON (optional)

If your data starts as JSON, decode it first, then pass the PHP value to encode(). Arrays/objects are the primary path; JSON is just another input source.

<?php
declare(strict_types=1);

use function Toon\encode;

$json = '{"name":"A","age":30,"tags":["x","y","z"]}';

// Safest: throw on JSON errors and decode to associative arrays
try {
    $data = json_decode($json, true, flags: JSON_THROW_ON_ERROR);
    echo encode($data);
} catch (JsonException $e) {
    // Handle invalid JSON
    echo "Invalid JSON: " . $e->getMessage();
}

Reading from a file:

<?php
use function Toon\encode;

$contents = file_get_contents('data.json');
if ($contents === false) {
    throw new RuntimeException('Failed to read data.json');
}

$data = json_decode($contents, true, flags: JSON_THROW_ON_ERROR);

echo encode($data);

Notes:

  • Passing true as the second argument to json_decode() (associative arrays) is convenient, but objects (stdClass) also work; normalization handles both.
  • Non-finite numbers and other PHP-specific values are normalized as described below.

Key Features

  • Objects render as key: value pairs with indentation for nesting.
  • Arrays:
    • Primitive arrays inline: tags:[3]: a,b,c
    • Array of objects with uniform keys and primitive values renders as a table:
      rows:[2]{id,name}:
        1,a
        2,b
      
    • Mixed arrays and complex structures fall back to list items:
      items:[4]:
        - 1
        - [2]: a,b
        - x: 1
        - z
      
  • Safe string encoding: strings that look ambiguous (booleans, null, numeric-like, contain structural characters, etc.) are quoted and escaped.
  • Custom delimiter for inline arrays/tabular rows: comma (default), tab, or pipe (or any string).
  • Optional length marker # to prefix array lengths in headers: [#3].

Normalization Semantics

Toon PHP normalizes input into a JSON-like shape before encoding:

  • Scalars:
    • null → null
    • bool, int, float preserved, with:
      • -0.0 canonicalized to 0
      • INF, -INF, NAN normalized to null
  • DateTimeInterface → ISO 8601 string (DATE_ATOM)
  • Stringable → cast to string
  • JsonSerializable → jsonSerialize() result is normalized recursively
  • Enums:
    • BackedEnum → backing value (then normalized)
    • UnitEnum → enum name
  • Arrays:
    • List arrays (0..n-1 integer keys) are treated as arrays
    • Associative arrays are treated as objects (maps)
  • Traversable → array via iterator_to_array
  • Objects → associative arrays of public properties
  • Unsupported types (resources, closures) → null

Note: PHP has no BigInt primitive; nothing special is needed here (unlike the TS version).
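
The normalization rules above can be sketched as a single recursive function. This is a minimal illustration, not the package's actual implementation: normalizeSketch() is a hypothetical name, and the order in which the interface checks run is an assumption, since the list does not say which rule wins when an object matches several of them. It needs PHP 8.1+ for the enum interfaces.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the Toon API): maps a PHP value to a
 * JSON-like shape following the documented rules. Check order is assumed.
 */
function normalizeSketch(mixed $value): mixed
{
    if ($value === null || is_bool($value) || is_int($value) || is_string($value)) {
        return $value;
    }
    if (is_float($value)) {
        if (!is_finite($value)) {
            return null; // INF, -INF, NAN → null
        }
        return $value === 0.0 ? 0 : $value; // -0.0 compares equal to 0.0, both become 0
    }
    if ($value instanceof Closure || is_resource($value)) {
        return null; // unsupported types → null
    }
    if ($value instanceof DateTimeInterface) {
        return $value->format(DATE_ATOM); // ISO 8601 string
    }
    if ($value instanceof BackedEnum) {
        return $value->value; // backing value (int or string)
    }
    if ($value instanceof UnitEnum) {
        return $value->name; // enum case name
    }
    if ($value instanceof JsonSerializable) {
        return normalizeSketch($value->jsonSerialize());
    }
    if ($value instanceof Stringable) {
        return (string) $value;
    }
    if ($value instanceof Traversable) {
        return normalizeSketch(iterator_to_array($value));
    }
    if (is_object($value)) {
        // Called from outside the class, so only public properties are returned
        return normalizeSketch(get_object_vars($value));
    }
    if (is_array($value)) {
        return array_map('normalizeSketch', $value); // keys are preserved
    }
    return null;
}
```

For example, normalizeSketch((object) ['a' => 1, 'b' => [1.5, NAN]]) yields ['a' => 1, 'b' => [1.5, null]].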

Output Rules and Examples

Objects

encode(['user' => ['name' => 'A', 'age' => 30]]);

Output:

user:
  name: A
  age: 30

  • Keys are unquoted if matching /^[A-Z_][\w.]*$/i; otherwise quoted.
  • Empty objects render as key: on a line by itself.
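
The key rule can be checked with a one-line preg_match() call. The helper name isUnquotedKey() is hypothetical and used only for illustration; the pattern is the one quoted above.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: true when a key matches the unquoted-key pattern. */
function isUnquotedKey(string $key): bool
{
    // Starts with a letter or underscore, followed by word characters or dots
    return preg_match('/^[A-Z_][\w.]*$/i', $key) === 1;
}

var_dump(isUnquotedKey('user.name'));  // bool(true)
var_dump(isUnquotedKey('_id'));        // bool(true)
var_dump(isUnquotedKey('2fast'));      // bool(false): starts with a digit
var_dump(isUnquotedKey('first name')); // bool(false): contains a space
```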

Primitive Arrays (Inline)

encode(['tags' => ['x', 'y', 'z']]);

Output:

tags:[3]: x,y,z

  • Header syntax: key:[length<delimiter-if-not-default>]
  • A space separates the header and the joined values when non-empty.
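
To make the header syntax concrete, here is a rough sketch of how such a line could be assembled. inlineArraySketch() is a hypothetical function, not the package's code, and it assumes every value is a primitive that needs no quoting. The empty-array result follows from the rule that the space is only added when there are values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds "key:[N]: a,b,c" for a list of primitives.
 * Assumes values never need quoting; quoting rules are out of scope here.
 */
function inlineArraySketch(
    string $key,
    array $values,
    string $delimiter = ',',
    string|false $lengthMarker = false
): string {
    $marker = $lengthMarker === false ? '' : $lengthMarker;
    // The delimiter appears in the header only when it is not the default comma
    $delimiterSuffix = $delimiter === ',' ? '' : $delimiter;
    $header = sprintf('%s:[%s%d%s]:', $key, $marker, count($values), $delimiterSuffix);

    if ($values === []) {
        return $header; // no trailing space for an empty array
    }

    $cells = array_map(
        static fn (mixed $v): string => match (true) {
            $v === null => 'null',
            is_bool($v) => $v ? 'true' : 'false',
            default => (string) $v,
        },
        $values
    );

    return $header . ' ' . implode($delimiter, $cells);
}

echo inlineArraySketch('tags', ['x', 'y', 'z']), "\n"; // tags:[3]: x,y,z
```

With a tab delimiter and the '#' length marker, the same call produces a header of the form tags:[#3<tab>]: followed by tab-separated values, matching the Options example earlier.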

Arrays of Arrays (Expanded List)

encode(['rows' => [[1, 2], [3]]]);

Output:

rows:[2]:
  - [2]: 1,2
  - [1]: 3

Arrays of Objects (Tabular)

Uniform keys and primitive values:

encode([
  'rows' => [
    ['id' => 1, 'name' => 'a'],
    ['name' => 'b', 'id' => 2],
  ],
]);

Output:

rows:[2]{id,name}:
  1,a
  2,b

Otherwise, fall back:

encode([
  'rows' => [
    ['id' => 1, 'name' => 'a'],
    ['id' => 2], // different shape
  ],
]);

Output:

rows:[2]:
  - id: 1
    name: a
  - id: 2
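
The choice between the tabular form and the list fallback can be sketched as a shape check over the rows. tabularHeaderSketch() below is a hypothetical helper, not the library's code. It assumes, based on the two examples above, that key order within a row does not matter and that the header follows the first row's key order.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the header keys when $rows qualifies for the
 * tabular form (same key set in every row, primitive values only), else null.
 */
function tabularHeaderSketch(array $rows): ?array
{
    if ($rows === [] || !array_is_list($rows)) {
        return null;
    }

    $first = $rows[0];
    if (!is_array($first) || $first === [] || array_is_list($first)) {
        return null; // first element is not an object-like (associative) array
    }

    $header = array_keys($first);
    foreach ($rows as $row) {
        if (!is_array($row) || count($row) !== count($header)) {
            return null; // different shape
        }
        foreach ($header as $key) {
            if (!array_key_exists($key, $row)) {
                return null; // missing column
            }
            if ($row[$key] !== null && !is_scalar($row[$key])) {
                return null; // nested array or object: not a primitive
            }
        }
    }

    return $header;
}

var_dump(tabularHeaderSketch([
    ['id' => 1, 'name' => 'a'],
    ['name' => 'b', 'id' => 2],
]) === ['id', 'name']); // bool(true)
```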

Mixed Arrays

encode([
  'items' => [1, ['a', 'b'], ['x' => 1], 'z'],
]);

Output:

items:[4]:
  - 1
  - [2]: a,b
  - x: 1
  - z

String and Key Encoding

  • Strings are left unquoted only if “safe.” They are quoted if:
    • Empty or padded with whitespace
    • Equal to true, false, or null
    • Numeric-like (e.g., 42, -3.14, 1e-6, 05)
    • Contain :, quotes, backslashes, brackets/braces, control characters (\n, \r, \t)
    • Contain the active delimiter
    • Start with - (list marker)
  • Strings are escaped for \\, ", newline, carriage return, and tab.
  • Keys are unquoted when they match /^[A-Z_][\w.]*$/i, otherwise quoted.
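
The quoting conditions above translate fairly directly into a predicate plus an escaping step. The sketch below is illustrative only: needsQuotesSketch() and quoteSketch() are hypothetical names, the numeric pattern is an assumption that covers the listed examples, and "quotes" is read as the double quote character.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: true when a string value would need quoting. */
function needsQuotesSketch(string $s, string $delimiter = ','): bool
{
    if ($s === '' || $s !== trim($s)) {
        return true; // empty or padded with whitespace
    }
    if (in_array($s, ['true', 'false', 'null'], true)) {
        return true; // would read back as a boolean or null
    }
    if (preg_match('/^-?\d+(?:\.\d+)?(?:e[+-]?\d+)?$/i', $s) === 1) {
        return true; // numeric-like: 42, -3.14, 1e-6, 05 (assumed pattern)
    }
    if (preg_match('/[:"\\\\\[\]{}\n\r\t]/', $s) === 1) {
        return true; // structural or control characters
    }
    if (str_contains($s, $delimiter)) {
        return true; // contains the active delimiter
    }
    return str_starts_with($s, '-'); // would look like a list marker
}

/** Hypothetical helper: wraps in double quotes and escapes \, ", \n, \r, \t. */
function quoteSketch(string $s): string
{
    $escaped = strtr($s, [
        '\\' => '\\\\',
        '"'  => '\\"',
        "\n" => '\\n',
        "\r" => '\\r',
        "\t" => '\\t',
    ]);
    return '"' . $escaped . '"';
}

var_dump(needsQuotesSketch('hello')); // bool(false)
var_dump(needsQuotesSketch('05'));    // bool(true)
```

Passing the delimiter as a parameter matters because a string such as a,b is only ambiguous when the comma is the active delimiter.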

API

  • Namespace: Toon

Function

  • function encode(mixed $input, ?EncodeOptions $options = null): string

    Encodes normalized input to the Toon format.

Options

use Toon\Types\EncodeOptions;

// Named arguments (PHP 8.0+) use a colon, not "="
$opts = new EncodeOptions(
    indent: 2,         // default: 2
    delimiter: ",",    // default: ","
    lengthMarker: '#', // default: false
);

  • indent: spaces per indentation level
  • delimiter: used for inline arrays and tabular rows (e.g., ",", "\t", "|")
  • lengthMarker: '#' to render headers like [#N], or false to omit

Testing

This repo includes Pest tests that mirror the TypeScript behavior.

Run tests:

composer install
vendor/bin/pest

Differences from the TypeScript Version

  • PHP-specific normalization:
    • DateTimeInterface → ISO string
    • Traversable, JsonSerializable, Stringable, and Enums → normalized as described
    • Objects → arrays/associative arrays
  • No BigInt handling (not applicable to PHP)
  • Arrays are identified as:
    • List arrays (0..n-1) → arrays
    • Associative arrays → objects
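
PHP's built-in array_is_list() (PHP 8.1+) expresses the same list-versus-associative distinction. Whether the package uses that function internally is not stated here, so the snippet below is only an illustration, and shapeSketch() is a hypothetical name.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: classifies a PHP array the way the rules above describe. */
function shapeSketch(array $value): string
{
    // A list has exactly the keys 0..n-1 in order; anything else is a map
    return array_is_list($value) ? 'array' : 'object';
}

echo shapeSketch(['a', 'b']), "\n";           // array
echo shapeSketch(['x' => 1]), "\n";           // object
echo shapeSketch([1 => 'a', 2 => 'b']), "\n"; // object: keys do not start at 0
```

Note that array_is_list([]) returns true, so an empty PHP array is classified as a list by this check.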

Credits

License

MIT