JDF: documents written in JSON.

Beautiful pages. Editable in any text editor. Diffable in git. Generated with one line of code. Searchable with grep. Validatable with a JSON Schema. Embeddable on any web page. Read more in less time: one file, every tool already speaks it.

Desktop reader

Native macOS app · brew tap uurtech/jdf

jdf.js — web embed

25 kB JS library · <jdf src="...">

CLI

Validate, convert from Markdown · npx @uurtech/jdf-cli

Linux, Windows, and Intel Macs: grab a build from the latest release.

Screenshot: JDF Reader, rendered view. Page-perfect layout, edit in place.

Why JDF

It's JSON. Every consequence below falls out of that.

LLMs read it natively

JSON is the language LLMs already speak. No PDF parsing, no broken columns, no lost tables. Hand a .jdf to GPT or Claude and it sees the structure exactly as you wrote it.

Every tool already works

cat, grep, jq, VS Code, every linter, every diff tool. No plugin, no learning curve.

`git diff` is real

Change one heading, the diff is one line. Documents review the same way as code.

Generate with one line

JSON.stringify(doc). No PDF library, no font dictionaries, no encoding rules.
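
As a rough illustration of that claim, the sketch below builds a document object and serialises it. The field names follow the sample file shown on this page; the report title and list items are made-up data, and the 2-space indent is just one choice that keeps diffs readable.

```javascript
// Made-up content, used only for illustration.
const lineItems = ["Design review", "Schema migration", "Load test"];

// A JDF document as a plain object, mirroring the sample file's field names.
const doc = {
  $jdf: "1.0.0",
  meta: { title: "Weekly report", pageSize: "A4" },
  pages: [
    {
      elements: [
        {
          type: "text",
          content: "Weekly report",
          heading: 1,
          position: { x: 0, y: 5 },
          width: 166,
        },
        {
          type: "list",
          listType: "unordered",
          items: lineItems.map((content) => ({ content })),
          position: { x: 0, y: 25 },
          width: 166,
        },
      ],
    },
  ],
};

// The "one line": serialise the object to JDF text.
const jdfText = JSON.stringify(doc, null, 2);
console.log(jdfText);
```

Saving `jdfText` to a file with a `.jdf` extension (for example with Node's `fs.writeFileSync`) should give a file the reader can open, assuming the object satisfies the schema.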

JSON Schema validates

Autocomplete in your editor. CLI validation in CI. Type-safe generation.

Search is text search

grep "TODO" *.jdf works. jq queries pull every table cell on the page.
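
The same kind of query can be written in a few lines of JavaScript. This is a sketch under one assumption: a `table` element stores `headers` and `rows` directly on the element, following the `{ headers: [...], rows: [[...]] }` shape mentioned on this page. The function name `tableCells` and the sample document are hypothetical.

```javascript
// Collect every table cell in a parsed JDF document, with its coordinates.
// Assumption: `headers` and `rows` sit directly on each table element.
function tableCells(doc) {
  const cells = [];
  doc.pages.forEach((page, pageIndex) => {
    for (const el of page.elements ?? []) {
      if (el.type !== "table") continue;
      (el.rows ?? []).forEach((row, rowIndex) => {
        row.forEach((value, colIndex) => {
          cells.push({
            page: pageIndex,
            row: rowIndex,
            // Fall back to the numeric index when no header is present.
            column: el.headers?.[colIndex] ?? colIndex,
            value,
          });
        });
      });
    }
  });
  return cells;
}

// Hypothetical document used only to exercise the function.
const sample = {
  $jdf: "1.0.0",
  meta: { title: "Prices" },
  pages: [
    {
      elements: [
        { type: "text", content: "Prices" },
        {
          type: "table",
          headers: ["Item", "Price"],
          rows: [["Pen", "2"], ["Ink", "7"]],
        },
      ],
    },
  ],
};

const cells = tableCells(sample);
console.log(cells);
```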

No vendor, no lock-in

Opens the same way today and in 20 years. JSON has no proprietary owner.

RAG / AI ingestion — why JSON beats PDF

A retrieval pipeline that ingests JDF skips most of the work it would do on a PDF. The structure that PDF parsers try to reconstruct is already in the file.

Pipeline stage	PDF	JDF
Parse / extract	`pdfplumber` / `pymupdf` / `unstructured` — layout analysis, font heuristics, OCR fallback for image-only pages	`JSON.parse(content)` — no layout reconstruction, the structure is already in the file
Chunking	Token-windowed splits that frequently slice through tables, lists, footnotes	Each element (`text` / `richtext` / `table` / `list` / `image`) is a natural retrieval unit — no chunker config
Metadata	Synthesised after the fact (page number, "is this a heading?") and often wrong	First-class on every element: `type`, `heading`, page index, position, link target
Embedding noise	Repeated page headers / footers / page numbers leak into chunks	`header` and `footer` live in their own tree, never in content chunks
Re-indexing on edit	Re-parse + re-chunk + re-embed the whole PDF	Diff the JSON, re-embed only the changed elements
Tables	Cells smear across columns; multi-row headers collapse	`{ headers: [...], rows: [[...]] }` — every cell at its real coordinate
Images / figures	Dropped or stubbed as `[image]`	Stored in `resources.images` with alt text and a stable anchor — a vision step can fetch it at the exact retrieval point
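
The chunking and metadata rows can be sketched roughly as follows. The code assumes `text` elements keep their text in `content` and `list` elements in `items[].content`, as in the sample file on this page, and that tables use the `headers` / `rows` shape from the table above. The chunk ids, the `toChunks` name, and the `header` shape in the sample are hypothetical; `richtext` and `image` are left out because their shape is not shown here.

```javascript
// Extract plain text from one element; returns "" when there is nothing to embed.
function elementText(el) {
  if (el.type === "list") {
    return (el.items ?? []).map((item) => item.content).join("\n");
  }
  if (el.type === "table") {
    const rows = [el.headers ?? [], ...(el.rows ?? [])];
    return rows.map((row) => row.join(" | ")).join("\n");
  }
  return typeof el.content === "string" ? el.content : "";
}

// One chunk per element. Only pages[].elements[] is walked, so the
// document-level `header` and `footer` never end up in a chunk.
function toChunks(doc) {
  const chunks = [];
  doc.pages.forEach((page, pageIndex) => {
    (page.elements ?? []).forEach((el, elementIndex) => {
      const text = elementText(el);
      if (text === "") return; // e.g. shapes carry no text
      chunks.push({
        id: `p${pageIndex}-e${elementIndex}`, // hypothetical id scheme
        text,
        metadata: { type: el.type, heading: el.heading ?? null, page: pageIndex },
      });
    });
  });
  return chunks;
}

// Hypothetical document; the `header` shape is a guess for illustration.
const sampleDoc = {
  $jdf: "1.0.0",
  meta: { title: "Report" },
  header: { content: "ACME Confidential" },
  pages: [
    {
      elements: [
        { type: "text", content: "Quarterly report", heading: 1 },
        {
          type: "list",
          listType: "unordered",
          items: [{ content: "Revenue up" }, { content: "Costs flat" }],
        },
        { type: "shape" },
      ],
    },
  ],
};

const chunks = toChunks(sampleDoc);
console.log(chunks);
```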

Benchmarks coming. The wins above are structural — pipeline stages JDF removes entirely — not measured timings. We're running benchmarks on a public corpus (academic PDFs, financial filings, scanned reports) and will publish parse / chunk / embed / retrieval numbers here as soon as they're ready. If you run a comparison on your own corpus, please open an issue — we'd like to include it.
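
The "re-embed only the changed elements" idea from the table can be sketched as below. This is one possible approach, not a description of any shipped tooling: elements are matched by page index and position, which is a simplification because inserting an element shifts every later index. The comparison also relies on both versions serialising their keys in the same order. The function names are hypothetical.

```javascript
// Map each element to a key and its serialised form.
function elementKeyMap(doc) {
  const map = new Map();
  doc.pages.forEach((page, p) => {
    (page.elements ?? []).forEach((el, e) => {
      map.set(`p${p}-e${e}`, JSON.stringify(el));
    });
  });
  return map;
}

// Compare two versions and report which elements need work in the index.
function changedElements(oldDoc, newDoc) {
  const before = elementKeyMap(oldDoc);
  const after = elementKeyMap(newDoc);
  const toEmbed = []; // new or modified elements
  const toDelete = []; // elements that no longer exist
  for (const [key, json] of after) {
    if (before.get(key) !== json) toEmbed.push(key);
  }
  for (const key of before.keys()) {
    if (!after.has(key)) toDelete.push(key);
  }
  return { toEmbed, toDelete };
}

// Hypothetical before/after versions: only the heading text changes.
const v1 = {
  $jdf: "1.0.0",
  meta: { title: "Notes" },
  pages: [
    {
      elements: [
        { type: "text", content: "Draft", heading: 1 },
        { type: "text", content: "Body text" },
      ],
    },
  ],
};
const v2 = JSON.parse(JSON.stringify(v1));
v2.pages[0].elements[0].content = "Final";

const result = changedElements(v1, v2);
console.log(result); // only p0-e0 is flagged for re-embedding
```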

What a JDF file looks like

{
  "$jdf": "1.0.0",
  "meta": { "title": "Hello", "pageSize": "A4" },
  "styles": {
    "h1": { "fontSize": 22, "fontWeight": "bold" }
  },
  "pages": [{
    "elements": [
      { "type": "text", "content": "Hello, JDF",
        "heading": 1, "style": "h1",
        "position": { "x": 0, "y": 5 },
        "width": 166 },
      { "type": "list", "listType": "unordered",
        "items": [
          { "content": "Just JSON" },
          { "content": "Diffable" },
          { "content": "Editable anywhere" }
        ],
        "position": { "x": 0, "y": 25 }, "width": 166 }
    ]
  }]
}

Rendered, this produces the heading "Hello, JDF" followed by a three-item bulleted list: Just JSON, Diffable, Editable anywhere.
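
Reading such a file back needs nothing beyond `JSON.parse`. The sketch below parses a trimmed copy of the sample (positions, widths and styles omitted for brevity) and prints a plain-text outline. The `describe` helper is hypothetical and only handles the `text` and `list` shapes shown in the sample.

```javascript
// A trimmed copy of the sample file, as it would be read from disk.
const jdfSource = `{
  "$jdf": "1.0.0",
  "meta": { "title": "Hello", "pageSize": "A4" },
  "pages": [{
    "elements": [
      { "type": "text", "content": "Hello, JDF", "heading": 1 },
      { "type": "list", "listType": "unordered",
        "items": [{ "content": "Just JSON" }, { "content": "Diffable" }] }
    ]
  }]
}`;

// Build a plain-text outline of the document, one line per heading or list item.
function describe(doc) {
  const lines = [`${doc.meta.title} (JDF ${doc.$jdf})`];
  doc.pages.forEach((page, i) => {
    lines.push(`page ${i + 1}`);
    for (const el of page.elements) {
      if (el.type === "text") {
        const label = el.heading ? `h${el.heading}` : "text";
        lines.push(`  ${label}: ${el.content}`);
      } else if (el.type === "list") {
        for (const item of el.items) lines.push(`  - ${item.content}`);
      } else {
        lines.push(`  [${el.type}]`); // other element types are only named
      }
    }
  });
  return lines;
}

const outline = describe(JSON.parse(jdfSource));
console.log(outline.join("\n"));
```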

Features

Edit in place

Double-click any paragraph, heading, list item, table cell, or image and an inline editor opens for that element only. Enter commits. Auto-saves to disk in 150 ms.

Restructure on hover

Mouse over any element → floating bar appears with ↑ Move up · ↓ Move down · ⧉ Duplicate · × Delete. No right-click, no menu hunting.

Insert anything

Insert bar at the top of every page. Click to add: text, rich text, list, table, shape, image, collapsible section, auto-generated TOC.

Live JSON view

Toggle View ↔ JSON. Two-way bound. Edit the JSON directly and press Cmd+S, and the visual render follows. Edit visually, and the JSON updates live.

PDF import — full fidelity

Drag a PDF on the viewer. Every text run keeps its position, font, weight, color, opacity, link. Embedded images extracted. Vector shapes (rect/line/path) preserved with fills and strokes. Looks identical to the original — and it's editable.

PDF export

Round-trip back to .pdf. Honors page size, orientation, text colors, real TOC, embedded images. Renders text/list/table/collapsible/shape.

Markdown viewer

Open .md for a continuous-scroll, GitHub-style render with full GFM. Toggle to paged JDF view. Cmd+F highlights matches inline.

Undo / redo · multi-window

⌘Z · ⌘⇧Z — 100-step history covering every text edit, structural change, and JSON commit. ⌘N for a new window — compare two documents side-by-side.

JSON Schema + CLI

Draft-07 schema. jdf validate file.jdf reports path-level errors with Ajv. CI on three OSes (macOS, Linux, Windows) on every PR.

PDF vs JDF

Aspect	PDF	JDF
Format	Binary blob	Human-readable JSON
Open with	Adobe Reader (~250 MB)	Any text editor or JDF Reader (~14 MB)
Edit a typo	Acrobat or Word roundtrip	Double-click → type → done
Auto-save while editing	No	Yes: debounced, with status indicator
Raw view	No	Live two-way JSON view
`git diff`	Meaningless binary	Clean, line-level
Generate from code	reportlab, pdfkit, weeks	`JSON.stringify(doc)`
Schema validation	None	JSON Schema + Ajv CLI
Search programmatically	Custom binary parsing	grep, jq, ripgrep
Vendor lock-in	Adobe	None: it's JSON

Also runs on the web — `jdf.js`

A small JavaScript library that turns any .jdf URL into a fully styled, embedded viewer in your web page. Drop in one tag and you're done.

<link rel="stylesheet" href="https://unpkg.com/@uurtech/jdf@0.1.11/dist/jdfjs.css">
<script type="module" src="https://unpkg.com/@uurtech/jdf@0.1.11"></script>

<jdf src="examples/article.jdf"></jdf>

Format

Every JDF document has these top-level fields:

Field	Required	What it does
`$jdf`	yes	Format version (semver)
`meta`	yes	title, author, page size, margins, language
`styles`	no	Reusable named style definitions
`resources`	no	Embedded fonts and base64 images
`header` · `footer`	no	Repeating header/footer with template vars or full element trees
`pages`	yes	Array of pages, each with its own elements

Element types: text (with heading 1-6), richtext, image, table, list, shape, collapsible, toc. Full schema: spec/jdf-schema.json · Working example: hello-world.jdf.
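
Before reaching for the full schema, a generator can run a quick pre-flight check on the three required fields from the table above. The sketch below is an illustration only and does not replace validation against spec/jdf-schema.json. The `checkTopLevel` name, the error wording, and the simplified semver pattern (which ignores pre-release and build tags) are assumptions.

```javascript
// Simplified semver: MAJOR.MINOR.PATCH only.
const SEMVER = /^\d+\.\d+\.\d+$/;

// Return a list of problems with the required top-level fields; empty means OK.
function checkTopLevel(doc) {
  if (typeof doc !== "object" || doc === null || Array.isArray(doc)) {
    return ["document must be a JSON object"];
  }
  const errors = [];
  if (typeof doc.$jdf !== "string" || !SEMVER.test(doc.$jdf)) {
    errors.push('$jdf: expected a semver string such as "1.0.0"');
  }
  if (typeof doc.meta !== "object" || doc.meta === null) {
    errors.push("meta: required object is missing");
  }
  if (!Array.isArray(doc.pages)) {
    errors.push("pages: required array is missing");
  }
  return errors;
}

const okErrors = checkTopLevel({ $jdf: "1.0.0", meta: { title: "Hello" }, pages: [] });
const badErrors = checkTopLevel({ meta: { title: "Hello" } });
console.log(okErrors);
console.log(badErrors);
```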

Install

Desktop · macOS

brew tap uurtech/jdf
brew install jdf

Upgrade later with brew upgrade --cask jdf.

Web · jdf.js

npm install @uurtech/jdf

Or load it from a CDN, as in the embed snippet above.

Linux / Windows

Grab a build from the latest release — .deb, .AppImage, .rpm, .msi, .exe all built by GitHub Actions on every tag.

From source

git clone https://github.com/uurtech/jdf.git
cd jdf
pnpm install
pnpm tauri build

Requires Node 20+, pnpm 9+, Rust stable.
