JDF documents written in JSON.

Beautiful pages. Editable in any text editor. Diffable in git. Generated with one line of code. Searchable with grep. Validatable with a JSON Schema. Embeddable on any web page. Read more in less time: one file, every tool already speaks it.

Linux, Windows, and Intel Macs: grab a build from the latest release.

Rendered view — page-perfect layout, edit in place.

Why JDF

It's JSON. Every consequence below falls out of that.

LLMs read it natively

JSON is the language LLMs already speak. No PDF parsing, no broken columns, no lost tables. Hand a .jdf to GPT or Claude and it sees the structure exactly as you wrote it.

Every tool already works

cat, grep, jq, VS Code, every linter, every diff tool. No plugin, no learning curve.

git diff is real

Change one heading, the diff is one line. Documents review the same way as code.

Generate with one line

JSON.stringify(doc). No PDF library, no font dictionaries, no encoding rules.

JSON Schema validates

Autocomplete in your editor. CLI validation in CI. Type-safe generation.

Search is text search

grep "TODO" *.jdf works. jq queries pull every table cell on the page.

No vendor, no lock-in

Opens the same way today and in 20 years. JSON has no proprietary owner.
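
The "search is text search" point above also holds from code. The following is a minimal sketch, assuming table elements have the { headers: [...], rows: [[...]] } shape shown elsewhere on this page. The helper name tableCells and the sample document are hypothetical, not part of JDF or jdf.js.

```javascript
// Hypothetical helper: collect every table cell in a parsed JDF document.
// Assumes table elements look like { type: "table", headers: [...], rows: [[...]] }.
function tableCells(doc) {
  const cells = [];
  (doc.pages ?? []).forEach((page, pageIndex) => {
    for (const el of page.elements ?? []) {
      if (el.type !== "table") continue;
      (el.rows ?? []).forEach((row, rowIndex) => {
        row.forEach((value, colIndex) => {
          cells.push({
            pageIndex,
            rowIndex,
            colIndex,
            header: el.headers?.[colIndex] ?? null,
            value
          });
        });
      });
    }
  });
  return cells;
}

// Inline sample so the sketch runs on its own; a real script would
// JSON.parse the contents of a .jdf file instead.
const sampleDoc = {
  "$jdf": "1.0.0",
  meta: { title: "Prices", pageSize: "A4" },
  pages: [{
    elements: [
      { type: "text", content: "Price list" },
      { type: "table", headers: ["Item", "Price"], rows: [["Pen", "2"], ["Ink", "5"]] }
    ]
  }]
};

console.log(tableCells(sampleDoc));
```

Each result carries its page, row, and column index, so a match can be traced back to the exact cell in the file.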

RAG / AI ingestion — why JSON beats PDF

A retrieval pipeline that ingests JDF skips most of the work it would do on a PDF. The structure that PDF parsers try to reconstruct is already in the file.

| Pipeline stage | PDF | JDF |
| --- | --- | --- |
| Parse / extract | pdfplumber / pymupdf / unstructured: layout analysis, font heuristics, OCR fallback for image-only pages | JSON.parse(content): no layout reconstruction, the structure is already in the file |
| Chunking | Token-windowed splits that frequently slice through tables, lists, footnotes | Each element (text / richtext / table / list / image) is a natural retrieval unit, with no chunker config |
| Metadata | Synthesised after the fact (page number, "is this a heading?") and often wrong | First-class on every element: type, heading, page index, position, link target |
| Embedding noise | Repeated page headers / footers / page numbers leak into chunks | header and footer live in their own tree, never in content chunks |
| Re-indexing on edit | Re-parse + re-chunk + re-embed the whole PDF | Diff the JSON, re-embed only the changed elements |
| Tables | Cells smear across columns; multi-row headers collapse | { headers: [...], rows: [[...]] }, with every cell at its real coordinate |
| Images / figures | Dropped or stubbed as [image] | Stored in resources.images with alt text and a stable anchor, so a vision step can fetch it at the exact retrieval point |
Benchmarks coming. The wins above are structural — pipeline stages JDF removes entirely — not measured timings. We're running benchmarks on a public corpus (academic PDFs, financial filings, scanned reports) and will publish parse / chunk / embed / retrieval numbers here as soon as they're ready. If you run a comparison on your own corpus, please open an issue — we'd like to include it.
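
The chunking and re-indexing rows above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's pipeline: the chunk shape ({ id, text, meta, fingerprint }) and the helper names are hypothetical, elements are identified by their page and element index, and only string content, list items, and table cells are turned into text.

```javascript
// Sketch: one retrieval chunk per JDF element, with metadata copied from the element.
// Chunk shape and helper names are hypothetical.
function elementText(el) {
  if (el.type === "list") {
    return (el.items ?? []).map((item) => item.content).join("\n");
  }
  if (el.type === "table") {
    const lines = [(el.headers ?? []).join(" | ")];
    for (const row of el.rows ?? []) lines.push(row.join(" | "));
    return lines.join("\n");
  }
  // Assumption: other element types carry plain text in `content`, if anything.
  return typeof el.content === "string" ? el.content : "";
}

function toChunks(doc) {
  const chunks = [];
  doc.pages.forEach((page, pageIndex) => {
    page.elements.forEach((el, elementIndex) => {
      chunks.push({
        id: `p${pageIndex}-e${elementIndex}`,
        text: elementText(el),
        meta: {
          type: el.type,
          heading: el.heading ?? null,
          pageIndex,
          position: el.position ?? null
        },
        // The serialized element doubles as a change fingerprint.
        fingerprint: JSON.stringify(el)
      });
    });
  });
  return chunks;
}

// Chunks that are new or whose element changed since the previous index run.
function changedChunks(previous, current) {
  const seen = new Map(previous.map((c) => [c.id, c.fingerprint]));
  return current.filter((c) => seen.get(c.id) !== c.fingerprint);
}

const beforeDoc = {
  pages: [{
    elements: [
      { type: "text", content: "Hello, JDF", heading: 1 },
      { type: "list", listType: "unordered", items: [{ content: "Just JSON" }, { content: "Diffable" }] }
    ]
  }]
};
const afterDoc = JSON.parse(JSON.stringify(beforeDoc));
afterDoc.pages[0].elements[0].content = "Hello again";

// Only the edited heading (p0-e0) needs re-embedding.
const staleChunks = changedChunks(toChunks(beforeDoc), toChunks(afterDoc));
console.log(staleChunks.map((c) => c.id));
```

Index-based ids are the simplest choice but shift when an element is inserted or removed, which marks every later element on that page as changed. A pipeline that cares about this would need a stable per-element identifier.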

What a JDF file looks like

{
  "$jdf": "1.0.0",
  "meta": { "title": "Hello", "pageSize": "A4" },
  "styles": {
    "h1": { "fontSize": 22, "fontWeight": "bold" }
  },
  "pages": [{
    "elements": [
      { "type": "text", "content": "Hello, JDF",
        "heading": 1, "style": "h1",
        "position": { "x": 0, "y": 5 },
        "width": 166 },
      { "type": "list", "listType": "unordered",
        "items": [
          { "content": "Just JSON" },
          { "content": "Diffable" },
          { "content": "Editable anywhere" }
        ],
        "position": { "x": 0, "y": 25 }, "width": 166 }
    ]
  }]
}

Hello, JDF

  • Just JSON
  • Diffable
  • Editable anywhere
→ rendered
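
Generating the same document from code might look like the sketch below. buildDoc is a hypothetical helper, not a jdf.js API. The field names and values simply mirror the example above.

```javascript
// Sketch: build a JDF document as a plain object and serialize it.
// buildDoc is a hypothetical helper; field names mirror the example above.
function buildDoc(title, bullets) {
  return {
    "$jdf": "1.0.0",
    meta: { title, pageSize: "A4" },
    styles: { h1: { fontSize: 22, fontWeight: "bold" } },
    pages: [{
      elements: [
        {
          type: "text", content: title,
          heading: 1, style: "h1",
          position: { x: 0, y: 5 }, width: 166
        },
        {
          type: "list", listType: "unordered",
          items: bullets.map((content) => ({ content })),
          position: { x: 0, y: 25 }, width: 166
        }
      ]
    }]
  };
}

const helloDoc = buildDoc("Hello, JDF", ["Just JSON", "Diffable", "Editable anywhere"]);

// Two-space indentation puts one property per line, which keeps git diffs small.
const jdfText = JSON.stringify(helloDoc, null, 2);
console.log(jdfText);
```

In Node, the resulting string can be saved with fs.writeFileSync under any name ending in .jdf.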

Features

Edit in place

Double-click any paragraph, heading, list item, table cell, image — inline editor opens for that element only. Enter commits. Auto-saves to disk in 150ms.

Restructure on hover

Mouse over any element → floating bar appears with ↑ Move up · ↓ Move down · ⧉ Duplicate · × Delete. No right-click, no menu hunting.

Insert anything

Insert bar at the top of every page. Click to add: text, rich text, list, table, shape, image, collapsible section, auto-generated TOC.

Live JSON view

Toggle View ↔ JSON. Two-way bound. Edit the JSON directly and press Cmd+S: the visual render follows. Edit visually and the JSON updates live.

PDF import — full fidelity

Drag a PDF onto the viewer. Every text run keeps its position, font, weight, color, opacity, link. Embedded images extracted. Vector shapes (rect/line/path) preserved with fills and strokes. Looks identical to the original, and it's editable.

PDF export

Round-trip back to .pdf. Honors page size, orientation, text colors, real TOC, embedded images. Renders text/list/table/collapsible/shape.

Markdown viewer

Open .md for a continuous-scroll, GitHub-style render with full GFM. Toggle to paged JDF view. Cmd+F highlights matches inline.

Undo / redo · multi-window

⌘Z · ⌘⇧Z — 100-step history covering every text edit, structural change, and JSON commit. ⌘N for a new window — compare two documents side-by-side.

JSON Schema + CLI

Draft-07 schema. jdf validate file.jdf reports path-level errors with Ajv. CI on three OSes (macOS, Linux, Windows) on every PR.

PDF vs JDF

|  | PDF | JDF |
| --- | --- | --- |
| Format | Binary blob | Human-readable JSON |
| Open with | Adobe Reader (~250 MB) | Any text editor or JDF Reader (~14 MB) |
| Edit a typo | Acrobat or Word roundtrip | Double-click → type → done |
| Auto-save while editing |  | ✓ debounced, with status indicator |
| Raw view |  | Live two-way JSON view |
| git diff | Meaningless binary | Clean, line-level |
| Generate from code | reportlab, pdfkit, weeks | JSON.stringify(doc) |
| Schema validation | None | JSON Schema + Ajv CLI |
| Search programmatically | Custom binary parsing | grep, jq, ripgrep |
| Vendor lock-in | Adobe | None: it's JSON |

Also runs on the web — jdf.js

A small JavaScript library that turns any .jdf URL into a fully styled, embedded viewer in your web page. Drop in one tag and you're done.

<link rel="stylesheet" href="https://unpkg.com/@uurtech/jdf@0.1.11/dist/jdfjs.css">
<script type="module" src="https://unpkg.com/@uurtech/jdf@0.1.11"></script>

<jdf src="examples/article.jdf"></jdf>

Format

Every JDF document has these top-level fields:

| Field | Required | What it does |
| --- | --- | --- |
| $jdf | yes | Format version (semver) |
| meta | yes | title, author, page size, margins, language |
| styles | no | Reusable named style definitions |
| resources | no | Embedded fonts and base64 images |
| header · footer | no | Repeating header/footer with template vars or full element trees |
| pages | yes | Array of pages, each with its own elements |
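
As a rough illustration of the table above, the sketch below checks only the top-level fields. It is not the project's JSON Schema: checkTopLevel is a hypothetical helper, the value types (string, object, array) are assumptions inferred from the example document, header and footer are left unchecked because their shape is not specified here, and the version pattern accepts plain MAJOR.MINOR.PATCH only.

```javascript
// Sketch: check the top-level fields of a parsed JDF document.
// Not the project's schema; types are assumptions inferred from the example file.
const REQUIRED_FIELDS = { "$jdf": "string", meta: "object", pages: "array" };
const OPTIONAL_FIELDS = { styles: "object", resources: "object" };
const SEMVER_LIKE = /^\d+\.\d+\.\d+$/; // simplified: no pre-release or build tags

function kindOf(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  return typeof value;
}

// Returns a list of path-prefixed error strings; an empty list means no problems found.
function checkTopLevel(doc) {
  const errors = [];
  for (const [field, kind] of Object.entries(REQUIRED_FIELDS)) {
    if (!(field in doc)) {
      errors.push(`/${field}: required field is missing`);
    } else if (kindOf(doc[field]) !== kind) {
      errors.push(`/${field}: expected ${kind}, got ${kindOf(doc[field])}`);
    }
  }
  for (const [field, kind] of Object.entries(OPTIONAL_FIELDS)) {
    if (field in doc && kindOf(doc[field]) !== kind) {
      errors.push(`/${field}: expected ${kind}, got ${kindOf(doc[field])}`);
    }
  }
  if (typeof doc["$jdf"] === "string" && !SEMVER_LIKE.test(doc["$jdf"])) {
    errors.push(`/$jdf: "${doc["$jdf"]}" is not in MAJOR.MINOR.PATCH form`);
  }
  return errors;
}

console.log(checkTopLevel({ "$jdf": "1.0.0", meta: { title: "Hello" }, pages: [] }));
console.log(checkTopLevel({ "$jdf": "1.0", meta: { title: "Hello" } }));
```

The first call reports no errors; the second reports the missing pages field and the malformed version. For real validation, use the published schema and the jdf validate command described under Features.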

Install

Desktop · macOS

brew tap uurtech/jdf
brew install jdf

Upgrade later with brew upgrade --cask jdf.

Web · jdf.js

npm install @uurtech/jdf

Or via CDN. See embed docs →

Linux / Windows

Grab a build from the latest release: .deb, .AppImage, .rpm, .msi, and .exe are all built by GitHub Actions on every tag.

From source

git clone https://github.com/uurtech/jdf.git
cd jdf
pnpm install
pnpm tauri build

Requires Node 20+, pnpm 9+, Rust stable.