Turn messy, unstructured text into clean structured data. Pick a document type, paste (or edit) the text, and the LLM extracts the fields into JSON — grammar-constrained so the output is always valid, never a broken half-response.
Document type
Runs a local LLM on CPU — expect 8–20 s. Output is grammar-constrained to valid JSON.
Extracted JSON
Fields will appear here as clean JSON.
Pick a schema
Choose the document type — the target fields are fixed up front so output is predictable.
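As a rough illustration of what "fixed up front" can look like, the sketch below keeps one field list per document type. The document types, field names, and the `fields_for` helper are all hypothetical; the page does not list the actual schemas.

```python
# Hypothetical schemas: document types and field names are illustrative only.
SCHEMAS: dict[str, list[str]] = {
    "invoice": ["vendor", "invoice_number", "date", "total"],
    "contract": ["parties", "effective_date", "term", "governing_law"],
}


def fields_for(doc_type: str) -> list[str]:
    """Return the fixed field list for a document type, or raise if unknown."""
    try:
        return SCHEMAS[doc_type]
    except KeyError:
        raise ValueError(f"unknown document type: {doc_type!r}") from None
```

Because the field list is decided before the model runs, every later step (prompt, grammar, coercion) can be derived from it.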
Prompt
The model is asked to extract exactly those fields, using null for anything missing.
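A prompt along these lines might be assembled as follows. This is a minimal sketch: the wording and the `build_prompt` name are assumptions, not the prompt the demo actually uses.

```python
def build_prompt(fields: list[str], text: str) -> str:
    """Ask for exactly the given fields, with null for anything missing."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the document below.\n"
        f"Fields: {field_list}\n"
        "Return a JSON object with exactly those keys. "
        "Use null for any field that is missing from the document.\n\n"
        f"Document:\n{text}\n"
    )
```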
Grammar constraint
A GBNF grammar forces the model to emit valid JSON — no markdown, no trailing prose.
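One way to derive such a grammar from the field list is sketched below. The rules are a simplified assumption (values limited to strings, numbers, and null, with keys in a fixed order), and `build_grammar` is a hypothetical helper, not the demo's own grammar.

```python
import json

# Shared rules in GBNF, the grammar format used by llama.cpp.
# Values are limited to strings, numbers, and null to keep the sketch small.
VALUE_RULES = r'''
value  ::= string | number | "null"
string ::= "\"" ( [^"\\] | "\\" ["\\/bfnrt] )* "\""
number ::= "-"? [0-9]+ ( "." [0-9]+ )?
ws     ::= [ \t\n]*
'''


def build_grammar(fields: list[str]) -> str:
    """Build a GBNF grammar for a JSON object with exactly `fields`, in order."""
    pairs = []
    for name in fields:
        # json.dumps twice: once to quote the key as JSON,
        # once to escape that quoted key as a GBNF string literal.
        key_literal = json.dumps(json.dumps(name))
        pairs.append(f'{key_literal} ws ":" ws value')
    body = ' ws "," ws '.join(pairs)
    return f'root ::= "{{" ws {body} ws "}}"\n' + VALUE_RULES.strip() + "\n"
```

If the runtime is llama-cpp-python (an assumption; the page names only GBNF), the resulting string can be passed to `LlamaGrammar.from_string` and supplied as the `grammar` argument when generating.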
Coerce to schema
The result is mapped onto the fixed field set so the UI is always stable.
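The coercion step could be as small as the sketch below, which assumes the model output arrives as a JSON string. The `coerce_to_schema` name and the fallback behaviour (all fields null on unusable output) are assumptions.

```python
import json


def coerce_to_schema(raw_json: str, fields: list[str]) -> dict:
    """Map model output onto the fixed field set.

    Missing fields become None, unexpected keys are dropped, and output
    that is not a JSON object yields a record with every field set to None.
    """
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {name: data.get(name) for name in fields}
```

Whatever the model returns, the UI then receives the same keys in the same order.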
Local model
Qwen2.5-1.5B on CPU. Small but reliable for bounded extraction tasks like this.
Production swap
Add confidence scoring and validation rules, and swap in Claude or GPT for tougher documents. The extraction contract stays the same.
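Validation rules can sit on top of the same extracted record. The sketch below is illustrative only: the invoice fields, the specific checks, and the `validate_invoice` name are assumptions, and real rules would depend on your systems of record.

```python
from datetime import date


def validate_invoice(record: dict) -> list[str]:
    """Return the names of fields that fail a check and may need human review."""
    problems = []

    # Vendor must be present and non-empty.
    if not record.get("vendor"):
        problems.append("vendor")

    # Total must be a non-negative number (booleans are rejected explicitly).
    total = record.get("total")
    if isinstance(total, bool) or not isinstance(total, (int, float)) or total < 0:
        problems.append("total")

    # Date must be an ISO 8601 calendar date such as 2024-03-01.
    try:
        date.fromisoformat(record.get("date") or "")
    except (TypeError, ValueError):
        problems.append("date")

    return problems
```

Fields returned by a check like this are natural candidates for a review queue, while clean records can pass straight through.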
We build production extraction for contracts, invoices, claims, and forms — with confidence scores, human-in-the-loop review for low-confidence fields, and validation against your systems of record.
Book a 30-minute consultation. We will walk through the use case, sketch the value case, and tell you honestly whether we can help.