Reliable Structured Output From LLMs (JSON, Tools, Schemas)

If your app needs machine-readable output from an LLM, do not parse freeform prose with regex. Several current APIs can constrain the model's output to a schema you supply. Here's how to do it reliably.

The Old, Fragile Way

Asking the model to 'respond in JSON' and parsing the result works maybe 95% of the time, and the other 5% (a stray markdown fence, a trailing comment) crashes production at 2am. Don't rely on it.
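To make the failure mode concrete, here is a minimal sketch that feeds three simulated replies to JSON.parse. The reply strings and the tryParse helper are invented for illustration; they are not captured model output.

```typescript
// Build the markdown fence programmatically so this snippet stays easy to embed.
const FENCE = "`".repeat(3);

// Three simulated replies to a "respond in JSON" prompt (illustrative only).
const cleanReply = `{"vendor": "Acme", "total": 42}`;
const fencedReply = `${FENCE}json\n{"vendor": "Acme", "total": 42}\n${FENCE}`;
const commentedReply = `{"vendor": "Acme", "total": 42 // USD\n}`;

type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Wrap JSON.parse so a malformed reply becomes a value instead of an exception.
function tryParse(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

console.log(tryParse(cleanReply).ok);     // true
console.log(tryParse(fencedReply).ok);    // false: the fence is not JSON
console.log(tryParse(commentedReply).ok); // false: JSON has no comments
```

Stripping fences with a regex fixes the second case but not the third, which is why the rest of this article constrains the output instead of cleaning it up afterwards.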

Use Structured Outputs / Tool Schemas

Provide a JSON schema and let the API constrain generation to it. Define the shape with a validation library like Zod and pass it through:

import { z } from "zod";

const Invoice = z.object({
  vendor: z.string(),
  total: z.number(),
  dueDate: z.string(),
  lineItems: z.array(z.object({ name: z.string(), amount: z.number() })),
});

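// extractWithSchema is a placeholder for your provider call: it should send the
// schema with the request and return the parsed JSON reply for the input `text`.
// Invoice.parse then validates the result and throws a ZodError on a mismatch.
const data = Invoice.parse(await extractWithSchema(text, Invoice));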
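The snippet above leaves extractWithSchema undefined. One possible shape for it is sketched below, under stated assumptions: callModel is a hypothetical transport you would implement against your provider's SDK, the JSON Schema is passed in explicitly rather than derived from the Zod object, and the Validator interface stands in for any object with a parse method (Zod schemas expose one). The signature differs from the two-argument call above so the sketch has no dependencies.

```typescript
// Anything with a parse method of this shape works; Zod schemas expose one.
interface Validator<T> {
  parse(value: unknown): T;
}

// Hypothetical transport: sends the prompt plus JSON Schema to a provider and
// resolves with the raw text of the model's reply.
type CallModel = (request: { prompt: string; jsonSchema: object }) => Promise<string>;

async function extractWithSchema<T>(
  text: string,
  validator: Validator<T>,
  jsonSchema: object,
  callModel: CallModel,
): Promise<T> {
  const raw = await callModel({
    prompt: `Extract the invoice fields from this text:\n${text}`,
    jsonSchema,
  });
  // JSON.parse throws on malformed output; validator.parse throws on a wrong shape.
  return validator.parse(JSON.parse(raw));
}

// Rough JSON Schema counterpart of the Invoice shape, written by hand for this sketch.
const invoiceJsonSchema = {
  type: "object",
  properties: {
    vendor: { type: "string" },
    total: { type: "number" },
    dueDate: { type: "string" },
    lineItems: {
      type: "array",
      items: {
        type: "object",
        properties: { name: { type: "string" }, amount: { type: "number" } },
        required: ["name", "amount"],
        additionalProperties: false,
      },
    },
  },
  required: ["vendor", "total", "dueDate", "lineItems"],
  additionalProperties: false,
};
```

Injecting callModel keeps the helper testable with a stub and keeps provider-specific request fields out of your extraction code.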

Always Validate on Receipt

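Even with schema enforcement, validate the parsed object in your code (Zod's parse). Enforcement covers the shape, not the meaning: a reply can be cut off at the token limit, or carry a well-typed but wrong value such as a date in an unexpected format. Validation catches these cases and gives you typed, trustworthy data downstream.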
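Shape validation can be followed by checks that a schema cannot easily express. The sketch below assumes the object has already passed Zod's parse; the Invoice interface mirrors the schema by hand (with Zod you could derive it via z.infer), and checkInvoice with its two rules is a hypothetical example, not a complete rule set.

```typescript
// Mirrors the Invoice schema from earlier in the article.
interface Invoice {
  vendor: string;
  total: number;
  dueDate: string;
  lineItems: { name: string; amount: number }[];
}

// Returns a list of problems; an empty list means the invoice passed both checks.
function checkInvoice(invoice: Invoice): string[] {
  const problems: string[] = [];

  // Format check only: this does not verify that the calendar date exists.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(invoice.dueDate)) {
    problems.push(`dueDate is not in YYYY-MM-DD format: ${invoice.dueDate}`);
  }

  // Cross-field check: line items should add up to the stated total,
  // with a small tolerance for floating-point rounding.
  const sum = invoice.lineItems.reduce((acc, item) => acc + item.amount, 0);
  if (Math.abs(sum - invoice.total) > 0.005) {
    problems.push(`line items sum to ${sum}, but total is ${invoice.total}`);
  }

  return problems;
}

const suspicious: Invoice = {
  vendor: "Acme",
  total: 50,
  dueDate: "next Friday",
  lineItems: [
    { name: "Widgets", amount: 10.5 },
    { name: "Shipping", amount: 31.5 },
  ],
};

console.log(checkInvoice(suspicious).length); // 2
```

Returning a list instead of throwing lets the caller decide whether to retry the extraction, flag the record for review, or reject it.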

Tool Calling Is Structured Output

When an LLM 'calls a tool', it's producing arguments that match your input schema, which is the same mechanism. Use tools for actions and structured outputs for data extraction.
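Because tool arguments are model output too, the same validate-on-receipt rule applies before a handler runs. The following is a provider-neutral sketch: RegisteredTool, defineTool, dispatch, and the send_reminder tool are all hypothetical names, and real SDKs use their own field names for the tool definition and the call payload.

```typescript
// A tool as the dispatcher sees it: metadata for the model plus one entry point.
interface RegisteredTool {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema advertised to the model
  invoke: (rawArgs: unknown) => string;
}

// Pairs a handler with its argument validator so the handler never sees raw input.
function defineTool<Args>(def: {
  name: string;
  description: string;
  inputSchema: object;
  parseArgs: (raw: unknown) => Args;
  run: (args: Args) => string;
}): RegisteredTool {
  return {
    name: def.name,
    description: def.description,
    inputSchema: def.inputSchema,
    invoke: (rawArgs) => def.run(def.parseArgs(rawArgs)),
  };
}

const sendReminder = defineTool({
  name: "send_reminder",
  description: "Schedule a payment reminder for an invoice.",
  inputSchema: {
    type: "object",
    properties: {
      vendor: { type: "string" },
      daysBefore: { type: "integer", minimum: 0 },
    },
    required: ["vendor", "daysBefore"],
    additionalProperties: false,
  },
  // Hand-written validation; a Zod schema's parse could be used here instead.
  parseArgs: (raw: unknown): { vendor: string; daysBefore: number } => {
    if (typeof raw !== "object" || raw === null) throw new Error("arguments must be an object");
    const { vendor, daysBefore } = raw as Record<string, unknown>;
    if (typeof vendor !== "string") throw new Error("vendor must be a string");
    if (typeof daysBefore !== "number" || !Number.isInteger(daysBefore) || daysBefore < 0) {
      throw new Error("daysBefore must be a non-negative integer");
    }
    return { vendor, daysBefore };
  },
  run: ({ vendor, daysBefore }) =>
    `Reminder set for ${vendor}, ${daysBefore} days before the due date`,
});

// Routes a model-issued call to the matching tool, validating arguments on the way.
function dispatch(call: { name: string; args: unknown }, tools: RegisteredTool[]): string {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  return tool.invoke(call.args);
}

console.log(dispatch({ name: "send_reminder", args: { vendor: "Acme", daysBefore: 3 } }, [sendReminder]));
// Reminder set for Acme, 3 days before the due date
```

Keeping the validator next to the handler means an action with side effects cannot run on arguments that were never checked.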
