If your app needs the LLM's output to be machine-readable, do not parse freeform prose with regex. Many current LLM APIs can constrain the output to a schema you supply. Here's how to do it reliably.
The Old, Fragile Way
Asking the model to 'respond in JSON' and parsing the result works most of the time. The rare failure (a stray markdown fence, a trailing comment) is what crashes production at 2am. Don't rely on it.
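To make the failure mode concrete, here is a minimal sketch. The reply strings are made-up illustrations of what a model might send back, not captured output, and `parsesAsJson` is a hypothetical helper name.

```typescript
// Naive approach: assume the reply is bare JSON and hand it straight to JSON.parse.
function parsesAsJson(reply: string): boolean {
  try {
    JSON.parse(reply);
    return true;
  } catch (err) {
    return false; // in production, this is the unhandled exception
  }
}

// Built with repeat() only to keep a literal fence out of this listing.
const fence = "`".repeat(3);

// Illustrative replies (assumed, not real model output).
const replies = {
  clean: '{"vendor": "Acme", "total": 120.5}',
  fenced: fence + 'json\n{"vendor": "Acme", "total": 120.5}\n' + fence,
  commented: '{"vendor": "Acme", "total": 120.5} // includes tax',
};

for (const [label, reply] of Object.entries(replies)) {
  console.log(label, parsesAsJson(reply) ? "ok" : "FAILED");
}
```

Only the first reply parses. The other two contain the same data, which is why stripping fences and comments by hand tends to turn into an endless list of special cases.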
Use Structured Outputs / Tool Schemas
Provide a JSON schema and let the API constrain generation to it. Define the shape with a validation library like Zod and pass it through:
    import { z } from "zod";

    const Invoice = z.object({
      vendor: z.string(),
      total: z.number(),
      dueDate: z.string(),
      lineItems: z.array(z.object({ name: z.string(), amount: z.number() })),
    });

    // extractWithSchema stands in for your provider's schema-constrained call; it is not defined here.
    // Constrain the model to this exact shape, then validate on receipt.
    const data = Invoice.parse(await extractWithSchema(text, Invoice));

Always Validate on Receipt
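Even with schema enforcement, validate the parsed object in your code (Zod's parse). It catches edge cases, such as a response cut off at the token limit or a code path where enforcement was never actually switched on, and gives you typed, trustworthy data downstream.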
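What validating on receipt amounts to can be sketched without any library. The hand-written guard below only stands in for what Zod's parse does for you; `validateInvoice`, `isRecord`, and the `Result` type are illustrative names, and the ISO date check on `dueDate` is an added assumption that goes beyond the schema above. In real code you would keep the Zod schema rather than maintain checks like these by hand.

```typescript
interface LineItem { name: string; amount: number }
interface Invoice { vendor: string; total: number; dueDate: string; lineItems: LineItem[] }

// Either a typed value or a list of human-readable problems.
type Result = { ok: true; value: Invoice } | { ok: false; errors: string[] };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateInvoice(raw: unknown): Result {
  if (!isRecord(raw)) return { ok: false, errors: ["expected an object"] };

  const errors: string[] = [];
  const { vendor, total, dueDate, lineItems } = raw;

  if (typeof vendor !== "string") errors.push("vendor: expected string");
  if (typeof total !== "number" || !Number.isFinite(total)) {
    errors.push("total: expected finite number");
  }
  // Assumption for this sketch: due dates arrive as YYYY-MM-DD.
  if (typeof dueDate !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(dueDate)) {
    errors.push("dueDate: expected YYYY-MM-DD string");
  }
  if (!Array.isArray(lineItems)) {
    errors.push("lineItems: expected array");
  } else {
    lineItems.forEach((item, i) => {
      if (!isRecord(item) || typeof item.name !== "string" || typeof item.amount !== "number") {
        errors.push("lineItems[" + i + "]: expected { name: string, amount: number }");
      }
    });
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: raw as unknown as Invoice };
}

// Usage: one well-formed payload, one with a stringified number and a vague date.
const good = validateInvoice(JSON.parse(
  '{"vendor":"Acme","total":120.5,"dueDate":"2025-03-01","lineItems":[{"name":"Widgets","amount":120.5}]}'
));
const bad = validateInvoice({ vendor: "Acme", total: "120.50", dueDate: "soon", lineItems: [] });

console.log(good.ok);
console.log(bad.ok ? [] : bad.errors);
```

Returning a result object instead of throwing lets the caller decide whether to retry the model call, fall back, or surface the errors.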
Tool Calling Is Structured Output
When an LLM 'calls a tool', it's producing arguments that match your input schema — the same mechanism. Use tools for actions, structured outputs for data extraction.
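The sketch below shows the parallel: a tool is described by a name plus a schema for its arguments, and the arguments the model produces get validated before anything runs. The field names (`name`, `description`, `inputSchema`), the `ToolCall` shape, and the `send_reminder` tool are all hypothetical; each provider spells these details slightly differently, so treat this as an outline rather than any particular API.

```typescript
// Generic tool definition: a name, a description, and a JSON Schema for the arguments.
const sendReminderTool = {
  name: "send_reminder",
  description: "Email a payment reminder for an overdue invoice.",
  inputSchema: {
    type: "object",
    properties: {
      vendor: { type: "string" },
      daysOverdue: { type: "integer", minimum: 0 },
    },
    required: ["vendor", "daysOverdue"],
    additionalProperties: false,
  },
};

// What the model hands back when it "calls" the tool (assumed shape).
interface ToolCall { name: string; arguments: unknown }
interface ReminderArgs { vendor: string; daysOverdue: number }

// Validate on receipt, exactly as with extracted data.
function parseReminderArgs(args: unknown): ReminderArgs {
  if (typeof args !== "object" || args === null) {
    throw new Error("arguments must be an object");
  }
  const { vendor, daysOverdue } = args as Record<string, unknown>;
  if (typeof vendor !== "string") throw new Error("vendor must be a string");
  if (typeof daysOverdue !== "number" || !Number.isInteger(daysOverdue) || daysOverdue < 0) {
    throw new Error("daysOverdue must be a non-negative integer");
  }
  return { vendor, daysOverdue };
}

function dispatch(call: ToolCall): string {
  if (call.name !== sendReminderTool.name) throw new Error("unknown tool: " + call.name);
  const args = parseReminderArgs(call.arguments);
  // A real handler would perform the side effect here; this one only reports it.
  return "reminder queued for " + args.vendor + " (" + args.daysOverdue + " days overdue)";
}

const result = dispatch({ name: "send_reminder", arguments: { vendor: "Acme", daysOverdue: 12 } });
console.log(result); // prints "reminder queued for Acme (12 days overdue)"
```

Because a tool call triggers an action rather than just returning data, rejecting bad arguments before the handler runs matters even more here than in extraction.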
