ANTHROPIC-CLAUDE PUB_DATE: 2026.03.17

AGENTIC CODING NEEDS A HARNESS: SHIP THE GUARDRAILS BEFORE THE AGENTS

Coding agents are useful, but without a real harness and governance they’ll break prod faster than they help you ship.

Simon Willison explains how coding agents actually work: LLMs wrapped in a tool-calling harness, with chat prompts, token limits, and invisible orchestration under the hood (How coding agents work). Ameer PK shows a practical dual-process pattern: a fast "System 1" LLM planner, plus a deterministic "System 2" orchestrator that validates, executes, and gates actions (System 1/2 architecture).
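To make the dual-process split concrete, here is a minimal Python sketch. It is not Ameer PK's implementation: `fake_planner`, `Tool`, `Step`, and `Orchestrator` are hypothetical names, and the planner is a hard-coded stand-in for an LLM call. The point is only that the plan is plain data and that a deterministic loop decides what actually runs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    """A callable the agent may use (hypothetical registry entry)."""
    run: Callable[[dict], str]
    high_risk: bool = False  # high-risk tools need explicit approval


@dataclass(frozen=True)
class Step:
    """One planned action, expressed as data rather than as code."""
    tool: str
    args: dict


def fake_planner(task: str) -> list[Step]:
    """Stand-in for the 'System 1' LLM call (assumption: no real model here)."""
    return [
        Step("read_file", {"path": "README.md"}),
        Step("drop_table", {"table": "customers"}),
    ]


class Orchestrator:
    """Deterministic 'System 2': validate, gate, then execute each step."""

    def __init__(self, tools: dict[str, Tool], approve: Callable[[Step], bool]):
        self.tools = tools
        self.approve = approve  # human-in-the-loop callback

    def execute(self, plan: list[Step]) -> list[tuple[str, str]]:
        results = []
        for step in plan:
            tool = self.tools.get(step.tool)
            if tool is None:
                results.append((step.tool, "rejected: unknown tool"))
            elif tool.high_risk and not self.approve(step):
                results.append((step.tool, "blocked: approval denied"))
            else:
                results.append((step.tool, tool.run(step.args)))
        return results


# Usage: the approver denies everything, so only the low-risk step runs.
tools = {
    "read_file": Tool(run=lambda a: f"read {a['path']}"),
    "drop_table": Tool(run=lambda a: f"dropped {a['table']}", high_risk=True),
}
orchestrator = Orchestrator(tools, approve=lambda step: False)
results = orchestrator.execute(fake_planner("clean up the database"))
```

With an approver that always says no, `read_file` executes and `drop_table` is blocked before it reaches the tool.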

The risk is real. Nate’s post and video describe an agent wiping 2.5 years of customer data in minutes while the engineer who built it watched, with no brakes to pull (newsletter, video). The New Stack argues that agents can write code but don’t do software engineering (analysis). Codebridge’s pattern roundup pushes reflection, planning, and human-in-the-loop as baseline controls, and cites a Gartner stat that 40% of enterprise apps will add agents by 2026 (design patterns).

The throughline: treat agents like untrusted operators. Build a harness with verification gates, least privilege, dry runs, and rollback-first thinking before you let them near prod data (governance lens, harness engineering).

[ WHY_IT_MATTERS ]
01.

Agent reliability is a governance problem: without guardrails, tool-calling code-writers can cause fast, large failures on real systems and data.

02.

Adopting proven patterns (reflection, planning, verification gates, HITL) cuts error rates and limits blast radius before agents touch production.

[ WHAT_TO_TEST ]
  • 01.

    Run a game-day with a read-only agent: require a JSON plan, schema-validate it, whitelist tools, inject SQL LIMITs, and measure blocked unsafe actions.

  • 02.

    Drill rollback: simulate mass delete in staging with backup/restore RTO goals, change windows, and manual approval gates; time the full recovery path.
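The first drill above could be wired up roughly like this. Treat it as a sketch under stated assumptions: the plan format (a JSON array of `{"tool", "args"}` objects), the tool names, and the `MAX_ROWS` cap are all hypothetical, the shape check is hand-rolled rather than a real JSON Schema validator, and the regex-based SQL gate only understands a single statement with an optional trailing `LIMIT n`. A production harness would want a proper SQL parser.

```python
from __future__ import annotations

import json
import re

ALLOWED_TOOLS = {"sql_query", "read_file"}  # hypothetical allowlist
MAX_ROWS = 100                              # hypothetical row cap


def validate_plan(raw: str) -> list[dict]:
    """Parse the agent's JSON plan and check its shape before anything runs."""
    plan = json.loads(raw)
    if not isinstance(plan, list):
        raise ValueError("plan must be a JSON array of steps")
    for step in plan:
        if not isinstance(step, dict) or set(step) != {"tool", "args"}:
            raise ValueError(f"malformed step: {step!r}")
        if not isinstance(step["tool"], str) or not isinstance(step["args"], dict):
            raise ValueError(f"wrong field types in step: {step!r}")
    return plan


def gate_sql(sql: str) -> str:
    """Allow one SELECT statement and force a row cap (naive, regex-based)."""
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt or not re.match(r"(?i)^select\b", stmt):
        raise PermissionError("only single SELECT statements are allowed")
    match = re.search(r"(?i)\blimit\s+(\d+)\s*$", stmt)
    if match is None:
        return f"{stmt} LIMIT {MAX_ROWS}"
    capped = min(int(match.group(1)), MAX_ROWS)
    return f"{stmt[:match.start()]}LIMIT {capped}"


def run_game_day(raw_plan: str) -> dict:
    """Dry-run a plan: nothing is executed, steps are only allowed or blocked."""
    report = {"allowed": [], "blocked": []}
    for step in validate_plan(raw_plan):
        try:
            if step["tool"] not in ALLOWED_TOOLS:
                raise PermissionError(f"tool not allowlisted: {step['tool']}")
            if step["tool"] == "sql_query":
                sql = step["args"].get("sql")
                if not isinstance(sql, str):
                    raise PermissionError("sql_query needs a string 'sql' argument")
                step["args"]["sql"] = gate_sql(sql)
            report["allowed"].append(step)
        except PermissionError as exc:
            report["blocked"].append({"step": step, "reason": str(exc)})
    return report
```

The length of `report["blocked"]` is the number to track across game-days; the reasons tell you which gate caught each unsafe action.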

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

  • 01.

    Insert an orchestrator in front of existing CI/CD and data tools: enforce least-privilege credentials, dry-run defaults, and approvals for state-changing ops.

  • 02.

    Instrument agents like risky humans: structured audit logs, timeouts, quotas, and kill-switches; restrict destructive verbs by environment.
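As an illustration of the second point, a guard object can sit between the agent and every tool call. This is a sketch, not a reference design: `AgentGuard`, the verb sets, and the per-environment policy table are hypothetical, and timeouts are left out to keep it short. Every decision, allowed or denied, lands in a structured audit log.

```python
import json
import time

# Hypothetical policy: destructive verbs each environment may run.
DESTRUCTIVE = {"delete", "drop", "truncate"}
ALLOWED_DESTRUCTIVE = {
    "dev": {"delete", "drop", "truncate"},
    "staging": {"delete"},
    "prod": set(),
}


class AgentGuard:
    """Checks every agent action against a kill-switch, a quota, and verb policy."""

    def __init__(self, env, max_actions, audit_sink):
        self.env = env
        self.max_actions = max_actions
        self.audit_sink = audit_sink  # any callable that accepts one JSON line
        self.count = 0
        self.killed = False

    def kill(self):
        """Flip the kill-switch: every later check is denied."""
        self.killed = True

    def check(self, agent_id, verb, target):
        verb = verb.lower()
        permitted = ALLOWED_DESTRUCTIVE.get(self.env, set())
        if self.killed:
            decision = "deny:kill_switch"
        elif self.count >= self.max_actions:
            decision = "deny:quota"
        elif verb in DESTRUCTIVE and verb not in permitted:
            decision = "deny:destructive_verb"
        else:
            decision = "allow"
            self.count += 1
        # Structured audit record for every decision, not only the denials.
        self.audit_sink(json.dumps({
            "ts": time.time(),
            "agent": agent_id,
            "env": self.env,
            "verb": verb,
            "target": target,
            "decision": decision,
        }))
        return decision == "allow"
```

Unknown environments fall back to an empty allow set, so a misconfigured `env` denies destructive verbs rather than permitting them.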

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

  • 01.

    Design System 1/2 from day one: plans as data, deterministic execution, verification gates, and human-in-the-loop for high-risk steps.

  • 02.

    Codify guardrails as code: tool whitelists, schema-validated action formats, idempotent ops, and mandatory backups with tested restores.
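Two of the guardrails above, idempotent ops and mandatory backups, fit in a few lines. The sketch below assumes a hypothetical `StateChangeRunner` that keeps completed results in memory and trusts a caller-supplied `has_verified_backup` check; a real system would persist the idempotency keys and tie the check to an actual tested restore.

```python
class StateChangeRunner:
    """Runs state-changing operations at most once per idempotency key,
    and only when a verified backup exists."""

    def __init__(self, has_verified_backup):
        # Callable returning True only if a restore has been tested (assumption).
        self.has_verified_backup = has_verified_backup
        self.completed = {}  # idempotency key -> stored result

    def run(self, key, operation):
        if key in self.completed:
            # Replay of a finished operation: return the stored result
            # instead of executing the side effect a second time.
            return self.completed[key]
        if not self.has_verified_backup():
            raise RuntimeError("refusing state change: no verified backup")
        result = operation()
        self.completed[key] = result
        return result
```

An agent that retries a step with the same key gets the original result back, and a step attempted without a verified backup fails before it touches any data.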
