OPENAI PUB_DATE: 2026.02.24

FROM VIBE CODING TO AGENTIC ENGINEERING: TEST-FIRST ORCHESTRATION

Engineering teams are shifting from vibe coding to disciplined agentic engineering that treats AI agents as test-driven collaborators and demands spec-first oversight.

In a concise critique of “prompt DJ” development, Roger Wong summarizes Addy Osmani’s call for agentic engineering—engineers orchestrate coding agents, act as architects and reviewers, and enforce spec-first discipline instead of accepting whatever the model returns.

Simon Willison’s “First run the tests” pattern operationalizes this by making a test suite the entry point for any agent, turning TDD into a four‑word prompt and letting agents learn a codebase through its tests.
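
As a rough sketch of that entry point, assuming the repo exposes a single test command, a wrapper can run the suite and capture the result before an agent proposes any change. The `run_tests_first` helper and `SuiteResult` type are hypothetical names for illustration, not part of any agent tool.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class SuiteResult:
    passed: bool
    output: str


def run_tests_first(command: list[str], cwd: str = ".", timeout: int = 600) -> SuiteResult:
    """Run the project's test command and capture exit status and output.

    `command` is whatever the repo uses (for example ["pytest", "-q"]);
    nothing here assumes a particular test runner.
    """
    proc = subprocess.run(
        command, cwd=cwd, capture_output=True, text=True, timeout=timeout
    )
    return SuiteResult(passed=proc.returncode == 0, output=proc.stdout + proc.stderr)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the real suite.
    baseline = run_tests_first([sys.executable, "-c", "print('1 passed')"])
    print("baseline green:", baseline.passed)
```

The captured output can be passed to the agent as context, so it starts from the suite's current state rather than from the prompt alone.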

Hands-on workflows show how to scale this in practice, from a complete greenfield agentic setup to advanced agent teams comparing Claude Code and Codex, while case studies like DumbQuestion.ai underline the need for structured backlogs and cost-aware multi‑model choices.

[ WHY_IT_MATTERS ]
01.

Spec-first, test-led agent orchestration cuts rework and reduces regressions from AI-generated changes.

02.

Without rigor and mentorship, ‘vibe coding’ erodes code quality and weakens the talent pipeline.

[ WHAT_TO_TEST ]
  • 01.

    Bake “First run the tests” into agent prompts and verify agents can execute the suite locally and in CI before proposing diffs.

  • 02.

    Evaluate multi-model agent routing for code tasks (quality, latency, cost) against your repos and enforce guardrails for token/JWT handling.
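
One way to frame the routing evaluation, as a sketch only: collect per-model results from runs against your own repos, then pick the cheapest model that clears a quality floor and a latency ceiling. The `ModelStats` fields, the thresholds, and the model names and numbers below are all illustrative assumptions, not benchmark data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    name: str
    pass_rate: float      # fraction of eval tasks whose tests passed
    p95_latency_s: float  # 95th percentile task latency in seconds
    cost_per_task: float  # average spend per task in USD


def pick_model(stats, min_pass_rate, max_latency_s):
    """Return the cheapest model that meets both thresholds, or None."""
    eligible = [
        s for s in stats
        if s.pass_rate >= min_pass_rate and s.p95_latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda s: s.cost_per_task, default=None)


# Made-up numbers for illustration only.
candidates = [
    ModelStats("model-a", pass_rate=0.92, p95_latency_s=40.0, cost_per_task=0.30),
    ModelStats("model-b", pass_rate=0.85, p95_latency_s=12.0, cost_per_task=0.05),
    ModelStats("model-c", pass_rate=0.90, p95_latency_s=25.0, cost_per_task=0.12),
]

choice = pick_model(candidates, min_pass_rate=0.88, max_latency_s=30.0)
print(choice.name if choice else "no model meets the thresholds")
```

Treating quality and latency as hard constraints and cost as the tiebreaker is one design choice among several; a weighted score would work too if the trade-offs are softer.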

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

  • 01.

    Expose a standard test command in each service, enable agents to run it in CI, and require human review until test coverage stabilizes.

  • 02.

    Start with low-risk services, audit agent JWT/token flows, and pair juniors with seniors to review agent output and error patterns.
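
The "human review until coverage stabilizes" rule needs a concrete definition of stable. One possible reading, sketched below, is that the last few coverage readings stay within a small band. The window size, the tolerance, and both function names are assumptions to tune per service.

```python
def coverage_is_stable(history, window=5, tolerance=1.0):
    """True when the last `window` coverage percentages span at most `tolerance` points."""
    if len(history) < window:
        return False  # not enough data yet: treat as unstable
    recent = history[-window:]
    return max(recent) - min(recent) <= tolerance


def requires_human_review(history, window=5, tolerance=1.0):
    """Keep a human reviewer in the loop until coverage has settled."""
    return not coverage_is_stable(history, window, tolerance)


# Illustrative coverage percentages from successive CI runs.
early = [61.0, 68.5, 72.0, 74.8, 75.1]
later = early + [75.3, 75.0, 75.4]

print(requires_human_review(early))
print(requires_human_review(later))
```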

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

  • 01.

    Define architecture and backlog first, scaffold tests up front, and codify a minimal agent playbook (spec, tests, implement, verify).

  • 02.

    Design least-privilege agent auth with JWT best practices and instrument cost/latency metrics to guide model selection.
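
A least-privilege check can be sketched as a deny-by-default function over token claims. This assumes the token's signature has already been verified by a JWT library and that permissions arrive as a space-delimited `scope` claim alongside the standard `exp` claim. The scope names and the `agent_may` helper are hypothetical.

```python
import time
from typing import Optional


def agent_may(claims: dict, required_scope: str, now: Optional[float] = None) -> bool:
    """Allow an action only if the token is unexpired and carries the exact scope.

    Assumes `claims` came from a token whose signature was already verified;
    this function performs no cryptographic checking.
    """
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return False  # missing or expired `exp`: deny by default
    granted = set(str(claims.get("scope", "")).split())
    return required_scope in granted


# Hypothetical short-lived token for a CI agent that may only read and test.
claims = {"sub": "agent-ci", "exp": 1_000, "scope": "repo:read tests:run"}

print(agent_may(claims, "tests:run", now=900))
print(agent_may(claims, "repo:write", now=900))
```

Keeping `exp` short and scopes narrow means a leaked agent token expires quickly and cannot push code even while valid.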
