OpenAI previews GPT-5.6 (Sol/Terra/Luna)…

OPENAI PUB_DATE: 2026.06.27

OPENAI PREVIEWS GPT-5.6 (SOL/TERRA/LUNA) WITH NEW PRICING AND CACHE SEMANTICS UNDER LIMITED ROLLOUT

OpenAI previewed the GPT-5.6 model family (Sol, Terra, Luna) with new pricing and stricter prompt-caching rules in a limited U.S.-only rollout. Per OpenAI’s ow...

OpenAI previewed the GPT-5.6 model family (Sol, Terra, Luna) with new pricing and stricter prompt-caching rules in a limited U.S.-only rollout.

Per OpenAI’s own quote via Simon Willison, Sol, Terra, and Luna are in limited preview, with general availability planned in the coming weeks, and per‑1M‑token pricing set at $5/$30, $2.50/$15, and $1/$6 input/output respectively OpenAI quote via Simon Willison. The same source notes more predictable prompt caching: explicit cache breakpoints, a 30‑minute minimum cache life, cache writes billed at 1.25x input, and 90% discount on cache reads.

OpenAI’s community post confirms the new series and rollout posture OpenAI community post. A third‑party report adds claims about “maximum reasoning” and an “Ultra” subagent mode, pending official docs Interesting Engineering. If you’re planning deep, long‑context workflows, the GPT‑5.5 API explainer remains a useful guide to configuring effort, structure, and costs GPT‑5.5 API explainer.

[ WHY_IT_MATTERS ]

01.

Pricing and cache billing changed, which affects cost models, routing, and how you design long-context pipelines.

02.

Tiered models enable cheaper defaults with on-demand upgrades, but rollout is gated and may delay production adoption.

[ WHAT_TO_TEST ]

terminal
Model routing: default to Luna or Terra and escalate to Sol for hard cases; compare quality, latency, and total cost per job.
terminal
Prompt caching economics: use explicit cache breakpoints and measure cache-write cost (1.25x input) vs 90% read discount on real workloads.

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

01.
Update cost alerts and budgets for new per-1M pricing and cache-write billing; re-run long-context job forecasts.
02.
Introduce a traffic splitter that routes by task complexity; keep GPT‑5.5 as fallback until GPT‑5.6 broad GA.

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

01.
Design a tiered inference plan from day one: Luna/Terra for routine steps, Sol only where ROI is proven.
02.
Build prompts with deterministic schemas and cache breakpoints to reuse expensive context across steps.

Enjoying_this_story?

Get daily OPENAI + SDLC updates.

Practical tactics you can ship tomorrow
Tooling, workflows, and architecture notes
One short email each weekday

arrow_back

PREVIOUS_DATA_LOG

Sealing the leaks in coding-agent evals: Cursor shows SWE-bench Pro scores are being gamed

Initialize_Return_to_Core

LINK_STATUS: 127.0.0.1 (SECURE)

NEXT_DATA_LOG

From chat to delegation: Codex data shows agents are becoming workflows, not answers

arrow_forward