AGENTS
30 days · UTC
Anthropic launches Project Glasswing, giving controlled access to Claude Mythos for vulnerability discovery
Anthropic formed Project Glasswing and is withholding its Claude Mythos Preview model for controlled, defensive use after it found thousands of high‑s...
OpenAI Python v2.31.0: short‑lived tokens and raw WebSocket streaming land amid logging glitches
OpenAI’s Python SDK v2.31.0 adds short-lived token auth and raw WebSocket streaming, while developers report dashboard logging glitches. The new rele...
Codex adds Hooks docs, community sees better limits after April 1 reset, and GPT-5.4 stop behavior raises questions
OpenAI’s Codex platform quietly added Hooks docs while developers report improved limits and flag possible GPT-5.4 stop handling changes. OpenAI publ...
Codex 0.117.0: first-class plugins, cleaner multi-agent addressing, and steadier TUI; watch performance on large workspaces
OpenAI Codex 0.117.0 ships first-class plugins and multi-agent v2 improvements, while a community report flags heavy UI lag on large file sets. The ...
OpenAI’s platform shake-up: Sora API shutdown reported, SDK tweaks, and agent reliability gaps
OpenAI’s surface area is shifting: Sora APIs are reportedly shutting down while SDK changes and developer issues highlight integration risk. Neowin r...
Agents JS v0.8.0 ships realtime default upgrade; pair it with prompt caching and stricter schema checks
OpenAI’s agents JS library quietly upgraded realtime defaults and stabilized MCP, while new guidance and research push us to harden prompt and output ...
Ship safer LLM agents with multi-turn, regulation-aware evals
DeepEval brings multi-turn, policy-aware testing for LLM chats into reach, while practitioners converge on structured prompts over tone tweaks. A new...
Parallel AI Coding with 'Codex Subagents' as a Practical Workflow
A hands-on post shows how to orchestrate parallel AI coding workers (“subagents”) to cut feature delivery time. The piece outlines a pattern where se...
OpenAI ships GPT-5.4 mini and nano for fast coding/subagent workloads, plus Python SDK v2.29.0 support
OpenAI released GPT-5.4 mini and nano, smaller models tuned for speed and high-volume coding/subagent workflows, alongside an SDK update that adds fir...
OpenAI adds a computer environment with Shell to the Responses API, with early reliability edge cases surfacing
OpenAI introduced a built-in computer environment, including a Shell tool, to the Responses API, and early reports flag availability and file input qu...
Copilot agents get real knobs: CLI controls, VS Code debugging, and a tool catalog—watch token burn
GitHub and Microsoft shipped practical upgrades for Copilot agents across the CLI and VS Code, while users report a spike in token usage.
Agent stack gets real: Copilot CLI adds MCP controls, LangChain supports OpenAI compaction, Realtime 1.5 lands
Agent tooling just got more practical: Copilot CLI adds MCP and safety controls, LangChain supports OpenAI compaction, and OpenAI ships Realtime 1.5. ...
GPT‑5.3 Rumors vs. GPT‑5.2 Reality: Plan on What’s Confirmed
OpenAI has only publicly positioned GPT‑5.2 as its current flagship with improvements in long‑running agent workflows, tool calling, multimodality, an...
GPT-5.2 confirmed; 5.3 unconfirmed—plan for point-release readiness
OpenAI’s officially confirmed state is GPT-5.2, with upgrades across long-running agents, multimodality, tool use, and code generation; treat this as ...
Update: Auto Claude Project Manager Wrapper
A new community walkthrough video demonstrates end-to-end setup of Auto Claude, showing how to turn Claude API calls into a structured, multi-step pro...
Auto Claude: open-source wrapper that turns Claude into a lightweight project manager
Auto Claude is an open-source wrapper that runs structured, multi-step "project manager" workflows on top of Anthropic’s Claude API. It aims to move b...
Creator demos: Gemini 3 'Deep Think' for agent workflows
Two creator videos claim Gemini 3 with a 'Deep Think' mode improves multi-step reasoning and enables more capable, tool-using agents. While official d...
Update: Google DeepMind AGI roadmap and agentic systems
In a new video, Demis Hassabis lays out the clearest public roadmap to AGI yet, explicitly centering on agentic systems that plan, use tools, and work...
Update: Claude Code IDE New Features
A new creator video reiterates sub-agents, LSP integration, and a high-capacity model, and newly claims an AI-assisted terminal for CLI workflows plus...
OpenAI 'Hazelnut' Skills: composable, code-executable modules (rumored 2026)
Reports indicate OpenAI is testing 'Skills' (codename Hazelnut): reusable capability modules bundling instructions, context, examples, and executable ...
Inside AI coding agents: supervisors, tools, and sandboxed execution
Modern coding agents wrap multiple LLMs: a supervisor decomposes work and tool-using workers edit code, run commands, and verify results in loops. The...
MiniMax M2.1 lands; plan for faster agentic-model iterations
MiniMax released its M2.1 model; coverage highlights accelerating release cycles and growing focus on agentic use cases. Expect changes in tool-use be...
Engineering, not models, is now the bottleneck
A recent video argues that model capability is no longer the main constraint; the gap is in how we design agentic workflows, tool use, and evaluation ...