AI + SDLC updates in 5 minutes/day.
Practical workflows, testing patterns, and tools worth adopting now.
Amazon Bedrock adds OpenAI-compatible fine-tuning (with Lambda-based RFT) for open-weight models
Amazon Bedrock now supports OpenAI-style fine-tuning jobs for open-weight models, including reinforcement with Lambda graders. AWS published OpenAI-c...
Anthropic’s Project Glasswing puts Claude Mythos to work hardening critical software
Anthropic launched Project Glasswing to give major vendors access to a new Claude Mythos model for finding and fixing critical vulnerabilities. [Proj...
claude-mem v13.1.0 ships an event-sourced agent pipeline with Postgres, BullMQ, and multi-provider jobs
thedotmack/claude-mem v13.1.0 lands a Postgres+BullMQ event pipeline, audited job flow, and safer session concurrency for AI coding agents. The new r...
Unofficial WindsurfAPI bridges Windsurf to OpenAI/Anthropic endpoints
An open-source proxy, WindsurfAPI, makes Windsurf models accessible through OpenAI- and Anthropic-compatible endpoints with minimal client changes. [...
Route cheap by default: real agent cost data and the guardrails you need
A real-world test shows multi-model routing slashes AI agent costs, and explicit rules stop agents from quietly deferring work. In a 2,415-turn log, ...
GitHub’s Spec Kit pushes AI coding from prompts to specs
GitHub launched Spec Kit, an open-source toolkit that makes AI coding agents follow written specifications instead of vague prompts. The [DevOps.com ...
claude-mem goes server-side (and Apache-2.0): open-source agent memory grows up
claude-mem 13 adds an opt-in server runtime with Postgres/Redis and a /v1 API, and switches from AGPL to Apache-2.0. The v13 release lands a Postgres...
Machines just out-reviewed us: 271 Firefox bugs and the AI security pivot
Mozilla’s Firefox scan using Anthropic’s Mythos surfaced 271 issues, hinting machine-led code review is about to become baseline. Mozilla pointed [An...
Anthropic’s Project Glasswing puts AI vuln discovery into production (with a path to auditability)
Anthropic launched Project Glasswing to operationalize its Mythos Preview model for large‑scale vulnerability discovery with major industry partners. ...
Copilot CLI 1.0.44 makes terminal agents more programmable; Copilot billing shift looms
GitHub Copilot CLI 1.0.44 adds hooks that can answer without calling a model, plus better multi-skill commands and shell behavior. The new release of...
Codex goes headless (remote-control server) and into Chrome; reports of idle credit drain surface
OpenAI Codex added a headless remote-control server and a Chrome extension, while users report unexpected credit drain when idle. The latest Codex re...
Context beats model: a cheap agent tops SWE-bench Verified
A low-cost model paired with richer repo-aware context just topped SWE-bench Verified, showing agent wiring can outweigh model choice. A dev report s...
Cursor 3.0 pushes an agent SDK; devs flag gaps as Windsurf gains ground
Cursor 3.0 is leaning hard into code agents via its SDK, but early feedback says it’s not production‑ready and alternatives like Windsurf are catching...
Databases are absorbing agent memory and retrieval
The database layer is starting to absorb agent memory and retrieval, with Yugabyte launching Meko and MongoDB baking in embeddings, re-ranking, and lo...
Eval-Ops gets concrete: Snowflake DARE-Bench and Terminal-Bench 2.0 make agent rankings workload-specific
New deterministic agent benchmarks — Snowflake's DARE-Bench and Terminal-Bench 2.0 — are shifting model selection from generic scores to verifiable, w...
Windsurf bakes in Devin Review: local SWE-check + cloud PR verification in the IDE
Windsurf integrated Devin-powered code verification, adding a fast local SWE-check and a deeper cloud PR review inside the IDE. Cognition’s Windsurf ...
From vibe coding to agentic PRs: wiring Cursor, Claude Code, and Linear into delivery
Agentic coding moved from demos to production: teams are wiring Cursor, Claude Code, and Linear so tickets become draft PRs with human review. [GitHu...
MCP agents get safer: OpenAI Agents SDK 0.10.1 validates policies, fixes history loss
OpenAI Agents SDK 0.10.1 tightens MCP agent safety with approval-policy validation and fixes session history loss on compaction errors. The latest [O...
AI coding agents pass tests but miss the spec: tighten reviews and testing now
New research shows AI coding agents often look right in tests but get requirements wrong, so teams need to change how they review and test AI-written ...
Enterprise agents are shifting from access to runtime control
Microsoft Foundry, ServiceNow, and others are shifting agent platforms toward runtime control, governance, and durability over simple tool access. Mi...
Gemma 4 adds Multi-Token Prediction drafters and looks ready for real on-prem work
Google’s Gemma 4 adds Multi-Token Prediction drafters for faster local inference, and its Apache 2.0 release makes on‑prem adoption practical. Google...
Cursor incident spotlights agent safety; Harbor v0.6.5 and Mistral push safer runtimes
A Cursor coding agent wiped a startup’s production database, putting agent isolation and least-privilege credentials back at the top of the stack. Th...
OpenAI shifted defaults: GPT-5.5 Instant rolls out, Agents JS now defaults to gpt-5.4-mini, AWS Bedrock path opens
OpenAI changed defaults across ChatGPT and the Agents SDK this week, which can silently shift behavior and costs if you don’t pin models. ChatGPT now...