OpenRouter’s coding leaderboard: free Qwen 3.6 Plus tops usage with 1M context and strong repo‑level skills
OpenRouter’s latest usage data shows Qwen 3.6 Plus (free) leading coding workloads, with big context, solid reasoning, and zero-cost tokens. OpenRout...
Agentic coding hits the reliability phase: this week’s updates focus on state, ops, and safety
Multiple agentic coding stacks shipped reliability-first updates, signaling a shift from model flash to harness quality, state handling, and operator ...
OpenAI Codex shifts to per-task compute-unit pricing; plan for quotas, rate limits, and ops
OpenAI’s Codex coding agent now charges per task in compute units, changing how teams budget and operate AI-assisted development. OpenAI’s newly surf...
Claude-mem v11.0.1 makes semantic memory injection opt-in to cut latency and context noise
The claude-mem tool now disables semantic memory injection by default to reduce latency and irrelevant context during prompts. Per the v11.0.1 releas...
Anthropic accidentally leaks Claude Code source: treat this as a supply-chain wake‑up call
Anthropic accidentally exposed Claude Code’s full source repo, raising security questions and giving outsiders an unprecedented look at a major AI cod...
Open agents grow up: Gemma 4, Qwen 3.6 Plus, and a cost-savvy runtime pattern you can use now
Open-source-grade agents just got more practical with Gemma 4, Qwen 3.6 Plus, and a cost‑savvy agent runtime update. Google’s new Gemma 4 brings Apac...
IDE agents get serious: Cursor 3’s multi-env Agents Window and Copilot Chat’s prompt-cache tweaks
AI agents in IDEs level up: Cursor 3 adds parallel multi-environment agents, while VS Code Copilot Chat gets tweaks for prompt-cache efficiency; some users ...
Anthropic leak exposes ‘Claude Mythos’, Claude Code internals, and a clampdown on third‑party harness usage
Anthropic’s internal docs and pieces of Claude Code leaked, revealing ‘Claude Mythos’ plans and new charges for third‑party tool usage like OpenClaw. ...
Claude Code 2.1.92 ships fail-closed policy, AWS Bedrock setup wizard, and clearer cost telemetry; Anthropic details a three-agent harness for long-running work
Anthropic updated Claude Code with stronger governance, easier AWS Bedrock setup, and better cost visibility, while sharing a concrete pattern for lon...
Python + Claude pipeline that drafts, scores, and Telegrams Upwork proposals hits a 31% response rate
An engineer built a Python system that drafts and scores Upwork proposals with Claude, then sends top picks to Telegram, landing a 31% response rate. ...
Bulk Major-Version Upgrades Without the Pain: A Look at Kiro CLI
A community write-up suggests Kiro CLI can make bulk major-version dependency upgrades across many services practical instead of painful. The post co...
Train bigger models on fixed GPUs: a pragmatic memory trick and an architecture refresher
Two tutorials explain ways to train larger models with limited GPU memory, while a debate piece pushes for generalist scientific AI. A practical post...
AI agents just tipped code security from noisy to useful — maintainers report a surge of real bugs
AI-driven agents are now producing high-quality vulnerability reports at scale, shifting security triage from AI slop to real issues. Multiple vetera...
Enterprise AI agents are moving from demos to governed pilots
Agentic AI in enterprises is shifting from hype to governed pilots focused on interoperability, data access, and measurable outcomes. Recent pieces a...
OpenClaw patches admin-takeover bug; treat agent platforms like exposed control planes
OpenClaw fixed critical privilege-escalation flaws, underscoring how agent platforms magnify risk when wired into real enterprise systems. Earlier th...
Anthropic ships Claude for Cowork; research shows steerable 'emotion' circuits; IP filters tighten
Anthropic launched enterprise AI agents while publishing research on steerable emotion-like circuits in Claude and tightening IP filtering policies.
MCP-powered coding agents hit real tooling (Chrome DevTools, ABL in Windsurf) as typosquatting targets IDEs
MCP-based coding agents are moving into serious dev workflows while IDE extension typosquatting raises fresh supply chain risk. Google’s open-source ...
Cursor 3 introduces an agent-first IDE with a unified Agents Window
Cursor 3 launches with an agent-first interface that centralizes how you run coding agents across repos and environments. The new Agents Window is do...
Gemini API adds Flex and Priority inference tiers; OSS client ships circuit breaker for Gemini 503s
Google introduced Flex and Priority inference tiers for the Gemini API to trade cost for reliability, and an OSS client added circuit breakers for Gem...
No, GPT-5.4 didn’t drop; focus on hardening OpenAI integrations as ChatGPT Apps recommendations hiccup
Ignore viral GPT-5.4 claims and shore up your OpenAI integrations; some developers report ChatGPT Apps recommendations aren’t working.
Choosing the right frontier model by workflow: compliance, agents, and file-heavy work
Model choice now hinges on whether you need strict instruction compliance, agent-style execution, or heavy file/long-document work. A head-to-head on...
SWE-Bench Pro leaderboard: small gains at the top, big contexts, and mostly self-reported results
A new SWE-Bench Pro leaderboard shows top code models clustered around 0.55–0.58, with large contexts and self-reported scores. The updated [SWE-Benc...
Datasette-llm 0.1a6: simpler model config, better Python API docs
Datasette-llm 0.1a6 removes duplicate model config and improves Python API docs for plugin authors. The release makes a small but handy change: setti...