CI-CD
30 days · UTC
AI-written tests and SecOps–AppSec consolidation are converging on your pipeline
VarLog’s Inspect launches while Torq acquires Jit, signaling a shift to AI-driven, end-to-end automation across QA and security pipelines. VarLog’s n...
Ship safer AI faster: put governance in CI/CD and run a model-upgrade audit
Treat AI governance like tests in your pipeline and audit your stack before swapping to a stronger model. Modern teams are baking bias checks, explai...
Copilot goes agent-first: CLI gets CI-friendly MCP auth, Studio ships multi‑agent GA
GitHub is tightening its agent tooling: Copilot CLI adds CI-friendly MCP auth and persistent config, while Copilot Studio’s multi-agent orchestration ...
Agentic QE v3.8.10 replaces fabricated coverage with real per-file metrics and trend tracking
Agentic QE v3.8.10 fixes bogus coverage scoring and switches quality gates to real per-file metrics with trend tracking. The release [v3.8.10](https:...
SWE-CI shifts agent evaluation from one-shot bug fixes to CI-driven maintainability
A new CI-loop benchmark, SWE-CI, measures whether AI coding agents can maintain real repositories over time, not just pass one-off tests. [SWE-CI](ht...
AI lands across the DevOps stack: Sauce Labs tests, Harness security, and Java 26
AI is moving from hype to plumbing in DevOps, landing in testing, security, and even Java’s core runtime. [Sauce Labs released an AI agent for genera...
GitHub slopocalypse: lock down bots and plan CI failover
AI-generated repo noise and platform hiccups are forcing teams to lock down GitHub and build CI failovers. Jannis Leidel describes the "slopocalypse"...
Benchmarks Aren’t Shipping Code: How to Vet AI Code Agents Before CI
New evidence shows top-scoring AI coding tools pass benchmarks but stumble in real code review and day‑to‑day engineering workflows. METR reports tha...
AI agents can supercharge code, but deployment is the choke point
Coding agents are delivering real wins in code performance, but running that code safely in the cloud is the new bottleneck. An InfoWorld essay argue...
AI coding is jamming security queues because process, not tooling, is missing
A New Stack article argues two process failures with AI-generated code are clogging security review pipelines and slowing releases. The piece from Th...
Claude Code Review lands in GitHub Actions (preview) — real checks, real cost
Anthropic added a preview Claude Code Review GitHub Action that parallel-checks PRs, verifies findings, ranks severity, and bills purely on Claude API...
Agents ace one-shot coding, but most break your code over months—time to harden CI and adopt evaluator loops
New results say most coding agents cause regressions during long-term CI, and a new MassGen release adds built-in evaluator loops to catch issues earl...
One-scan repo context generation with codebase-md
Codebase-md scans your repo and auto-generates consistent AI coding context files for popular tools, reducing manual drift and improving prompt qualit...
Prompt injection poisons GitHub Actions cache and exfiltrates secrets in Cline incident
A prompt injection in Cline’s AI-powered GitHub issue triage poisoned shared caches and leaked release secrets, underscoring the need for CI/CD-grade ...
Cursor Automations brings policy-driven agents to your repo and Slack
Cursor launched Automations, a policy-driven system that triggers coding agents on commits, Slack messages, or schedules and loops humans in only when...
Copilot CLI 0.0.422 lands automation-friendly upgrades as VS Code previews agent plugins
GitHub shipped Copilot CLI 0.0.422 and VS Code previewed agent plugins, tightening how AI agents run across terminal, editor, and CI workflows. Copil...
Operationalizing Agent Evaluation: SWE-CI + MLflow + OTel Tracing
A new CI-loop benchmark and practical guidance on evaluation and observability outline how to move coding agents from pass/fail demos to production-gr...
Cursor’s reported $2B run rate shows AI-in-the-IDE is going default
Cursor’s AI code editor has reportedly hit a $2B annualized run rate, signaling that AI-in-the-IDE is shifting from novelty to default for many engine...
Agent log observability: MassGen v0.1.49 adds in-app analysis and fairness gating; research backs variable-aware parsing
Agent-log observability just improved with MassGen’s new in-app log analysis and fairness controls, while research shows variable-aware LLM log parsin...
MassGen v0.1.46 released
MassGen v0.1.46 is out — review the official GitHub release page before upgrading to ensure compatibility with your pipelines and tooling [MassGen v0....
Continue CLI beta ships daily with 7-day promote-to-stable cadence
The Continue CLI daily beta v1.5.43-beta.20260203 is out on [GitHub](https://github.com/continuedev/continue/releases/tag/v1.5.43-beta.20260203)[^1], ...
Agentic AI hits production: MCP evals meet Clawdbot-scale autonomy
Agentic AI is moving from chat to action, making end-to-end, tool-trajectory evaluations essential; Toloka’s MCP evaluations add sprint-ready, human-i...
OpenAI Codex agent loop goes from suggestions to sandboxed, auditable code changes
OpenAI’s Codex now uses an iterative agent loop that plans, calls tools, and executes in air‑gapped containers with quotas—returning JSON‑logged diffs...