XAI
30 days · UTC
Choosing the right frontier model by workflow: compliance, agents, and file-heavy work
Model choice now hinges on whether you need strict instruction compliance, agent-style execution, or heavy file/long-document work. A head-to-head on...
Signal check: Grok 5 rumors and coding‑LLM noise. Optimize your evals, not your hype
Grok 5 chatter is loud, but there is no verified release, so treat coding‑LLM claims as speculative and keep your evaluation pipeline sharp. A detailed bl...
Continue IDE updates: wider model support, prompt caching, cost routing, and stability hardening
Continue shipped coordinated VS Code and JetBrains releases adding broader model support, caching, cost routing, and notable stability fixes. The Jet...
Top LLMs split on tiers and naming: what that means for cost, routing, and long jobs
Vendors now expose high‑end LLMs with different tiers and names, which changes how you budget, route jobs, and handle long or tool‑heavy tasks. A dee...
Coding LLMs, March 2026: default to Sonnet 4.6, escalate to GPT-5.4, watch scaffold-driven benchmarks
March 2026 coding LLM benchmarks show mid-tier models rival flagships, but scaffolding and cost drive real-world choices. The latest multi-benchmark ...
Usable Context, Not Token Hype: How to pick and harden LLMs for long docs and agents
Choosing an LLM for long context and agents comes down to usable context and safety, not headline token counts. A careful comparison argues that cont...
Realtime LLMs: OpenAI ships gpt-realtime-1.5, benchmarks reframe “fast,” Grok shows capacity strain
OpenAI’s gpt-realtime-1.5 went live as new analysis and incidents reset expectations for real-time LLM speed, streaming, and reliability. OpenAI anno...
How Grok actually does real-time retrieval (and what its X link really means)
xAI’s Grok uses a tool-called retrieval pipeline and tight X integration to produce live, cited answers with clear limits and audit trails. The Grok ...
Grok 4.1 Free: Treat as access, not capacity
Treat Grok 4.1 Free as an entry point for testing realtime-first workflows, not as a guaranteed capacity tier for sustained, iterative workloads. [Gro...
AI agents under attack: prompt injection exploits and new defenses
Enterprises deploying AI assistants and desktop agents face real prompt-injection and safety failures in tools like Copilot, ChatGPT, Grok, and OpenCl...
Unverified claim: Grok 4.20 (beta) discovered a new Bellman function
Community posts and a video claim xAI’s Grok 4.20 (beta) produced a new Bellman function, citing University of California, Irvine, but there is no off...
LangChain xAI 1.2.0 improves streaming and token accounting; OpenAI adapter updates GPT-5 limits
LangChain released langchain-xai 1.2.0 with fixes that stream citations only once and enable usage metadata streaming by default, plus a core serializ...