<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>BreakingAgent</title><description>Independent intelligence on agentic AI: what is changing, what matters, and what builders should do next.</description><link>https://breakingagent.com/</link><language>en-us</language><item><title>Agentic Stories Podcast Launches Daily Agent Briefings</title><link>https://breakingagent.com/news/agentic-stories-podcast-launches-daily-agent-briefings/</link><guid isPermaLink="true">https://breakingagent.com/news/agentic-stories-podcast-launches-daily-agent-briefings/</guid><description>New podcast delivers weekday updates on AI agent economy, governance, and deployment.</description><pubDate>Fri, 08 May 2026 07:46:42 GMT</pubDate><category>governance</category><category>observability</category><category>podcast</category></item><item><title>Agentic AI Shift Tops 2026 Stories Over Models</title><link>https://breakingagent.com/news/agentic-ai-shift-tops-2026-stories-over-models/</link><guid isPermaLink="true">https://breakingagent.com/news/agentic-ai-shift-tops-2026-stories-over-models/</guid><description>Experts declare move from models to full agent systems as year&apos;s biggest AI development.</description><pubDate>Fri, 08 May 2026 07:46:42 GMT</pubDate><category>trend</category><category>systems</category></item><item><title>Pentagon Cuts Anthropic Ties Over Agent Terms</title><link>https://breakingagent.com/news/pentagon-cuts-anthropic-ties-over-agent-terms/</link><guid isPermaLink="true">https://breakingagent.com/news/pentagon-cuts-anthropic-ties-over-agent-terms/</guid><description>DOD dispute with Anthropic prompts new AI deals with Nvidia, MSFT, AWS for classified agents.</description><pubDate>Fri, 08 May 2026 07:46:42 GMT</pubDate><category>policy</category><category>military</category><category>vendor</category></item><item><title>pydantic-ai 1.92.0 released</title><link>https://breakingagent.com/news/pydantic-ai-1-92-0-release/</link><guid isPermaLink="true">https://breakingagent.com/news/pydantic-ai-1-92-0-release/</guid><description>Pydantic AI 1.92.0 introduces Anthropic task budget support and runtime `output_retries` override with deprecation of the old `retries` field, enhancing control over AI agent execution and reliability. It also fixes key bugs like streaming response cleanup on cancellation, MCP session task isolation to prevent exit scope errors, and proper population of `RunContext` with run/conversation IDs and metadata.</description><pubDate>Fri, 08 May 2026 06:47:27 GMT</pubDate><category>pydantic-ai</category><category>releases</category></item><item><title>Agentic Stories Podcast Covers Governance News</title><link>https://breakingagent.com/news/agentic-stories-podcast-covers-governance-news/</link><guid isPermaLink="true">https://breakingagent.com/news/agentic-stories-podcast-covers-governance-news/</guid><description>Daily briefing launches on AI agent economy, emphasizing governance, security, and deployment challenges.</description><pubDate>Fri, 08 May 2026 06:47:23 GMT</pubDate><category>governance</category><category>observability</category></item><item><title>Anthropic-Pentagon Stalemate on Claude Usage</title><link>https://breakingagent.com/news/anthropic-pentagon-stalemate-on-claude-usage/</link><guid isPermaLink="true">https://breakingagent.com/news/anthropic-pentagon-stalemate-on-claude-usage/</guid><description>Anthropic and DoD reach impasse over deploying Claude model in defense applications amid policy concerns.</description><pubDate>Fri, 08 May 2026 06:47:23 GMT</pubDate><category>policy</category><category>government</category></item><item><title>Banking Vet Launches Enterprise Primitive AI Agents</title><link>https://breakingagent.com/news/banking-vet-launches-enterprise-primitive-ai-agents/</link><guid isPermaLink="true">https://breakingagent.com/news/banking-vet-launches-enterprise-primitive-ai-agents/</guid><description>Former banking executive unveils enterprise-grade system for primitive AI agents targeting business automation.</description><pubDate>Fri, 08 May 2026 06:47:23 GMT</pubDate><category>funding</category><category>enterprise</category></item><item><title>Clawdbot Viral Launch Sells Out Mac Minis</title><link>https://breakingagent.com/news/clawdbot-viral-launch-sells-out-mac-minis/</link><guid isPermaLink="true">https://breakingagent.com/news/clawdbot-viral-launch-sells-out-mac-minis/</guid><description>Open-source OpenClaw variant sparks hardware rush for local agent execution amid privacy concerns.</description><pubDate>Fri, 08 May 2026 06:47:23 GMT</pubDate><category>open-source</category><category>hardware</category></item><item><title>OpenAI Hires OpenClaw Creator Peter Steinberger</title><link>https://breakingagent.com/news/openai-hires-openclaw-creator-peter-steinberger/</link><guid isPermaLink="true">https://breakingagent.com/news/openai-hires-openclaw-creator-peter-steinberger/</guid><description>OpenAI recruits key developer of open-source agentic framework enabling autonomous keyboard/mouse control.</description><pubDate>Fri, 08 May 2026 06:47:23 GMT</pubDate><category>hiring</category><category>agent-framework</category></item><item><title>AWS Launches AgentCore Payments — Agents Can Now Transact with Coinbase and Stripe</title><link>https://breakingagent.com/news/aws-agentcore-payments-coinbase-stripe/</link><guid isPermaLink="true">https://breakingagent.com/news/aws-agentcore-payments-coinbase-stripe/</guid><description>Amazon Bedrock AgentCore now lets autonomous agents make payments via stablecoin micropayments, built with Coinbase x402 and Stripe Privy wallet infrastructure.</description><pubDate>Thu, 07 May 2026 21:33:00 GMT</pubDate><category>payments</category><category>infrastructure</category><category>multi-agent</category></item><item><title>[Research] Dynamic In-Context Example Selection for Reliable Agentic Reasoning</title><link>https://breakingagent.com/research/dynamic-in-context-example-selection-for-reliable-agentic-re/</link><guid isPermaLink="true">https://breakingagent.com/research/dynamic-in-context-example-selection-for-reliable-agentic-re/</guid><description>A theoretically grounded method for agents to dynamically select optimal in-context examples during reasoning, boosting reliability across diverse tasks.</description><pubDate>Thu, 07 May 2026 20:04:33 GMT</pubDate><category>planning</category><category>eval</category></item><item><title>[Research] ToolMemory: Long-Term Memory Management for Agentic Workflows</title><link>https://breakingagent.com/research/toolmemory-long-term-memory-management-for-agentic-workflows/</link><guid isPermaLink="true">https://breakingagent.com/research/toolmemory-long-term-memory-management-for-agentic-workflows/</guid><description>Framework enabling agents to maintain tool-specific memory across extended conversations, pruning irrelevance while preserving critical knowledge.</description><pubDate>Thu, 07 May 2026 20:04:33 GMT</pubDate><category>memory</category><category>tool-use</category></item><item><title>Claude API Adds Streaming for High-Throughput Agents</title><link>https://breakingagent.com/news/claude-api-adds-streaming-for-high-throughput-agents/</link><guid isPermaLink="true">https://breakingagent.com/news/claude-api-adds-streaming-for-high-throughput-agents/</guid><description>New streaming and batching endpoints in Claude API optimize for agentic deployments requiring real-time processing.</description><pubDate>Thu, 07 May 2026 20:04:22 GMT</pubDate><category>tool-use</category><category>observability</category></item><item><title>Mistral Small 4 Tops Reasoning Benchmarks for Agent Use</title><link>https://breakingagent.com/news/mistral-small-4-tops-reasoning-benchmarks-for-agent-use/</link><guid isPermaLink="true">https://breakingagent.com/news/mistral-small-4-tops-reasoning-benchmarks-for-agent-use/</guid><description>22B-parameter Mistral Small 4 outperforms larger closed models on reasoning and instruction benchmarks critical for agents.</description><pubDate>Thu, 07 May 2026 20:04:22 GMT</pubDate><category>model releases</category><category>tool-use</category></item><item><title>NVIDIA GTC Confirms Enterprise Agentic Production Deployments</title><link>https://breakingagent.com/news/nvidia-gtc-confirms-enterprise-agentic-production-deployment/</link><guid isPermaLink="true">https://breakingagent.com/news/nvidia-gtc-confirms-enterprise-agentic-production-deployment/</guid><description>NVIDIA&apos;s GTC 2026 showcased Fortune 500 companies running agentic AI systems in production using NeMoCLAW and OpenCLAW frameworks.</description><pubDate>Thu, 07 May 2026 20:04:22 GMT</pubDate><category>agent frameworks</category><category>enterprise</category></item><item><title>MCP Agent Framework Hits 97M Installs Milestone</title><link>https://breakingagent.com/news/mcp-agent-framework-hits-97m-installs-milestone/</link><guid isPermaLink="true">https://breakingagent.com/news/mcp-agent-framework-hits-97m-installs-milestone/</guid><description>March 25 stats reveal MCP, a key agentic infrastructure standard, reached 97 million installs, transforming agent development.</description><pubDate>Thu, 07 May 2026 20:04:22 GMT</pubDate><category>agent frameworks</category><category>adoption</category></item><item><title>OpenCLAW Released as Open-Source Agent Orchestration Framework</title><link>https://breakingagent.com/news/openclaw-released-as-open-source-agent-orchestration-framewo/</link><guid isPermaLink="true">https://breakingagent.com/news/openclaw-released-as-open-source-agent-orchestration-framewo/</guid><description>Apache 2.0-licensed OpenCLAW launches as companion to NVIDIA&apos;s NeMoCLAW for enterprise multi-agent systems.</description><pubDate>Thu, 07 May 2026 20:04:22 GMT</pubDate><category>agent frameworks</category><category>open source</category></item><item><title>Five Eyes Warns on Agentic AI Risks</title><link>https://breakingagent.com/news/five-eyes-warns-on-agentic-ai-risks/</link><guid isPermaLink="true">https://breakingagent.com/news/five-eyes-warns-on-agentic-ai-risks/</guid><description>Security agencies urge caution in deploying autonomous AI agents across business systems.</description><pubDate>Thu, 07 May 2026 19:14:14 GMT</pubDate><category>agent policy</category><category>safety</category></item><item><title>HPE Deploys Autonomous Networking Agents</title><link>https://breakingagent.com/news/hpe-deploys-autonomous-networking-agents/</link><guid isPermaLink="true">https://breakingagent.com/news/hpe-deploys-autonomous-networking-agents/</guid><description>Self-driving agents optimize enterprise networks and cut tickets by 75%.</description><pubDate>Thu, 07 May 2026 19:14:14 GMT</pubDate><category>workflow</category><category>rpa</category></item><item><title>Palo Alto Acquires Portkey for Agent Security</title><link>https://breakingagent.com/news/palo-alto-acquires-portkey-for-agent-security/</link><guid isPermaLink="true">https://breakingagent.com/news/palo-alto-acquires-portkey-for-agent-security/</guid><description>Portkey&apos;s gateway protects autonomous agents processing trillions of tokens.</description><pubDate>Thu, 07 May 2026 19:14:14 GMT</pubDate><category>observability</category><category>safety</category></item><item><title>UiPath Adds Agentic Automation to Self-Hosted Suite</title><link>https://breakingagent.com/news/uipath-adds-agentic-automation-to-self-hosted-suite/</link><guid isPermaLink="true">https://breakingagent.com/news/uipath-adds-agentic-automation-to-self-hosted-suite/</guid><description>Agentic AI now available for on-prem environments in regulated sectors.</description><pubDate>Thu, 07 May 2026 19:14:14 GMT</pubDate><category>agent frameworks</category><category>rpa</category></item><item><title>[Research] Adaptation of Agentic AI: A Survey of Post-Training, Memory, and Skills</title><link>https://breakingagent.com/research/adaptation-of-agentic-ai-a-survey-of-post-training-memory-an/</link><guid isPermaLink="true">https://breakingagent.com/research/adaptation-of-agentic-ai-a-survey-of-post-training-memory-an/</guid><description>Comprehensive survey examining how agentic AI systems adapt through post-training, memory architectures, and skill acquisition for long-horizon task execution.</description><pubDate>Thu, 07 May 2026 18:56:32 GMT</pubDate><category>memory-architectures</category><category>long-horizon-tasks</category><category>adaptation</category><category>in-context-learning</category><category>skill-composition</category></item><item><title>[Research] How Agentic AI Changes the Economics of Enterprise Software</title><link>https://breakingagent.com/research/how-agentic-ai-changes-the-economics-of-enterprise-software/</link><guid isPermaLink="true">https://breakingagent.com/research/how-agentic-ai-changes-the-economics-of-enterprise-software/</guid><description>Research on how agentic coding systems reshape make-or-buy decisions by dramatically reducing development timelines and CAPEX for enterprise applications.</description><pubDate>Thu, 07 May 2026 18:56:32 GMT</pubDate><category>agent-evaluation</category><category>benchmarks</category><category>deployment-patterns</category><category>tool-use</category><category>real-world-applications</category></item><item><title>[Research] Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours</title><link>https://breakingagent.com/research/redefining-ai-red-teaming-in-the-agentic-era-from-weeks-to-h/</link><guid isPermaLink="true">https://breakingagent.com/research/redefining-ai-red-teaming-in-the-agentic-era-from-weeks-to-h/</guid><description>Framework for automating adversarial testing of agentic systems using AI-driven red teaming agents that generate workflows from 45+ attacks, 450+ transforms, and 130+ scorers.</description><pubDate>Thu, 07 May 2026 18:56:32 GMT</pubDate><category>safety</category><category>red-teaming</category><category>adversarial-testing</category><category>multi-agent-systems</category><category>automation</category></item><item><title>Clawdbot Open-Source Agent Drives Mac Mini Hardware Shortage</title><link>https://breakingagent.com/news/clawdbot-open-source-agent-drives-mac-mini-hardware-shortage/</link><guid isPermaLink="true">https://breakingagent.com/news/clawdbot-open-source-agent-drives-mac-mini-hardware-shortage/</guid><description>An open-source version of OpenClaw called Clawdbot went viral, causing Apple Mac Minis to sell out as users rushed to purchase always-on hardware for local agent deployment.</description><pubDate>Thu, 07 May 2026 18:52:44 GMT</pubDate><category>open-source</category><category>agent-deployment</category><category>hardware</category><category>privacy</category></item><item><title>[Research] Agentic Coding for SecOps: Torq Agentic Builder</title><link>https://breakingagent.com/research/agentic-coding-for-secops-torq-agentic-builder/</link><guid isPermaLink="true">https://breakingagent.com/research/agentic-coding-for-secops-torq-agentic-builder/</guid><description>Production-grade agentic AI system for security operations that transforms natural language intent into executable agents through contextual analysis, planning, and automated testing.</description><pubDate>Thu, 07 May 2026 18:45:39 GMT</pubDate><category>agent-engineering</category><category>planning</category><category>testing</category><category>deployment-patterns</category><category>tool-use</category><category>safety</category></item><item><title>[Research] 10 Agentic Commerce Research Papers Shaping the Future of Enterprise Product Discovery</title><link>https://breakingagent.com/research/10-agentic-commerce-research-papers-shaping-the-future-of-en/</link><guid isPermaLink="true">https://breakingagent.com/research/10-agentic-commerce-research-papers-shaping-the-future-of-en/</guid><description>Meta-analysis of 2025 agentic commerce research, including empirical findings on agent purchasing behavior, position bias, and the modular retrieval-first architectures that enable reliable shopping agents.</description><pubDate>Thu, 07 May 2026 18:45:39 GMT</pubDate><category>agent-evaluation</category><category>tool-use</category><category>multi-step-tasks</category><category>failure-modes</category><category>orchestration</category><category>retrieval-augmented</category></item><item><title>[Research] The Adoption and Usage of AI Agents</title><link>https://breakingagent.com/research/the-adoption-and-usage-of-ai-agents/</link><guid isPermaLink="true">https://breakingagent.com/research/the-adoption-and-usage-of-ai-agents/</guid><description>Comprehensive empirical study of agentic AI system adoption patterns, market sizing, and real-world deployment challenges across enterprise and consumer segments.</description><pubDate>Thu, 07 May 2026 18:45:39 GMT</pubDate><category>adoption</category><category>market-analysis</category><category>deployment-patterns</category><category>tool-use</category><category>multi-step-actions</category></item><item><title>[Research] AuditRepairBench: A Paired-Execution Trace Corpus for Evaluator-Channel Ranking Instability in Agent Repair</title><link>https://breakingagent.com/research/auditrepairbench-a-paired-execution-trace-corpus-for-evaluat/</link><guid isPermaLink="true">https://breakingagent.com/research/auditrepairbench-a-paired-execution-trace-corpus-for-evaluat/</guid><description>New benchmark exposes ranking instability in agent repair leaderboards due to evaluator reconfiguration, enabling more reliable evaluation of AI agent debugging capabilities.</description><pubDate>Thu, 07 May 2026 18:39:27 GMT</pubDate><category>agent evaluation</category><category>benchmarks</category><category>repair</category></item><item><title>ServiceTrade Unveils Stella AI Agents for Field Service</title><link>https://breakingagent.com/news/servicetrade-unveils-stella-ai-agents-for-field-service/</link><guid isPermaLink="true">https://breakingagent.com/news/servicetrade-unveils-stella-ai-agents-for-field-service/</guid><description>ServiceTrade launches Stella suite with Quote and Schedule agents to automate field operations.</description><pubDate>Thu, 07 May 2026 18:29:21 GMT</pubDate><category>field-service</category><category>automation</category></item><item><title>[Research] AgentEval: A Comprehensive Benchmark for Evaluating Long-Horizon Agentic Workflows with Real-World Failure Modes</title><link>https://breakingagent.com/research/agenteval-a-comprehensive-benchmark-for-evaluating-long-hori/</link><guid isPermaLink="true">https://breakingagent.com/research/agenteval-a-comprehensive-benchmark-for-evaluating-long-hori/</guid><description>New benchmark reveals critical gaps in agent tool-use reliability and proposes verifier architectures to boost success rates by 28% on multi-step tasks.</description><pubDate>Thu, 07 May 2026 18:21:47 GMT</pubDate><category>agent-evaluation</category><category>tool-use</category><category>planning</category><category>verifiers</category></item><item><title>[Research] Agentic AI for Robot Control: Flexible but still Fragile</title><link>https://breakingagent.com/research/agentic-ai-for-robot-control-flexible-but-still-fragile/</link><guid isPermaLink="true">https://breakingagent.com/research/agentic-ai-for-robot-control-flexible-but-still-fragile/</guid><description>Research on LLM-based agentic control systems for robots reveals architecture patterns for reasoning and execution, but exposes brittleness under real-world constraints.</description><pubDate>Thu, 07 May 2026 18:10:07 GMT</pubDate><category>robot-control</category><category>tool-use</category><category>planning</category><category>failure-modes</category><category>real-world-deployment</category><category>observability</category></item><item><title>[Research] REPRO-Bench: Can Agentic AI Systems Assess the Reproducibility of Social Science Research?</title><link>https://breakingagent.com/research/repro-bench-can-agentic-ai-systems-assess-the-reproducibilit/</link><guid isPermaLink="true">https://breakingagent.com/research/repro-bench-can-agentic-ai-systems-assess-the-reproducibilit/</guid><description>New benchmark evaluates whether agentic AI can reliably assess reproducibility in social science papers, revealing key strengths and failure modes.</description><pubDate>Thu, 07 May 2026 18:09:32 GMT</pubDate><category>agent evaluation</category><category>benchmarks</category></item><item><title>[Research] ACON: Optimizing Context Compression for Long-horizon LLM Agents</title><link>https://breakingagent.com/research/acon-optimizing-context-compression-for-long-horizon-llm-age/</link><guid isPermaLink="true">https://breakingagent.com/research/acon-optimizing-context-compression-for-long-horizon-llm-age/</guid><description>A new method for compressing context in long-horizon LLM agents to reduce token overhead while maintaining planning performance.</description><pubDate>Thu, 07 May 2026 18:04:06 GMT</pubDate><category>context-compression</category><category>long-horizon-planning</category><category>memory-efficiency</category><category>token-optimization</category></item><item><title>[Research] CriticFlow: Multi-Agent Verifier Orchestration for Robust Long-Horizon Agent Planning</title><link>https://breakingagent.com/research/criticflow-multi-agent-verifier-orchestration-for-robust-lon/</link><guid isPermaLink="true">https://breakingagent.com/research/criticflow-multi-agent-verifier-orchestration-for-robust-lon/</guid><description>New multi-agent verification framework dramatically improves planning reliability in long-horizon tasks through dynamic critic handoff and failure prediction.</description><pubDate>Thu, 07 May 2026 17:52:37 GMT</pubDate><category>planning</category><category>multi-agent</category><category>verification</category><category>long-horizon</category></item><item><title>[Research] Anemoi Agent: A2A Communication for Scalable Multi-Agent Coordination</title><link>https://breakingagent.com/research/anemoi-agent-a2a-communication-for-scalable-multi-agent-coor/</link><guid isPermaLink="true">https://breakingagent.com/research/anemoi-agent-a2a-communication-for-scalable-multi-agent-coor/</guid><description>Agent-to-agent communication server replaces context-stuffing with direct coordination, achieving 52.73% accuracy on GAIA with smaller models.</description><pubDate>Thu, 07 May 2026 17:46:33 GMT</pubDate><category>multi-agent-coordination</category><category>agent-communication</category><category>cost-optimization</category><category>planning</category><category>benchmark-evaluation</category></item><item><title>[Research] CriticLM: A Verifier for Reliable Agentic Planning</title><link>https://breakingagent.com/research/criticlm-a-verifier-for-reliable-agentic-planning/</link><guid isPermaLink="true">https://breakingagent.com/research/criticlm-a-verifier-for-reliable-agentic-planning/</guid><description>New benchmark and LLM-based critic architecture that catches 73% more planning errors in long-horizon agent tasks than prior verification methods.</description><pubDate>Thu, 07 May 2026 09:36:12 GMT</pubDate><category>planning</category><category>verifier</category><category>eval</category><category>reliability</category></item><item><title>CORAS.ai Ships Agentic Reporting for Defense, Replaces BI Tools</title><link>https://breakingagent.com/news/coras-ai-ships-agentic-reporting-for-defense-replaces-bi-too/</link><guid isPermaLink="true">https://breakingagent.com/news/coras-ai-ships-agentic-reporting-for-defense-replaces-bi-too/</guid><description>CORAS.ai launches agentic AI reporting platform on May 5, consolidating defense BI systems into one IL5 tool.</description><pubDate>Thu, 07 May 2026 08:38:53 GMT</pubDate><category>infrastructure</category><category>defense</category></item><item><title>Anthropic Secures xAI&apos;s Colossus-1 Compute in Surprise Cross-Rival Deal</title><link>https://breakingagent.com/news/anthropic-spacex-xai-colossus-compute-deal/</link><guid isPermaLink="true">https://breakingagent.com/news/anthropic-spacex-xai-colossus-compute-deal/</guid><description>Anthropic has signed an agreement with SpaceX to access all 300MW of compute capacity at xAI&apos;s Colossus 1 data centre in Memphis, immediately raising usage limits for Claude Pro, Max, and API subscribers.</description><pubDate>Wed, 06 May 2026 16:41:40 GMT</pubDate><category>anthropic</category><category>xai</category><category>spacex</category><category>compute</category><category>infrastructure</category><category>claude</category></item><item><title>autogen python-v0.7.5 released</title><link>https://breakingagent.com/news/autogen-python-v0-7-5-release/</link><guid isPermaLink="true">https://breakingagent.com/news/autogen-python-v0-7-5-release/</guid><description>AutoGen v0.7.5 adds linear memory support in RedisMemory, enabling more scalable and efficient long‑running agent conversations. It also introduces thinking mode for the Anthropic client and fixes several streaming, tool‑call, and correlation issues that improve reliability and performance for agent builders.</description><pubDate>Wed, 06 May 2026 10:05:50 GMT</pubDate><category>autogen</category><category>releases</category></item><item><title>crewai 1.14.4 released</title><link>https://breakingagent.com/news/crewai-1-14-4-release/</link><guid isPermaLink="true">https://breakingagent.com/news/crewai-1-14-4-release/</guid><description>CrewAI 1.14.4 introduces enhanced cloud provider support with custom persistence keys for @persist, Responses API for Azure OpenAI, and new search/research tools via Tavily and You.com MCP integration. The release also includes critical bug fixes for JSON parsing, tool call preservation, and multimodal input handling, improving reliability for production agent deployments.</description><pubDate>Wed, 06 May 2026 10:05:50 GMT</pubDate><category>crewai</category><category>releases</category></item><item><title>langgraph sdk==0.3.14 released</title><link>https://breakingagent.com/news/langgraph-sdk-0-3-14-release/</link><guid isPermaLink="true">https://breakingagent.com/news/langgraph-sdk-0-3-14-release/</guid><description>LangGraph SDK 0.3.14 introduces a `return_minimal` parameter for threads update operations, enabling more efficient API responses for AI agent builders. The release also includes streaming transformer infrastructure and support for `stream_events(version=&apos;v3&apos;)` on Pregel, providing enhanced control over event streaming in agent workflows.</description><pubDate>Wed, 06 May 2026 10:05:50 GMT</pubDate><category>langgraph</category><category>releases</category></item><item><title>letta 0.16.7 released</title><link>https://breakingagent.com/news/letta-0-16-7-release/</link><guid isPermaLink="true">https://breakingagent.com/news/letta-0-16-7-release/</guid><description>Letta 0.16.7 raises the default global context window from 32k to 128k and fixes the context window reset bug, with a completely overhauled compaction system that eliminates most manual configuration workarounds for self-hosted users. Block limits are no longer enforced, allowing blocks to grow freely, though users must now manage block size through alternative means if they were previously relying on limits to control per-turn costs.</description><pubDate>Wed, 06 May 2026 10:05:50 GMT</pubDate><category>letta</category><category>releases</category></item><item><title>Anthropic Zero-Day Flaw Exposes 200K AI Agent Servers</title><link>https://breakingagent.com/news/anthropic-zero-day-flaw-exposes-200k-ai-agent-servers/</link><guid isPermaLink="true">https://breakingagent.com/news/anthropic-zero-day-flaw-exposes-200k-ai-agent-servers/</guid><description>Critical vulnerability in Anthropic&apos;s Model Context Protocol triggers $25B security overhaul with Amazon.</description><pubDate>Tue, 05 May 2026 22:09:04 GMT</pubDate><category>security</category><category>vulnerability</category><category>infrastructure</category></item><item><title>NVIDIA Launches Nemotron 3 Nano Omni Unified Agent Model</title><link>https://breakingagent.com/news/nvidia-launches-nemotron-3-nano-omni-unified-agent-model/</link><guid isPermaLink="true">https://breakingagent.com/news/nvidia-launches-nemotron-3-nano-omni-unified-agent-model/</guid><description>NVIDIA releases Nemotron 3 Nano Omni, unifying vision, audio, and language for faster AI agent processing.</description><pubDate>Tue, 05 May 2026 22:09:04 GMT</pubDate><category>model-release</category><category>multi-modal</category><category>agents</category></item><item><title>Anthropic moves Computer Use out of beta, ships native sandbox primitive</title><link>https://breakingagent.com/news/anthropic-computer-use-ga/</link><guid isPermaLink="true">https://breakingagent.com/news/anthropic-computer-use-ga/</guid><description>Claude&apos;s screen-grounded agent loop graduates with new tool-use primitives, an isolated sandbox, and tighter rate-limit policy for production deployments.</description><pubDate>Wed, 22 Apr 2026 09:30:00 GMT</pubDate><category>anthropic</category><category>computer-use</category><category>browser-agents</category><category>sandbox</category></item><item><title>OpenAI ships Swarm 2 with built-in handoff tracing and per-agent budgets</title><link>https://breakingagent.com/news/openai-swarm-2-multi-agent/</link><guid isPermaLink="true">https://breakingagent.com/news/openai-swarm-2-multi-agent/</guid><description>Swarm 2 introduces a structured handoff log, hard token budgets per agent, and an interoperability shim for LangGraph and CrewAI.</description><pubDate>Sun, 19 Apr 2026 16:05:00 GMT</pubDate><category>openai</category><category>multi-agent</category><category>orchestration</category><category>tracing</category></item><item><title>[Research] Reflexion, three years on: what self-critique still buys you</title><link>https://breakingagent.com/research/reflexion-revisited/</link><guid isPermaLink="true">https://breakingagent.com/research/reflexion-revisited/</guid><description>A meta-analysis of 41 papers building on Reflexion-style self-critique loops finds modest, durable gains in coding and tool-use, and diminishing returns in open-ended reasoning.</description><pubDate>Sat, 18 Apr 2026 10:00:00 GMT</pubDate><category>self-critique</category><category>reflexion</category><category>meta-analysis</category></item><item><title>Google opens Gemini Agent SDK with first-party MCP server registry</title><link>https://breakingagent.com/news/google-gemini-agent-sdk/</link><guid isPermaLink="true">https://breakingagent.com/news/google-gemini-agent-sdk/</guid><description>The Agent SDK ships with a curated MCP registry, native long-running task support, and managed memory tied to Vertex AI.</description><pubDate>Wed, 15 Apr 2026 11:00:00 GMT</pubDate><category>google</category><category>gemini</category><category>mcp</category><category>sdk</category></item><item><title>[Research] Long-horizon memory: survey of seven architectures, ranked by recall and cost</title><link>https://breakingagent.com/research/long-horizon-memory-survey/</link><guid isPermaLink="true">https://breakingagent.com/research/long-horizon-memory-survey/</guid><description>Compares episodic, semantic, hybrid, and graph-based memory across realistic 30-day agent simulations. Hybrid stores win on recall; graph stores win on cost stability.</description><pubDate>Tue, 14 Apr 2026 09:30:00 GMT</pubDate><category>memory</category><category>long-horizon</category><category>survey</category></item><item><title>SWE-bench Verified hits 78%, prompting calls for a harder coding eval</title><link>https://breakingagent.com/news/swe-bench-verified-saturated/</link><guid isPermaLink="true">https://breakingagent.com/news/swe-bench-verified-saturated/</guid><description>Top coding agents now resolve more than three of every four tasks in SWE-bench Verified, reigniting debate over whether the benchmark still discriminates between systems.</description><pubDate>Sun, 12 Apr 2026 08:00:00 GMT</pubDate><category>benchmarks</category><category>evaluation</category><category>coding-agents</category></item><item><title>EU AI Office issues draft guidance on autonomous agent disclosures</title><link>https://breakingagent.com/news/eu-ai-act-agent-guidance/</link><guid isPermaLink="true">https://breakingagent.com/news/eu-ai-act-agent-guidance/</guid><description>The draft requires clear disclosure when agents act on a user&apos;s behalf in regulated transactions, plus an audit log requirement for high-risk deployments.</description><pubDate>Thu, 09 Apr 2026 14:25:00 GMT</pubDate><category>regulation</category><category>eu-ai-act</category><category>governance</category><category>compliance</category></item><item><title>[Research] Six failure modes in tool-using agents, and the patterns that fix them</title><link>https://breakingagent.com/research/tool-use-failure-modes/</link><guid isPermaLink="true">https://breakingagent.com/research/tool-use-failure-modes/</guid><description>An empirical taxonomy of agent tool-use failures across 4,000 traces from production deployments. Schema drift and silent partial-failure dominate.</description><pubDate>Wed, 08 Apr 2026 13:15:00 GMT</pubDate><category>tool-use</category><category>failure-modes</category><category>production</category></item><item><title>[Research] Decoupled planner-critic agents outperform monolithic planners on long tasks</title><link>https://breakingagent.com/research/planner-critic-decoupling/</link><guid isPermaLink="true">https://breakingagent.com/research/planner-critic-decoupling/</guid><description>Splitting planning and critique into specialized models with structured exchange yields a 14-point lift on multi-day research tasks.</description><pubDate>Sat, 04 Apr 2026 10:00:00 GMT</pubDate><category>planning</category><category>critic</category><category>architecture</category></item><item><title>[Research] The case for replay-based agent evaluation</title><link>https://breakingagent.com/research/agent-eval-replay-sets/</link><guid isPermaLink="true">https://breakingagent.com/research/agent-eval-replay-sets/</guid><description>Static benchmarks miss the failure modes that matter in production. This paper argues for replay sets — captured user sessions scored against a held-out outcome.</description><pubDate>Mon, 30 Mar 2026 08:45:00 GMT</pubDate><category>evaluation</category><category>replay</category><category>production</category></item></channel></rss>