// Agent Primitives //

This is a really interesting take on building effective multi-agent systems.

Multi-agent systems get more complex as tasks get harder: more roles, more prompts, more bespoke interaction patterns. Yet the core computation patterns keep repeating across every system: review, vote, plan, execute. Nobody treats these patterns as reusable building blocks.

This new research introduces Agent Primitives, a set of latent building blocks for constructing effective multi-agent systems. Inspired by how neural networks are built from reusable components like residual blocks and attention heads, the researchers decompose multi-agent architectures into three recurring primitives: Review, Voting and Selection, and Planning and Execution.

What makes these primitives different? Agents inside each primitive communicate via the KV-cache rather than natural language. This avoids the information degradation that happens when agents pass long text messages back and forth across multi-stage interactions.

An Organizer agent selects and composes primitives for each query, guided by a lightweight knowledge pool of previously successful configurations. No manual system design required.

The results across eight benchmarks spanning math, code generation, and QA, with five open-source LLMs:

> Primitives-based MAS improve average accuracy by 12.0-16.5% over single-agent baselines
> On GPQA-Diamond, the improvement is striking: 53.2% versus the 33.6-40.2% range of prior methods like AgentVerse, DyLAN, and MAS-GPT

In terms of efficiency, token usage and inference latency drop by approximately 3-4x compared to text-based MAS, while incurring only 1.3-1.6x overhead relative to single-agent inference.

Instead of designing task-specific multi-agent architectures from scratch, Agent Primitives show that a small set of reusable computation patterns with latent communication can match or exceed custom systems while being dramatically more efficient.

Paper: ...
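To make the Organizer idea concrete, here is a minimal toy sketch of primitives as composable stages and a knowledge pool mapping task types to previously successful pipelines. All names and the string-passing interface are illustrative assumptions; the paper's primitives are LLM agent subroutines communicating via KV-cache, not text-returning functions.

```python
from typing import Callable, Dict, List

# A primitive is a composable stage: input state in, transformed state out.
Primitive = Callable[[str], str]

# Toy stand-ins for the three primitives (hypothetical; the real ones run
# LLM agents exchanging latent KV-cache states rather than strings).
def plan_and_execute(query: str) -> str:
    return f"draft({query})"

def review(draft: str) -> str:
    return f"reviewed({draft})"

def vote_and_select(candidates: str) -> str:
    return f"selected({candidates})"

# Knowledge pool: previously successful primitive configurations per task type
# (entries here are made up for illustration).
KNOWLEDGE_POOL: Dict[str, List[Primitive]] = {
    "math": [plan_and_execute, review],
    "code": [plan_and_execute, review, vote_and_select],
    "qa":   [plan_and_execute, vote_and_select],
}

def organizer(task_type: str, query: str) -> str:
    """Pick a primitive pipeline from the pool and run it over the query."""
    pipeline = KNOWLEDGE_POOL.get(task_type, [plan_and_execute])
    state = query
    for primitive in pipeline:
        state = primitive(state)
    return state

print(organizer("math", "2+2"))  # reviewed(draft(2+2))
```

The point of the sketch is the composition pattern: because every primitive shares one interface, the Organizer can assemble arbitrary pipelines per query instead of the system designer hand-wiring a fixed architecture.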