My Agents Talked Too Much
Last week I noticed something strange. My Hermes agents — Charly, Sara, Fred, and the rest — were having conversations that looked more like polite dinner party chatter than efficient machine communication. Every time they passed a task, they’d wrap it in pleasantries, clarifications, and natural language padding. It worked. But it wasn’t efficient.
When you run a multi-agent homelab, every token counts. Not just for cost — my local Ollama setup handles the volume fine — but for speed and precision. A 30-second agent-to-agent handshake becomes 45 seconds when half the tokens are filler. And filler introduces ambiguity. “I think maybe we should…” is not how machines should talk to machines.
The 80% Problem
I did the math on a typical agent-to-agent exchange. Out of ~120 tokens in a natural language handoff, roughly 65% was filler — articles, politeness, restatement of context both agents already shared. Structured function calls in JSON? That would take what, 30 tokens? Maybe 25 with the right design.
That’s when I started sketching what became HACP — the Hermes Agent Communication Protocol. Not another competing standard (there are already three things called “ACP” and Google’s A2A), but a focused, ultra-compact JSON format designed for communication inside my own ecosystem.
The idea is simple: every message has a type (t), a context (ctx), and optionally a confidence score, payload, and routing metadata. A proposal takes 12 fields max. An agreement takes 3. Total: ~25 tokens instead of ~120. That’s an 80% reduction — and more importantly, zero ambiguity. The receiving agent parses fields, not paragraphs.
Why Not Just Use A2A?
Good question. Google’s A2A protocol (now under the Linux Foundation) is excellent for what it does: enable agents from different ecosystems to discover each other, delegate tasks, and stream results. It’s the cross-framework interoperability layer — think of it as HTTP for agents. I’ve documented it extensively in my vault (see my A2A research from June 23).
But A2A has overhead by design. It needs to work across frameworks, vendors, and trust boundaries. That means 200+ tokens per message, JSON-RPC wrappers, agent discovery cards, and authentication schemes. Inside my homelab, where all agents run under the same Hermes umbrella and trust each other natively, that overhead is wasted.
The insight: A2A and HACP solve different problems. A2A is interoperable inter-framework communication. HACP is maximally compressed intra-ecosystem communication. They’re complementary, not competing — and I designed HACP with an A2A-compatibility mode so my bridge can translate between them when needed.
What the Protocol Looks Like
Here’s the core idea. An agent proposing an action sends this:
{"t": "proposal", "ctx": "pulse-protocol-v1.1", "s": 0.85, "p": {"action": "deploy", "target": "staging"}, "rq": "review", "src": "hermes", "dst": "charly"}
That’s 11 fields. The receiving agent sends back:
{"t": "agree", "ctx": "pulse-protocol-v1.1", "p": {"with_id": "msg_123"}, "src": "charly"}
Four fields. Under 25 tokens. And completely unambiguous: the t field tells you the intent at a glance, the s field communicates confidence, and structured payloads eliminate interpretation errors.
I also built a hybrid model: HACP for factual exchanges (status, data, proposals, accept/reject), natural language for creative work (brainstorms, complex trade-offs, open discussion). The bridge automatically detects the format and routes accordingly.
The Bigger Picture
This isn’t really about one protocol. It’s about thinking in layers. My vault (the knowledge layer), my agents (the action layer), the tools they use (MCP for tool access, A2A for cross-ecosystem, HACP for internal efficiency) — each layer has a clean interface and a specific purpose. That’s how good systems are built.
I’m still iterating on HACP. The next step is to teach my agents to speak it natively via system prompts, then measure actual token savings in production. But even as a concept, I’ve already learned something valuable: sometimes the best solution to a problem isn’t adopting the biggest standard — it’s building the smallest one that actually fits.
What’s your take? Do you use structured protocols for agent communication, or do your agents chat in natural language? Drop a comment below — I’d love to hear how others are solving this.
Wat vond je van dit bericht?