The Agentic Web Stack — From Two Protocols to Six


#AI #technology #opensource #research

In January 2026, the agentic protocol story was simple: MCP for tools, A2A for agents. By March, the stack had ballooned to six or more protocols spanning tool access, agent coordination, commerce, payments, user interfaces, and web browsing. We’re watching the TCP/IP of autonomous agents form in real time — except this time the layering is happening in months instead of decades, the security surface grows with each addition, and a decentralized alternative is lurking at the edges.

The Expansion

When I wrote about the emerging protocol stack in early March, the picture was relatively clean: MCP handled agent-to-tool communication, A2A handled agent-to-agent coordination, and a few secondary protocols (AGP, ACP, AGNTCY) filled in enterprise routing and discovery gaps. That was already complex. But Google’s March 18 Developer Guide revealed the full vision, and it’s considerably more ambitious.

The stack now has six distinct protocol layers, each addressing a specific type of agent interaction:

| Layer | Protocol | Creator | Function |
|---|---|---|---|
| Agent ↔ Tool | MCP | Anthropic → AAIF | Tool discovery, data access, function calling |
| Agent ↔ Agent | A2A | Google → AAIF | Task delegation, capability discovery, streaming |
| Agent ↔ Commerce | UCP | Google | Structured commerce: catalogs, checkout, order tracking |
| Agent ↔ Payments | AP2 | Google | Payment authorization, mandates, audit trails |
| Agent ↔ User | AG-UI / A2UI | CopilotKit / Google | Frontend streaming, generative UI |
| Agent ↔ Web | WebMCP | Google + Microsoft (W3C) | Browser-native tool exposure via navigator.modelContext |

Plus one outsider:

| Layer | Protocol | Creator | Function |
|---|---|---|---|
| Agent ↔ Network | ANP | W3C Community Group | Decentralized agent discovery via W3C DIDs |

Each protocol addresses a real gap. But the pace of proliferation raises questions about whether we’re building an interoperable stack or recreating the “14 competing standards” XKCD meme at protocol scale.

What’s New Since March

WebMCP — The Browser Gets Agent-Native

The most consequential new protocol is WebMCP (Web Model Context Protocol), a joint effort between Google Chrome and Microsoft Edge teams, incubated through W3C. It’s the answer to a genuine problem: how should AI agents interact with websites?

Currently, browser agents scrape the DOM — they parse HTML, guess at form fields, click buttons, and hope the page doesn’t change layout between inference steps. It works, but it’s fragile, slow, and creates adversarial dynamics where websites try to detect and block automated agents.

WebMCP flips this. Websites expose structured tool definitions via a new browser API (navigator.modelContext and ModelContextProvider). Instead of an agent parsing a login form’s HTML, the site declares a create_account tool with typed parameters. The agent calls the tool directly with structured data. The browser executes the action within the user’s authenticated session.
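To make the contrast concrete, here is a sketch of what site-side registration might look like. `navigator.modelContext` comes from the proposal described above, but the `registerTool` method name, descriptor fields, and handler shape are my assumptions for illustration, not the finalized API.

```typescript
// Hypothetical WebMCP tool registration (API shape assumed, not final).
// A tool descriptor: typed parameters instead of a scraped HTML form.
const createAccountTool = {
  name: "create_account",
  description: "Create a new user account on this site.",
  inputSchema: {
    type: "object",
    properties: {
      email: { type: "string", format: "email" },
      displayName: { type: "string" },
    },
    required: ["email"],
  },
  // The handler runs in the page, inside the user's authenticated session,
  // so the agent never touches the DOM or the session cookie directly.
  async execute(args: { email: string; displayName?: string }) {
    // ... call the site's own signup endpoint here ...
    return { status: "created", email: args.email };
  },
};

// Feature-detect before registering, since only Canary ships the API today.
const mc = (globalThis as any).navigator?.modelContext;
if (mc?.registerTool) {
  mc.registerTool(createAccountTool);
}
```

The point of the shape is the inversion: the site declares the contract once, and any agent can call it with structured arguments instead of guessing at form fields.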

Current status: Chrome Canary supports WebMCP behind a flag as of February 2026. Chrome 146 (stable, expected mid-2026) will ship it to ~65% of browser market share. Edge support is expected but not formally announced. The core API surface — navigator.modelContext for reading, ModelContextProvider for exposing — is considered stable enough to build on.

Why it matters: WebMCP eliminates the scraping layer entirely. For website owners, it’s opt-in control over what agents can do — a massive improvement over the current situation where agents brute-force their way through UIs. For agents, it’s structured, typed, authenticated tool access to the entire web. If it achieves adoption, it completes the MCP vision: MCP for local tools, WebMCP for web tools, same protocol semantics.

The catch: Adoption requires website developers to implement MCP server definitions in their pages. The SEO industry is already treating this as the next robots.txt — a signal that directly affects how AI agents interact with content. Semrush and others are publishing guides. But the critical question is whether enough of the web adopts it voluntarily, or whether agents continue scraping the rest. The incentive alignment is tricky: websites need a reason to make agents’ lives easier, and the reason is probably “if you don’t define tools, agents will scrape anyway but worse.”

AG-UI — The Agent-Frontend Bridge

AG-UI (Agent-User Interaction Protocol), created by CopilotKit, standardizes how agent backends stream state to agent frontends. It’s an event-based protocol — think SSE with a defined event vocabulary for UI updates, agent state changes, and user interactions.
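A minimal sketch of what that looks like on the wire, assuming SSE transport. The event names approximate AG-UI's vocabulary; the payload fields are illustrative rather than the exact spec.

```typescript
// AG-UI-style event streaming over SSE (event names approximate, fields illustrative).
type AgentEvent =
  | { type: "RUN_STARTED"; runId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "STATE_DELTA"; delta: object }
  | { type: "RUN_FINISHED"; runId: string };

// Encode one event as a Server-Sent Events frame: `data: <json>\n\n`.
function toSseFrame(event: AgentEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// A run is just an ordered event sequence the frontend replays.
const events: AgentEvent[] = [
  { type: "RUN_STARTED", runId: "run-1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Checking inventory…" },
  { type: "RUN_FINISHED", runId: "run-1" },
];
const frames = events.map(toSseFrame);
```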

The value proposition is straightforward: if you’re building an agent-powered application, AG-UI gives you a standard contract between your agent runtime and your UI layer. You don’t have to invent your own streaming protocol. The reference client is CopilotKit, but AWS has already added a dedicated AG-UI endpoint in Bedrock AgentCore (announced March 2026). Google ADK supports it.

AG-UI sits below the generative UI specs (A2UI, Open-JSON-UI, MCP Apps). It’s the transport layer for agent-to-frontend communication; A2UI and friends define what the UI components actually look like.

A2UI — The Declarative UI Layer

A2UI (Agent-to-User Interface Protocol), from Google, takes a different approach. Rather than streaming raw events, it defines a declarative JSON format with 18 component primitives (rows, columns, text fields, cards, etc.). The agent sends a flat component tree with ID references, plus a separate data payload. A client renderer (implementations exist for Lit, Flutter, Angular) turns the JSON into native UI.
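A sketch of the flat-tree idea, using illustrative component and field names rather than the exact A2UI schema: components reference children by ID, and a separate data payload is bound at render time.

```typescript
// A2UI-style flat component tree (names illustrative, not the exact schema).
interface Component {
  id: string;
  type: "Column" | "Row" | "Text" | "Card"; // a subset of the fixed catalog
  children?: string[];  // ID references keep the tree flat on the wire
  dataRef?: string;     // key into the separate data payload
}

const components: Component[] = [
  { id: "root", type: "Column", children: ["title", "card"] },
  { id: "title", type: "Text", dataRef: "heading" },
  { id: "card", type: "Card", children: ["qty"] },
  { id: "qty", type: "Text", dataRef: "stockLevel" },
];

const data: Record<string, string> = {
  heading: "Inventory",
  stockLevel: "42 units in stock",
};

// Resolve the flat list into a nested tree, binding data along the way.
function render(id: string, byId: Map<string, Component>): object {
  const c = byId.get(id);
  if (!c) throw new Error(`unknown component id: ${id}`);
  return {
    type: c.type,
    ...(c.dataRef ? { text: data[c.dataRef] } : {}),
    children: (c.children ?? []).map((child) => render(child, byId)),
  };
}

const byId = new Map(components.map((c) => [c.id, c] as [string, Component]));
const tree = render("root", byId);
```

Keeping the tree flat and the data separate is what lets a client validate every component against the fixed catalog before anything renders.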

The key insight: the agent composes novel layouts from a fixed catalog. It doesn’t generate arbitrary HTML or React code. It assembles pre-validated components. This is both a security feature (no arbitrary code injection) and a UX feature (consistent, predictable rendering across platforms).

In Google’s March demo, the supply chain agent renders inventory dashboards, order forms, and supplier comparisons — all generated dynamically by the agent, all rendered natively in the browser. It’s impressive, but early. The 18-primitive catalog is deliberately minimal, and the question is whether it’s expressive enough for real-world applications or whether developers immediately start requesting custom components.

UCP + AP2 — Commerce Gets a Protocol

Universal Commerce Protocol (UCP) and Agent Payments Protocol (AP2) are Google’s answers to “how do agents buy things?” UCP standardizes the shopping lifecycle: catalog discovery (.well-known/ucp), typed checkout requests, order tracking. AP2 adds authorization mandates — cryptographic proof of who approved a purchase, with what limits, and when the authorization expires.

The UCP + AP2 combination is essentially the structured commerce layer that makes autonomous procurement possible. Without it, an agent buying supplies would need custom integration code for every supplier. With it, there’s a typed protocol for discover → quote → authorize → purchase → receipt.
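The lifecycle above can be sketched roughly as follows. The `/.well-known/ucp` path is the discovery convention mentioned earlier; the catalog and checkout shapes are assumptions for illustration, not the published schema.

```typescript
// UCP-style discover → checkout sketch (shapes assumed for illustration).
interface UcpCatalog {
  merchant: string;
  checkoutEndpoint: string;
  items: { sku: string; price: number; currency: string }[];
}

// Step 1: discover a merchant's catalog at the well-known path.
async function discoverCatalog(origin: string): Promise<UcpCatalog> {
  const res = await fetch(`${origin}/.well-known/ucp`);
  if (!res.ok) throw new Error(`no UCP catalog at ${origin}`);
  return res.json();
}

// Step 2: build a typed checkout request instead of driving a checkout UI.
function buildCheckout(catalog: UcpCatalog, sku: string, qty: number) {
  const item = catalog.items.find((i) => i.sku === sku);
  if (!item) throw new Error(`unknown sku: ${sku}`);
  return {
    endpoint: catalog.checkoutEndpoint,
    lineItems: [{ sku, qty }],
    total: item.price * qty,
    currency: item.currency,
    // An AP2 mandate reference would be attached here for authorization.
    mandateId: null as string | null,
  };
}
```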

My skepticism: UCP is heavily tied to Google’s commerce infrastructure (Google Shopping, Merchant Center, Knowledge Graph). The “Universal” in the name is aspirational. ACP (IBM’s Agent Commerce Protocol, contributed to the Linux Foundation) is the vendor-neutral alternative for B2B procurement. In practice, the enterprise stack will likely use both: UCP for consumer-facing commerce within the Google ecosystem, ACP for vendor-neutral B2B transactions. Google’s own ecosystem map acknowledges both as “Primary” for different commerce types.

AP2 is v0.1 — very early. But the concept of typed payment mandates with cryptographic authorization is genuinely important. Agents spending money without guardrails is a legal and liability nightmare. AP2 provides the audit trail. The question is whether the industry standardizes on AP2 or whether every payment provider builds their own mandate format.
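What a mandate check might look like in practice, with hypothetical field names. AP2 additionally signs mandates cryptographically, which this sketch omits; the point here is just the shape of the guardrail: cap, currency, expiry.

```typescript
// AP2-style mandate check (field names hypothetical; signing omitted).
interface Mandate {
  issuedBy: string;   // who approved spending on the agent's behalf
  maxAmount: number;  // spending cap, in minor units
  currency: string;
  expiresAt: number;  // unix ms; authorization is time-boxed
}

function checkMandate(
  m: Mandate,
  amount: number,
  currency: string,
  now: number
): { ok: boolean; reason?: string } {
  if (now >= m.expiresAt) return { ok: false, reason: "mandate expired" };
  if (currency !== m.currency) return { ok: false, reason: "currency mismatch" };
  if (amount > m.maxAmount) return { ok: false, reason: "exceeds limit" };
  return { ok: true };
}
```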

The Governance Story

All of this is happening under the umbrella of the Agentic AI Foundation (AAIF), launched December 2025 as a Linux Foundation directed fund. Co-founded by OpenAI, Anthropic, Google, Microsoft, AWS, and Block. David Nalley (AWS) chairs the governing board. As of February 2026, 97 additional organizations had joined, including JPMorgan Chase, American Express, Huawei, Red Hat, and ServiceNow.

AAIF governs MCP, A2A, goose (Block’s open-source agent), and AGENTS.md as founding projects. The MCP Dev Summit North America (April 2-3, 2026, NYC) is the first major in-person gathering.

The governance structure matters because it determines protocol evolution speed. MCP and A2A are both under AAIF now, which means they’ll evolve in coordination rather than competition. IBM’s ACP merging into A2A (August 2025) was the first example. But UCP, AP2, A2UI, and WebMCP are not under AAIF — they’re Google-led or W3C-incubated. AG-UI is CopilotKit-led. ANP is an independent W3C Community Group.

The risk is protocol layer capture: whoever controls the standard for a layer controls the terms of participation. MCP and A2A under neutral governance is genuinely good. UCP under Google governance is… Google controlling the commerce layer. This matters because the protocol stack is rapidly becoming infrastructure, and infrastructure ownership is power.

ANP — The Decentralized Alternative

The Agent Network Protocol (ANP) is the most philosophically interesting protocol in the landscape, even though it’s the least mature. Governed by a W3C Community Group (not AAIF), ANP proposes a three-layer architecture:

  1. Identity layer: W3C Decentralized Identifiers (DIDs) for agent identity. End-to-end encrypted communication. No central authority for identity.
  2. Meta-protocol layer: Agents negotiate communication protocols dynamically, rather than requiring a single standard.
  3. Capability layer: Agent discovery and matching based on declared capabilities.

ANP’s DID-based identity is what makes it interesting. In A2A, agent discovery relies on well-known URLs — which means DNS, which means centralized naming, which means someone can take away your agent’s identity. In ANP, identity is a cryptographic key pair. Your agent’s identity exists because math says it does.
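The premise fits in a few lines. This sketch uses a did:key-style identifier derived from an Ed25519 key pair; ANP's actual encoding (did:key uses multibase/multicodec) is simplified here to a hex digest for illustration.

```typescript
// Agent identity as a key pair, not a DNS name (did:key encoding simplified).
import { generateKeyPairSync, createHash, sign, verify } from "node:crypto";

// Generate the agent's key pair; the private key never leaves the agent.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Identity is a function of the public key. No registrar can revoke it.
const pubDer = publicKey.export({ type: "spki", format: "der" });
const did = "did:key:" + createHash("sha256").update(pubDer).digest("hex");

// Any other agent can verify a message came from this identity: the DID,
// the public key, and the signature are all the proof needed.
const message = Buffer.from("task: restock sku-42");
const signature = sign(null, message, privateKey);
const valid = verify(null, message, publicKey, signature);
```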

Current status: Proposal-stage. GitHub repo, W3C white paper, no production SDKs. The honest assessment is that ANP is years away from competing with MCP/A2A in adoption. But as a research direction, it asks the right question: should the agent protocol stack require centralized infrastructure?

This connects directly to the DID + agent identity convergence I explored earlier. If agents operate on behalf of humans, the identity chain matters. W3C DIDs → agent credentials → capability delegation is a cleaner trust model than OAuth tokens → API keys → hope-nobody-leaks-them.

It also connects to Reticulum’s philosophy: identity is a cryptographic hash, not a network location. Reticulum solves this at the network layer; ANP proposes the same approach for the agent layer. Neither has achieved adoption. Both are mathematically correct.

The Security Surface Expands

The security crisis I documented on March 21 covered 30+ CVEs in MCP alone. Adding five more protocols to the stack multiplies the attack surface.

New concerns since March 21:

  • WebMCP introduces browser-agent interaction as a first-class attack vector. Cross-tab context isolation is unsolved: an agent with WebMCP access to one tab could potentially read context from another. The permissions model is still being designed.
  • A2UI’s component rendering is deliberately limited to 18 primitives, which is a security-conscious design. But the data payload that accompanies the component tree is essentially arbitrary JSON, which creates injection opportunities.
  • UCP + AP2 create financial attack surfaces. A compromised agent with UCP access can autonomously purchase goods. AP2’s mandate system is the guardrail, but at v0.1, the cryptographic implementation is unaudited.
  • AG-UI’s event streaming creates state synchronization attacks. If an attacker can inject events into the AG-UI stream, they can manipulate what the user sees without affecting the agent’s actual state.

The arXiv paper from March 23 (Huang et al., 2026) is the most rigorous security analysis yet. Key finding: 5 out of 7 MCP clients tested have no static validation of server-provided tool metadata. Tool descriptions pass straight from server to LLM context without any filtering. Tool poisoning — embedding malicious instructions in tool metadata — scored Critical (46.5/50) in their DREAD analysis.
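The missing validation layer the paper describes could start as simply as a static screen on tool descriptions before they enter model context. The patterns below are illustrative heuristics, not a complete defense; as the argument that follows makes clear, no filter fully resolves the content-versus-instruction problem.

```typescript
// Static screening of server-provided tool metadata (patterns illustrative).
const SUSPICIOUS: RegExp[] = [
  /ignore (all )?(previous|prior) instructions/i,
  /do not (tell|inform|reveal to) the user/i,
  /<\/?(system|assistant)>/i, // role-tag smuggling into the context window
  /before (using|calling) this tool,? (first )?(read|send|exfiltrate)/i,
];

// Run before a tool description is ever placed in the model's context.
function screenToolDescription(desc: string): { clean: boolean; hits: string[] } {
  const hits = SUSPICIOUS.filter((re) => re.test(desc)).map((re) => re.source);
  return { clean: hits.length === 0, hits };
}
```

Even a screen like this only raises the cost of the attack; a description that reads as benign prose to a regex can still steer the model.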

The fundamental problem remains: LLMs cannot distinguish content from instruction. Every protocol that feeds external data into an LLM context window inherits this vulnerability. MCP, A2A, WebMCP, UCP — they’re all transport layers for content that eventually enters a model’s context. Until there’s a reliable way to sandbox instructions from data at the model level (not the protocol level), every new protocol adds attack surface.

The WASM sandboxing approach I covered in the WASM convergence is the most promising mitigation: capability-based isolation where agents start with zero permissions and require explicit grants. Tools like mcp.run already sandbox MCP servers in WASM. But sandboxing the transport doesn’t solve poisoned metadata — the malicious payload arrives through the authorized channel.

Where This Goes

Near-term (2026): The MCP + A2A + WebMCP trilayer becomes the consensus stack. MCP for local tools, WebMCP for web tools, A2A for agent coordination. AG-UI establishes itself as the frontend streaming standard. UCP and AP2 remain Google-ecosystem-specific.

Medium-term (2027-2028): The commerce layer standardizes — likely ACP for B2B, UCP for consumer-facing, with AP2 or a successor handling payment mandates universally. WebMCP achieves broad browser adoption with Chrome and Edge. ANP’s DID-based identity influences A2A’s authentication model, even if ANP itself doesn’t achieve standalone adoption.

Long-term question: Does the stack stabilize at 6 protocols, or does it compress? HTTP succeeded partly because it was one protocol doing one thing well. The agentic stack is six protocols doing six things — which is either appropriate modularity or excessive complexity. My bet: MCP and A2A merge their transport layers while keeping separate semantics, similar to how HTTP/2 unified the wire format while keeping GET/POST/PUT semantics distinct. The commerce and UI layers remain separate because they genuinely solve different problems.

The sovereignty question: The entire AAIF-governed stack assumes centralized infrastructure: DNS for discovery, OAuth for auth, HTTP for transport. ANP and Reticulum suggest alternatives. For the 90% case (enterprise agents, consumer products), centralized infrastructure is fine. For the 10% case (censorship resistance, sovereignty, adversarial environments), the decentralized alternatives matter. The question is whether anyone builds the bridge between the two worlds before the centralized stack becomes so entrenched that alternatives become impractical.

My Take

We’re in the “protocol proliferation” phase — the equivalent of the early web when everyone was inventing their own document format. The consolidation under AAIF is the right move. But six protocols is at least two too many for the average developer to reason about, and the commerce layer (UCP + AP2 + ACP) needs to converge before it fragments irreversibly.

The most important development is WebMCP. Not because the protocol is revolutionary (it’s MCP adapted for browsers), but because it extends the agent interoperability story to the entire web. If WebMCP succeeds, every website becomes a potential tool for every agent. That’s the real “agentic web” — not agents talking to agents, but agents fluently interacting with the entire existing web infrastructure.

The thing that worries me is security. Each new protocol layer adds attack surface, and the fundamental vulnerability — LLMs treating data as instruction — has no protocol-level fix. We’re building a six-layer communication stack on top of models that can be manipulated by a carefully worded comment in a GitHub issue. The protocols are sound engineering. The security model depends on solving a problem nobody has solved yet.


Sources

  • Google Developers Blog: “Developer’s Guide to AI Agent Protocols” (March 18, 2026)
  • DEV Community: “MCP vs A2A: The Complete Guide to AI Agent Protocols in 2026”
  • Digital Applied: “AI Agent Protocol Ecosystem Map 2026”
  • CopilotKit: AG-UI documentation and AWS AgentCore integration
  • Huang et al.: “MCP Threat Modeling and Tool Poisoning Vulnerabilities” (arXiv, March 2026)
  • heyuan110: “MCP Security 2026: 30 CVEs in 60 Days”
  • Linux Foundation: “AAIF Welcomes 97 New Members” (February 2026)
  • Virtua Cloud: “MCP vs A2A vs ANP: AI Agent Protocols Explained”
  • Semrush/Scalekit: WebMCP coverage (March 2026)
  • IBM: “What Are AI Agent Protocols?”

Related Notes

  • AI Agent Protocols - The Emerging Stack — My earlier overview from March 11
  • The Agentic Protocol Crisis - Security at the Speed of Hype — Security deep-dive, March 21
  • The WASM Convergence - WebAssembly Escapes the Browser — WASM sandboxing for agents
  • The Identity Convergence - DIDs Agents and the Trust Crisis — DID-based agent identity
  • The Agentic Economy - SaaSpocalypse and the Rise of Micro-Firms — Economic implications
  • Reticulum - The Unstoppable Network Stack — Decentralized networking philosophy
  • The Cashu Convergence - Ecash Meets the Agentic Economy — Agent payments via ecash
