Back to Compendiums

A2A Protocol

A Lingua Franca for AI Agents, a look into the A2A specification

by matsjfunke

What and why A2A?

Specificaions like MCP or frameworks like the Vercel SDK have seen strong adoption because they offered ways to standardize LLM interactions. Switching your applications models or MCP Servers/tools is now a breeze, lowering the barrier to entry and encouraging more development in areas such as LLM tooling as builders can be sure integrating their tools will be less hassle for end users if all parties follow the spec.

That's where the A2A Protocol comes into play. It's google's apporoach to standardized agent to agent communication. As it provides a common language for agents built using diverse frameworks and by different vendors enabling interoperability.

So A2A aims to be the common langguage for agents allowing them to communicate / orchestrate one another.

How does it work?

A2A focuses only on the inputs and outputs of agents omitting agent internal details such as memory, tools, or resource access to align with standard client-server security paradigms, treating remote agents as standard HTTP-based enterprise applications.

Similar to MCP its build on top of HTTP and JSON-RPC (find more JSON-RPC details in my MCP Compendium).

In short you have a A2A Client (Client Agent) sending requests to a A2A Server (Server Agent) answering with a response, the servers can also send notifications for long runnning async operations and the responses can also be streamed in via SSE (Server-Sent Events) to receive real-time, incremental results or status updates.

Authentication

Like MCP A2A uses standard authorization protocols such as OAuth 2.0 to authenticate clients and servers. In practice OAuth tokens, API keys are typically passed through HTTP headers, separate from the A2A protocol messages.

Discovery / Agent Card

The most common ways for A2A Clients to discover A2A Servers is via a well-known URI like https://{agent-server-domain}/.well-known/agent-card.json which serves as a pointer to the Agent Card which is a json file defining the agent's identity, capabilities, endpoint, skills, and authentication requirements you can think of it like an Agents ID Card.

json
1// Example Agent Card
2{
3  "name": "Meal Planner Agent",
4  "description": "Plans a meals for the week based on the dietary preferences",
5  "url": "https://meal-planner-agent.com/a2a",
6  "version": "1.0.0",
7  "capabilities": {
8    "streaming": true,
9    "pushNotifications": true,
10    "stateTransitionHistory": false
11  },
12  "defaultInputModes": ["text", "text/plain"],
13  "defaultOutputModes": ["text", "text/plain"],
14  "skills": [
15    {
16      "id": "plan-meal",
17      "name": "Plan Meal",
18      "description": "Plans a meal for the week based on the dietary preferences"
19    },
20    {
21      "id": "generate-recipe",
22      "name": "Generate Recipe",
23      "description": "Generates a recipe for a meal based on available ingredients"
24    }
25  ],
26  "securitySchemes": {
27    "apiKeySecurityScheme": {
28      "in": "header",
29      "type": "apiKey",
30      "name": "X-API-Key",
31      "description": "API key authentication required"
32    }
33  },
34  "security": [
35    {
36      "apiKeySecurityScheme": []
37    }
38  ]
39}
40

Communication Types

When an agent receives a message from a client, it can respond in one of two fundamental ways

  1. Respond with a Stateless Message:
  • response used for immediate, self-contained interactions that conclude without requiring further state management (e.g. a simple question and answer).
  1. Initiate a Stateful Task:
  • response used for longer running interactions that require state management (e.g. a complex task that requires multiple steps and inputs).

Response Components

An Agent can respond with a Message or an Artifact where an Artifact is a tangible output generated by an agent during a task, composed of Parts which are the smallest unit of content within a Message or Artifact (for example, TextPart, FilePart, DataPart).

Terminology Overview

ElementDescriptionKey Purpose
A2A Clientagent that initiates requests to an A2A Server on behalf of a user or another systemInitiates requests
A2A Serveragent that exposes an A2A-compliant endpoint, processing tasks and providing responsesProcesses tasks and provides responses
Agent CardJSON file contianing all relevant information about agents capabilities, skills, endpoint, and authentication requirementsEnables clients to understand how to interact with them securely and effectively.
Taskstate of agent 2 agent conversation initiated by an agent, with a unique ID and defined lifecycle.Facilitates stateful multi-turn interactions and collaboration.
Messagesingle turn of communication between a client and an agentPrompt / Task, Answer / Result
Artifactoutput generated by an agent during a task (e.g. document, image, or structured data).Medium for concrete results of agent work, ensuring structured and referenceable outputs.
Partsmallest unit of returned result (e.g. TextPart, FilePart, DataPart)flexible medium for agent to express its results
Contextserver-generated identifier to group related tasks.Allows logical grouping of related tasks.
Extensionextending the A2A protocol with new data, requirements, RPC methods, and state machines defined in the agent cardDeclares additional functionality beyond the specification