#015: Developing Open Agent API: The Standardized Communication Solution For Agents And Agent Teams
Introduction
The Open Agent API is a comprehensive communication solution for connecting agents together in a standardized way. As AI capabilities continue to evolve, there's a growing need for standardized interfaces that allow developers to harness these capabilities across various applications. The Open Agent API addresses this need by providing a robust, extensible framework for building agent-based systems.
This post explores the problems the Open Agent API aims to solve, its design principles, and how developers can leverage its capabilities for building intelligent applications that can all work together. We'll examine how the API defines the protocol for the creation of individual agents, teams of agents, and the tools they can use to accomplish tasks.
Problems Addressed by Open Agent API
Before diving into the technical details, let's consider the challenges that the Open Agent API seeks to address:
Standardization: With multiple AI providers and agent frameworks, there's a need for standardizing how we interact with agents across the internet.
Composability: Building complex AI systems requires the ability to interconnect systems in standardized ways. Many types of agents need to connect together.
Authentication and Access Control: Multi-user systems that make agents available to the public need standard ways of authenticating users and granting access to resources.
Tool Integration: Enabling agents to use tools and services that live on the server, on the client, or on other servers.
Team Collaboration: Allowing multiple agents to work together on complex tasks.
Billing and Resource Management: Since agents almost always incur costs, there must be an efficient way to handle subscriptions.
The Open Agent API provides solutions for these challenges through a well-structured and well-defined REST API specification.
Purpose of the Specification
The Open Agent API specification is designed for services that aim to provide AI agents to a wide audience. Similar to how companies like OpenAI and Anthropic offer APIs to access their language models, this specification creates a standardized interface for agent interactions.
API Overview
The Open Agent API is divided into several core components:
Authentication: Secure user authentication via magic links, OAuth, and API keys
Agents: Creation and management of individual AI agents
Teams: Formation and operation of agent teams
Chat: Low-level chat completions (compatible with OpenAI)
Knowledge: Semantic knowledge base search
MCP (Model Context Protocol): Tool sharing mechanism between agents
Billing: Optional integration with payment systems
Let's explore each of these components in detail.
Authentication
The authentication system is the gateway to the Open Agent API. It supports multiple authentication methods to accommodate different use cases:
JWT Token
JWT (JSON Web Token) serves as the primary account-level authentication mechanism. When a user logs in through magic links or OAuth, they receive a JWT token that:
Provides full access to account management features
Allows creating, modifying, and deleting resources
Enables subscription management and billing operations
Has a limited lifespan and requires periodic renewal
Using the JWT token, the user can then generate multiple API keys for different applications. The JWT token acts as the main "login" token.
Magic Link Authentication
For simple email-based authentication:
POST /auth/link
{
  "email": "user@email.com",
  "base_url": "http://url"
}
This endpoint accepts an email address and sends a magic link for passwordless authentication.
This returns a JWT token as part of the AuthResponse message. The token is unverified, so it cannot be used until the user verifies their identity.
After clicking the link, users can verify their identity:
GET /auth/link/verify/{token}
At this point the token becomes valid and can be used for account API calls.
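The two magic-link calls above can be sketched in Python as follows. This is a minimal illustration that only builds the requests and never sends them. The host in API_BASE and the helper names are assumptions made for the example, not part of the specification.

```python
import json
from urllib.parse import quote
from urllib.request import Request

API_BASE = "https://api.example.com"  # hypothetical host, not from the spec


def build_magic_link_request(email: str, base_url: str) -> Request:
    """Build (but do not send) the POST /auth/link request."""
    body = json.dumps({"email": email, "base_url": base_url}).encode("utf-8")
    return Request(
        f"{API_BASE}/auth/link",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_verify_url(token: str) -> str:
    """Return the GET /auth/link/verify/{token} URL, escaping the token."""
    return f"{API_BASE}/auth/link/verify/{quote(token, safe='')}"


req = build_magic_link_request("user@email.com", "http://url")
print(req.get_method(), req.full_url)
print(build_verify_url("example-token"))
```

Passing the request to urllib.request.urlopen would perform the call. The token is percent-encoded so that characters such as "/" cannot alter the path.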
OAuth Authentication
For integrating with existing identity providers:
GET /auth/oauth/{provider}
This initiates an OAuth flow with providers like Google or GitHub. After successful authentication, the API returns a JWT token via:
GET /auth/oauth/{provider}/callback
For console applications, a device flow is also supported, allowing authentication without a web browser:
POST /auth/oauth/device/token
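A console client typically polls the device token endpoint until the user has approved the login in a browser. The sketch below assumes the endpoint answers in the style of the OAuth 2.0 Device Authorization Grant (RFC 8628), with "authorization_pending" and "slow_down" errors and an "access_token" field on success. The specification text here does not confirm that response shape, so treat the field names as assumptions. The HTTP call itself is injected as post_token so the loop can be exercised without a server.

```python
import time
from typing import Callable, Dict


def poll_device_token(
    post_token: Callable[[], Dict],
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the user authorizes the device, then return the token."""
    for _ in range(max_attempts):
        reply = post_token()  # one POST /auth/oauth/device/token call
        if "access_token" in reply:  # assumed success field
            return reply["access_token"]
        error = reply.get("error")
        if error == "slow_down":
            interval += 5  # RFC 8628: increase the polling interval by 5 seconds
        elif error != "authorization_pending":
            raise RuntimeError(f"device flow failed: {error}")
        sleep(interval)
    raise TimeoutError("user did not authorize in time")


# Simulated server replies, standing in for real HTTP responses.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "jwt-abc"},
])
waits = []
token = poll_device_token(lambda: next(replies), interval=1.0, sleep=waits.append)
print(token, waits)
```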
API Key Management
API keys are tokens that only have permission to access the agent API. They are meant to be created separately for each application that needs to access it.
POST /auth/keys
GET /auth/keys
GET /auth/keys/{key_id}
DELETE /auth/keys/{key_id}
These endpoints allow creating, listing, inspecting, and revoking API keys. Each key can have customized permissions and metadata.
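As a rough sketch, the four key-management routes can be wrapped in one helper that checks the method against the route before building the request. The host name is hypothetical, and sending the JWT as an "Authorization: Bearer" header is an assumption, since the text above does not say how the token is transmitted.

```python
from typing import Optional
from urllib.request import Request

API_BASE = "https://api.example.com"  # hypothetical host, not from the spec


def key_request(method: str, jwt: str, key_id: Optional[str] = None) -> Request:
    """Build a key-management request authenticated with the account JWT."""
    # Collection route /auth/keys: POST (create), GET (list).
    # Item route /auth/keys/{key_id}: GET (read one), DELETE (revoke).
    allowed = {"GET", "DELETE"} if key_id else {"POST", "GET"}
    if method not in allowed:
        raise ValueError(f"{method} is not defined for this route")
    path = f"/auth/keys/{key_id}" if key_id else "/auth/keys"
    # Assumption: the JWT travels as a standard Bearer token.
    headers = {"Authorization": f"Bearer {jwt}"}
    return Request(API_BASE + path, headers=headers, method=method)


revoke = key_request("DELETE", "jwt123", "key_1")
print(revoke.get_method(), revoke.full_url)
```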
Creating and Managing Agents
The Agents component is at the heart of the Open Agent API. It allows creating, configuring, and interacting with AI agents.
Creating an Agent
POST /agents
When creating an agent, developers specify:
A name and description for the agent
The base model (e.g., "claude-3-7-sonnet-latest")
Instructions that give the agent additional guidance on its behavior
Tools that the agent can use
Example agent creation request:
{
  "name": "Research Assistant",
  "model": "claude-3-7-sonnet-latest",
  "description": "An agent that helps with research tasks",
  "instructions": "You are a research assistant that helps users find and summarize information.",
  "tools": [
    {
      "name": "web_search",
      "instructions": "Search the web for information",
      "functions": [
        {
          "name": "search",
          "description": "Search the web for information",
          "parameters": {
            "type": "object",
            "properties": {
              "query": {
                "type": "string",
                "description": "The search query"
              }
            },
            "required": ["query"]
          }
        }
      ]
    }
  ]
}
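A request body like the one above is easy to get subtly wrong by hand, so a client might assemble it with small helpers. The sketch below is one possible way to do that. The helper names make_function and make_agent are invented for this example, and the check that every required parameter is declared is a client-side convenience, not a rule stated by the specification.

```python
import json
from typing import Any, Dict, List


def make_function(name: str, description: str,
                  properties: Dict[str, Any], required: List[str]) -> Dict[str, Any]:
    """Describe one callable function with a JSON Schema parameter object."""
    unknown = [key for key in required if key not in properties]
    if unknown:
        raise ValueError(f"required parameters not declared: {unknown}")
    return {
        "name": name,
        "description": description,
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def make_agent(name: str, model: str, description: str,
               instructions: str, tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble the body for POST /agents."""
    return {"name": name, "model": model, "description": description,
            "instructions": instructions, "tools": tools}


search = make_function(
    "search", "Search the web for information",
    {"query": {"type": "string", "description": "The search query"}}, ["query"],
)
web_search = {"name": "web_search",
              "instructions": "Search the web for information",
              "functions": [search]}
agent = make_agent(
    "Research Assistant", "claude-3-7-sonnet-latest",
    "An agent that helps with research tasks",
    "You are a research assistant that helps users find and summarize information.",
    [web_search],
)
print(json.dumps(agent, indent=2))
```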
Managing Agents
List all your agents:
GET /agents
Get details for a specific agent:
GET /agents/{agentId}
Update an agent's configuration:
PUT /agents/{agentId}
Delete an agent:
DELETE /agents/{agentId}
Interacting with Agents
The API provides a powerful interface for interacting with agents:
POST /agents/{agentId}/run
This endpoint accepts a series of messages and optional tools, and returns the agent's response. For example:
{
  "messages": [
    {"role": "user", "content": "Can you research the latest developments in quantum computing?"}
  ],
  "tools": [
    {
      "name": "web_search",
      "functions": [
        {
          "name": "search",
          "description": "Search the web for information",
          "parameters": {
            "type": "object",
            "properties": {
              "query": {
                "type": "string",
                "description": "The search query"
              }
            },
            "required": ["query"]
          }
        }
      ]
    }
  ]
}
The API supports both synchronous responses and streaming via Server-Sent Events (SSE), allowing real-time updates as the agent thinks and uses tools.
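To consume the streaming variant, a client has to split the Server-Sent Events stream into individual events. The parser below is a minimal sketch that follows the standard SSE framing ("event:" and "data:" fields, a blank line ends an event, lines starting with ":" are comments). The event names "delta" and "done" and the JSON payloads in the sample are made up for illustration, since the text above does not list the event types the API emits.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per complete SSE event found in an iterable of lines."""
    event = "message"  # default event type defined by the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # strip the single optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream contents; real event names may differ.
sample = [
    "event: delta\n", 'data: {"text": "Quantum"}\n', "\n",
    ": keep-alive\n",
    "event: done\n", "data: {}\n", "\n",
]
events = list(parse_sse(sample))
for item in events:
    print(item["event"], item["data"])
```

In a real client the lines would come from the HTTP response body, decoded as UTF-8, rather than from a list.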
Team Collaboration
This collection of interfaces gives users the ability to create teams of agents that can collaborate on tasks.
Creating a Team
POST /teams
When creating a team, developers specify:
A name for the team
The member agents and their roles
Tools available to the team manager
Example team creation:
{
  "name": "Research Team",
  "members": [
    {
      "agent_id": "agent1",
      "role": "Researcher"
    },
    {
      "agent_id": "agent2",
      "role": "Writer"
    }
  ],
  "tools": [
    {
      "name": "knowledge_base",
      "instructions": "Search the knowledge base",
      "functions": [
        {
          "name": "search",
          "description": "Search the knowledge base",
          "parameters": {
            "type": "object",
            "properties": {
              "query": {
                "type": "string",
                "description": "The search query"
              }
            },
            "required": ["query"]
          }
        }
      ]
    }
  ]
}
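Before sending a team definition like the one above, a client may want to sanity-check the member list. The following sketch shows one way to do so. The make_team helper is hypothetical, and rejecting duplicate agent IDs is a convention chosen for this example rather than a requirement stated in the specification.

```python
from typing import Any, Dict, List, Optional


def make_team(name: str, members: List[Dict[str, str]],
              tools: Optional[List[Dict[str, Any]]] = None) -> Dict[str, Any]:
    """Assemble the body for POST /teams after basic member checks."""
    seen = set()
    for member in members:
        if not member.get("agent_id") or not member.get("role"):
            raise ValueError("each member needs an agent_id and a role")
        if member["agent_id"] in seen:
            raise ValueError(f"duplicate member: {member['agent_id']}")
        seen.add(member["agent_id"])
    return {"name": name, "members": list(members), "tools": list(tools or [])}


team = make_team(
    "Research Team",
    [
        {"agent_id": "agent1", "role": "Researcher"},
        {"agent_id": "agent2", "role": "Writer"},
    ],
)
print(team["name"], [m["role"] for m in team["members"]])
```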
Managing Teams
List all your teams:
GET /teams
Get details for a specific team:
GET /teams/{teamId}
Running Team Interactions
POST /teams/{teamId}/run
This endpoint accepts messages and coordinates responses among team members. For complex tasks, the team leader agent routes subtasks to appropriate member agents based on their roles and capabilities.
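The routing step can be pictured with a deliberately simplified model. In the sketch below, each subtask already carries the role it needs and is handed to the member with that role. This is a toy stand-in: in a real deployment the decision would be made by the leading agent itself, and nothing here reflects how a server implements it. The route function and the subtask format are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def route(subtasks: List[Tuple[str, str]],
          members: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Map (role, task) pairs to (agent_id, task) pairs using member roles."""
    by_role = {member["role"]: member["agent_id"] for member in members}
    plan = []
    for role, task in subtasks:
        if role not in by_role:
            raise LookupError(f"no team member has the role {role!r}")
        plan.append((by_role[role], task))
    return plan


members = [
    {"agent_id": "agent1", "role": "Researcher"},
    {"agent_id": "agent2", "role": "Writer"},
]
plan = route(
    [("Researcher", "Collect recent sources"), ("Writer", "Draft the summary")],
    members,
)
for agent_id, task in plan:
    print(agent_id, "->", task)
```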
Model Context Protocol (MCP)
The MCP component enables tools to be shared between agents using a standardized protocol. This allows agents to expose capabilities as tools that other agents can use. It also allows any tooling provided by the agent service to be easily integrated with desktop agents such as Cursor.
GET /mcp/
POST /mcp/messages
GET /mcp/sse
MCP provides a way for agents to communicate with external tools and interface with third party services.
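MCP messages are JSON-RPC 2.0 objects, so a body for POST /mcp/messages can be sketched as below. The "tools/list" and "tools/call" methods come from the Model Context Protocol itself. Whether this endpoint needs extra parameters, such as a session identifier obtained from GET /mcp/sse, is not stated above, so the example only builds the message bodies. The tool name "search" is borrowed from the earlier examples.

```python
import itertools
import json
from typing import Any, Dict, Optional

_ids = itertools.count(1)  # JSON-RPC request ids must be unique per session


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request object with a fresh id."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


list_tools = jsonrpc_request("tools/list")
call_search = jsonrpc_request(
    "tools/call",
    {"name": "search", "arguments": {"query": "quantum computing"}},
)
print(json.dumps(list_tools))
print(json.dumps(call_search))
```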
Tools Integration
The Tools component allows registering, discovering, and using tools with agents.
Discovering Tools
GET /tools
This endpoint returns a list of available toolkits that can be used by agents.
Getting Tool Details
GET /tools/{tool_id}
Returns detailed information about a specific tool, including its parameters and usage instructions.
Registering Tools
POST /tools
Creates a new tool definition that can be used by agents. For example, a user can expose an external MCP service to their agent or team and configure it in one place, instead of passing all tool parameters with every request to the agent.
Knowledge Base Integration
The Knowledge component provides semantic search capabilities for agents:
POST /knowledge/search
This allows agents to query a knowledge base for relevant information, enhancing their ability to provide accurate and contextual responses.
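For readers unfamiliar with semantic search, the idea is to rank documents by how close their embedding vectors are to the query's vector. The toy example below uses hand-written three-dimensional vectors and cosine similarity. It illustrates the concept only: the specification text above does not say which embedding model or ranking method a server uses, and all names and vectors here are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: Sequence[float], index: Dict[str, Sequence[float]],
           top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k document ids ranked by similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hand-made vectors standing in for real embeddings.
index = {
    "qubits": [0.9, 0.1, 0.0],
    "billing": [0.0, 0.2, 0.9],
    "entanglement": [0.8, 0.3, 0.1],
}
results = search([1.0, 0.0, 0.0], index)
print([doc_id for doc_id, _ in results])
```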
Subscription and Billing
For commercial implementations, the API includes subscription management:
GET /subscription/plans
POST /subscription/subscribe
GET /subscription/current
POST /subscription/cancel
These endpoints allow users to manage their subscriptions, including viewing available plans, subscribing to a plan, and canceling subscriptions.
API Key Usage Analytics
For monitoring and billing purposes, the API provides detailed usage analytics:
GET /auth/keys/{key_id}/usage
This returns information about API key usage, including request counts, token usage, and billing information.
Practical Applications
The Open Agent API enables a wide range of applications:
Enterprise Assistant Systems: Create specialized agents for different departments that can collaborate on complex tasks.
Research Tools: Build agents that can search, summarize, and analyze information from various sources.
Customer Service: Deploy teams of agents to handle customer inquiries, with specialists for different types of questions.
Development Assistants: Create agents that can write code, debug issues, and collaborate on software development.
Educational Platforms: Build tutoring systems with agents specialized in different subjects.
Closing thoughts
The Open Agent API provides a comprehensive framework for building agent-based systems. By standardizing the interfaces for authentication, agent creation, team collaboration, and tool integration, it enables developers to build powerful AI applications that can solve complex problems.
Let's build this interface together and define exactly what we need and how we want it to work to facilitate simple and easy collaboration between agents. Whether you're interested in implementing new features, improving documentation, or sharing your use cases, your input is valuable.
This specification is going to evolve. You have the chance to influence the future! Create a discussion on the GitHub project page: