Skip to main content

What is an Agent?

An Agent wraps an LLM with a system prompt, temperature, a set of tools, and optional knowledge bases. When you send a message, the agent enters a ReAct (Reason + Act) loop: the LLM decides whether it needs to call a tool, executes the tool if needed, observes the result, and either calls another tool or generates a final response. This loop continues for up to 25 steps (configurable per run). On the final step, tools are withheld so the LLM is forced to produce a text response.

The ReAct Loop

ReAct loop diagram: User Message → LLM (Reason) → Tool Call? If yes → Execute Tool → Tool Result → back to LLM. If no → Generate Response → Stream to User.
Each iteration of the loop is a step. Within a step, the agent can call multiple tools — concurrency-safe tools (like knowledge base search) run in parallel via a thread pool, while other tools execute sequentially. If the agent makes the same tool call with identical arguments three times in a row, the platform detects a doom loop and terminates the run. During streaming, each step is visible as SSE events: step_started, tool_call, tool_result, step_completed, and chunk events for the final response.

Context Management

The agent automatically manages its context window. Before each LLM call, it estimates the token count and, if nearing the model’s context limit, triggers proactive compaction: first pruning old tool results (replacing them with placeholders while keeping the last 3 user turns), then if still over the limit, summarizing the older conversation using a lightweight LLM (gpt-4.1-nano). If the LLM returns a prompt_too_long error, the agent retries with compaction. If the LLM output is truncated, the agent injects a “continue where you left off” message and retries up to 3 times.

Tool Types

Agents can use three categories of tools: builtin tools provided by the platform, custom tools that call your HTTP endpoints, and MCP server tools discovered at runtime.

Builtin Tools

The platform provides eight builtin tools. Assign them to an agent by name via POST /api/agents/{id}/tools. The database tools include schema-level access control — when assigning them, you configure which schemas and tables the agent can access.
ToolDescriptionConstraints
database_queryExecute read-only SQL SELECT queries against the project databaseMust start with SELECT. Multi-statement blocked (no semicolons). Results capped at 50,000 chars. Rollback after read.
database_writeExecute INSERT, UPDATE, and DELETE operationsUPDATE requires a WHERE clause (no mass updates). DELETE requires a WHERE clause (no mass deletes). All identifiers validated against injection.
http_requestMake HTTP requests to external APIsResponse capped at 10,000 chars. 30-second timeout. No SSRF validation — see warning below.
code_executeRun Python or JavaScript in a sandboxed environmentDelegated to external sandbox service. Default 30-second timeout, configurable per call.
storage_readList and download files from project storage bucketsBinary files return a signed URL instead of content. Supports list (with prefix/limit/offset) and download operations.
storage_writeUpload text content to project storage bucketsUTF-8 text content only. Returns path and file size.
web_searchSearch the web using Exa.ai with neural/keyword modes, domain filters, date ranges, and content depth controlResults capped at 20K-50K chars by content mode. 1-10 results. Requires EXA_API_KEY.
web_scrapeExtract content from web pages as clean markdown, with optional AI vision image analysisResults capped at 200K chars. Supports markdown/HTML/links formats. include_images uses gpt-4.1-mini vision. Direct image URLs bypass Firecrawl. Requires FIRECRAWL_API_KEY.
http_request has no SSRF protection. Unlike Custom Tools (below), the builtin http_request tool calls the URL the agent supplies directly, with no allow-list and no validation against internal addresses. An agent given http_request can reach RFC1918 ranges, localhost, and cloud metadata endpoints (e.g. 169.254.169.254). Only enable http_request for agents whose model + prompt you trust, or use a Custom Tool with a fixed endpoint instead — Custom Tools enforce SSRF via validate_url on the endpoint URL.

Tool prerequisites

Some builtin tools need an API key or sandbox configured before they’ll work. If the prerequisite is missing, the tool returns a JSON error message to the agent (e.g. "Exa API key not configured. Set it in Settings > Tools.") rather than throwing.
ToolWhat it needsWhere to set it
database_query / database_writeNo setup. Uses the project’s superuser connection automatically.
http_requestNo setup.
code_executeA code sandbox service (CODE_SANDBOX_URL, optional CODE_SANDBOX_API_KEY)Platform-level env var — set by the platform team for managed cloud; self-host operators configure their own sandbox. Without it the tool returns "Code sandbox is not configured".
storage_read / storage_writeNo setup. Uses the project’s Storage service.
web_searchEXA_API_KEY settingStudio → Settings → Tools, or PUT /api/settings/EXA_API_KEY
web_scrapeFIRECRAWL_API_KEY setting (and optional VISION_MODEL if include_images: true)Studio → Settings → Tools, or PUT /api/settings/FIRECRAWL_API_KEY. VISION_MODEL defaults to gpt-4.1-mini.
If you’re enabling these via API rather than the Studio, set them with PUT /api/settings/{key} — see Settings API.

Custom Tools

Custom tools call your own HTTP endpoints. You define the tool with a name, description, JSON Schema for inputs, an endpoint URL, HTTP method, and optional headers. When the agent decides to use the tool, the platform POSTs the tool arguments as JSON to your endpoint and returns the response to the agent. Responses are capped at 10,000 characters. Timeout is 30 seconds. SSRF validation prevents the agent from calling internal network addresses.

MCP Servers

MCP (Model Context Protocol) servers let you connect agents to external tool providers. Add an MCP server URL to your agent, and at the start of each run the platform sends a tools/list JSON-RPC request to discover available tools. Discovered tools are namespaced (mcp__{server_name}__{tool_name}) and added to the agent’s tool set alongside builtin and custom tools. Tool calls are executed via tools/call JSON-RPC requests. Timeout is 30 seconds for both discovery and per tool call. Transport: the platform’s MCP client speaks JSON-RPC over HTTP POST to the server URL. (The DB schema accepts a transport field with default http; other values like sse are stored but not consumed by the current client.) Use a server that exposes JSON-RPC over HTTP — most MCP server frameworks do this by default. Headers: any headers you configure on the server (typically for auth) are sent on both tools/list and tools/call requests. There’s no per-call header injection. enabled flag: if you set enabled: false on an MCP server, discovery skips it entirely — its tools aren’t added and no tools/call requests fire. Use this to temporarily disconnect a server without removing it. Failure handling: if tools/list fails (network error, non-200, JSON-RPC error), the agent run continues without that server’s tools (fail-open). The failure is logged but not surfaced as a run error. Tool annotations: the platform reads annotations.readOnlyHint, destructiveHint, and openWorldHint from each discovered tool and uses them for concurrency / safety planning. A readOnlyHint: true tool may be called in parallel with other read-only tools.

Knowledge Base Search Tool

When you link a knowledge base to an agent, the platform automatically creates a knowledge_search tool. The agent can call this tool with a natural language query to search the KB using whatever retrieval strategy is configured (vector, hybrid, full-text, or tree search). If multiple KBs are linked, a single knowledge_search tool is created with a knowledge_base_names filter parameter so the agent can target specific KBs. The search tool is concurrency-safe and read-only, so it runs in parallel with other safe tools.

Sessions & Memory

Every agent conversation happens within a session. A session stores a sequence of runs, each containing the user input, assistant response, tool calls, tool results, and usage statistics. When you pass a session_id with a new message, the platform loads all completed runs from that session and reconstructs the full message history for the LLM. Sessions persist until explicitly deleted, enabling long-running multi-turn conversations. New sessions are created automatically if no session_id is provided — the start SSE event returns the generated session_id for future use.

Hooks & Middleware

Hooks let you intercept agent execution at key points with three types of middleware: HTTP webhooks, rule-based policies, and human approval gates. Each hook is configured with an event (when it fires), a type (what it does), an optional matcher (which tool to target), and a config object.
EventWhen It FiresCan Block?Can Modify?
OnRunStartBefore the agent begins processingYes (blocks entire run)No
PreToolUseBefore each tool executionYes (returns error to LLM)Yes (can replace tool arguments)
PostToolUseAfter each tool executionNoYes (can replace tool result)
PreResponseAfter the ReAct loop, before returning contentYes (replaces response with blocked message)Yes (can modify final content)
OnRunCompleteAfter successful completion (fire-and-forget)NoNo
Hook TypeBehavior
httpPOSTs event data to a webhook URL. Response can allow, deny, or modify. Fail-open on error (non-200 or timeout). Default 5-second timeout.
ruleEvaluates conditions against tool arguments. Supports operators: CONTAINS, STARTS_WITH, MATCHES (regex), IN. First matching deny rule wins.
approvalPauses execution and emits an approval_requested SSE event. Blocks until the approve endpoint is called or 300-second timeout expires.

Human-in-the-Loop Approval

The approval flow is implemented as a PreToolUse hook of type approval. When the agent tries to call a matching tool, execution pauses: the SSE stream emits an approval_requested event with the tool name and arguments, then the thread blocks waiting for a decision. Your application calls POST /api/agents/runs/{run_id}/approve with {approved: true} or {approved: false}. On approval, the tool executes normally and the stream resumes. On rejection, the tool call is skipped and the agent receives a denial message so it can choose an alternative approach. If no decision arrives within 300 seconds (configurable), the tool call is blocked with a timeout error.
Scope approval to specific toolsSet the hook’s matcher field to a specific tool name (e.g., “database_write”) to only require approval for that tool. Omit matcher to require approval for all tool calls.

Streaming

The streaming endpoint (POST /api/agents/{id}/run/stream) returns Server-Sent Events for the entire run lifecycle. When tools are present, the agent runs the full ReAct loop and events are emitted as they occur. When no tools are assigned, the agent streams LLM tokens directly as chunk events. If the client disconnects mid-stream, the platform detects the GeneratorExit, sets the abort signal to stop the agent, and spawns a background thread to persist the partial run.

Sync (Non-Streaming) Run

The sync endpoint (POST /api/agents/{id}/run) runs the agent without streaming and returns the full response as JSON. In this mode, no tools are loaded and no ReAct loop runs — the agent makes a single LLM call with the conversation context. This is useful for simple question-answering where tool use isn’t needed.

Limits & Safeguards

ConstraintDefaultNotes
Max ReAct steps25Configurable per run. On the final step, tools are withheld to force a text response.
Doom loop detection3 identical callsIf the last 3 tool calls have the same name and arguments, the run fails.
Output truncation recovery3 retriesIf the LLM output is truncated, the agent retries with a continuation prompt.
Context compaction failures3 attemptsAfter 3 failed compaction attempts, the agent stops trying to compact.
LLM retries3Automatic retry on transient LLM errors.
Custom/MCP tool timeout30 secondsPer-tool execution timeout.
Tool result truncation50,000 charsResults beyond this are truncated. KB search and delegation have no limit.
Approval timeout300 secondsConfigurable via hook config. Blocks tool execution until decision arrives.
Max orchestration depth3Prevents infinite recursive delegation between agents.

Next Steps

Build an Agent

Create an agent, assign tools, and start chatting.

Advanced Agent Config

Add MCP servers, hooks, and approval flows.

Multi-Agent Orchestration

Coordinate multiple agents for complex tasks.

Agents API Reference

Full endpoint documentation.