Give your AI agent document editing – in under 60 seconds

AgentDoc is a public Model Context Protocol (MCP) server. Any LLM agent that speaks MCP – Gemini, Claude, GPT, or your own – can connect, authenticate, and use a complete typed document-editing API: read, write, format, navigate, export PDFs. No SDK to vendor, no schema to maintain on your side, no human in the loop.

This page is the canonical onboarding path for agents and the people who run them. If you (or your model) want a working document editor available as a tool, this is everything you need.

MCP Endpoint

https://agent-doc-edit.com/mcp/sse

Standard Model Context Protocol over Server-Sent Events. JWT bearer-token auth (see Quick start below). Free per-account token budget; no credit card required.

Quick start (one HTTP call)

Register an isolated agent account and receive its API key in a single request. No browser, no email, no human in the loop. Each registered agent is its own user with its own document scope – different agents never see each other's documents.

curl -X POST https://agent-doc-edit.com/api/agents/register \
  -H "Content-Type: application/json" \
  -d '{"name": "my-research-agent"}'

# Response
# {
#   "user_id":        "...",
#   "username":       "agent_AbCdEfGh",
#   "name":           "my-research-agent",
#   "api_key":        "ak_...",          <-- shown ONLY here, store it
#   "api_key_prefix": "ak_AbCdEfGh",
#   "created_at":     "2026-04-25T..."
# }

That's it. Use the api_key as a bearer token against /mcp/sse and the agent has about 35 typed tools to read, write, format, paginate, and export documents – fully scoped to its own account.
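
The same registration call can be made from Python with only the standard library. This is a minimal sketch: the URL and the "name" / "api_key" fields come from the curl example above, while the helper names (build_register_request, extract_api_key, register_agent) are illustrative and not part of any AgentDoc SDK.

```python
import json
import urllib.request

REGISTER_URL = "https://agent-doc-edit.com/api/agents/register"


def build_register_request(name: str) -> urllib.request.Request:
    """Build the POST request that registers a new agent account."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        REGISTER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_api_key(response_body: str) -> str:
    """Pull the one-time api_key out of the registration response."""
    payload = json.loads(response_body)
    key = payload.get("api_key")
    if not key:
        raise ValueError("registration response did not contain an api_key")
    return key


def register_agent(name: str) -> str:
    """Perform the network call and return the key (store it immediately)."""
    with urllib.request.urlopen(build_register_request(name), timeout=30) as resp:
        return extract_api_key(resp.read().decode("utf-8"))
```

Because the key is shown only once, a caller would typically write the return value of register_agent("my-research-agent") to a secret store before doing anything else.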

Two ways to authenticate

Option A – Agent self-registers (recommended for autonomous workflows)

Use POST /api/agents/register as shown above. The agent gets its own user account and its own document namespace. Different agents never collide. Rate limit: 5 registrations per hour per IP. This is the right path for letter pipelines, batch processors, scheduled jobs, multi-agent workflows.
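
For pipelines that spin up many agents from one machine, a client-side guard can keep you under the registration limit before the server has to reject anything. The sketch below is a plain sliding-window counter sized to the documented 5 registrations per hour. The class name and the injectable clock are illustrative choices, and how the server responds when the limit is exceeded is not covered here.

```python
import time
from collections import deque


class RegistrationBudget:
    """Sliding-window guard for the documented 5 registrations/hour/IP limit."""

    def __init__(self, max_calls=5, window_seconds=3600.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the guard can be tested
        self._stamps = deque()       # timestamps of calls inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Return True and record the call if a registration is allowed now."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) >= self.max_calls:
            return False
        self._stamps.append(now)
        return True

    def seconds_until_free(self) -> float:
        """How long to wait before the next registration would be allowed."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])
```

Call try_acquire() before each POST /api/agents/register and sleep for seconds_until_free() when it returns False.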

Option B – Use your own human account (for "give my own assistant document editing")

Open /app, sign in, then go to sidebar → "API Keys for Agents" → "+ Create New Key". The key is shown once. Use it as a bearer token. The agent shares your account, your documents, and your active-document state. Useful when you want a co-pilot agent to operate alongside you on a single corpus.

What gets billed (and what doesn't)

Agents bring their own LLM. You pay your model provider for reasoning tokens. We do not see those tokens, do not charge for them, and do not throttle on them. Our service hosts the MCP server, the document storage, and the rendering pipeline. The token_limit column on agent accounts is set to 0 as a defensive safeguard: if any future code path ever attempted to run our internal Gemini agent on agent-account auth, the request would be refused – agents stay strictly on the MCP-tool path.

Important: this is autonomous, not collaborative

This path is built for autonomous agent workflows – your agent reasons with its own LLM, calls our MCP tools directly, edits documents on its own account, and exports a result. The same battle-tested tool surface that our voice and text agents use in production powers your agent – but your agent never talks to ours. There is no AI-to-AI hop, no internal LLM call on your behalf, no shared session with our in-browser editor.

If you want a human and our voice/text agent to co-edit live, use /app directly – that's a different path. If you want your own agent to operate the editor without a human, the MCP endpoint described here is the right surface.

Connect from any MCP client

Python (mcp client / Anthropic / Google ADK)

import asyncio
import json

from mcp import ClientSession
from mcp.client.sse import sse_client

AGENTDOC_TOKEN = "ak_..."  # from POST /api/agents/register

async def edit_document():
    headers = {"Authorization": f"Bearer {AGENTDOC_TOKEN}"}
    async with sse_client("https://agent-doc-edit.com/mcp/sse",
                          headers=headers) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Workflow T (~35 tools) is applied automatically. Token is
            # injected from the Authorization header – do NOT pass `token`
            # in tool arguments.
            tools = await session.list_tools()

            # Create a document
            res = await session.call_tool("create_document",
                                          {"title": "My Report"})
            payload = json.loads(res.content[0].text)
            doc_id = payload["doc_id"]   # structured field, no regex

            # Insert content
            await session.call_tool("insert_string", {
                "doc_id": doc_id,
                "text":   "# Hello\n\nFirst paragraph.",
                "index":  0,
            })

            # Trigger PDF; response includes a self-describing fetch URL
            res = await session.call_tool("trigger_pdf_download",
                                          {"doc_id": doc_id})
            pdf_meta = json.loads(res.content[0].text)
            print(pdf_meta["pdf_url"])  # → "/api/doc//pdf"

if __name__ == "__main__":
    asyncio.run(edit_document())
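
The pdf_url in the response above is a relative path, so it has to be resolved against the server origin before it can be fetched. The sketch below does that with urljoin. It assumes the PDF is retrieved with a plain GET carrying the same bearer token, which this page does not state explicitly, and the helper names are illustrative.

```python
import urllib.request
from urllib.parse import urljoin

MCP_URL = "https://agent-doc-edit.com/mcp/sse"


def resolve_pdf_url(pdf_url: str, base: str = MCP_URL) -> str:
    """Turn the relative pdf_url from the tool response into an absolute URL."""
    return urljoin(base, pdf_url)


def build_pdf_request(pdf_url: str, token: str) -> urllib.request.Request:
    """GET request for the PDF; reusing the bearer token is an assumption."""
    return urllib.request.Request(
        resolve_pdf_url(pdf_url),
        headers={"Authorization": f"Bearer {token}"},
    )


def download_pdf(pdf_url: str, token: str, dest: str) -> int:
    """Fetch the PDF, write it to dest, and return the number of bytes saved."""
    with urllib.request.urlopen(build_pdf_request(pdf_url, token), timeout=60) as resp:
        data = resp.read()
    with open(dest, "wb") as fh:
        fh.write(data)
    return len(data)
```

Inside the example above this would be called as download_pdf(pdf_meta["pdf_url"], AGENTDOC_TOKEN, "report.pdf") once the trigger_pdf_download call has returned.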

TypeScript (official MCP SDK)

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

const TOKEN = "ak_...";  // from POST /api/agents/register

// The API key travels as a bearer token in the Authorization header
const transport = new SSEClientTransport(
  new URL("https://agent-doc-edit.com/mcp/sse"),
  { requestInit: { headers: { Authorization: `Bearer ${TOKEN}` } } }
);
const client = new Client({ name: "my-agent", version: "1.0" }, { capabilities: {} });
await client.connect(transport);

const tools = await client.listTools();

// No doc_id here: the call is routed to the session's active document
const result = await client.callTool({
  name: "insert_string",
  arguments: { text: "Hello from my agent.", index: 0 }
});

curl (raw exploration)

curl -N -H "Authorization: Bearer $TOKEN" \
     -H "Accept: text/event-stream" \
     https://agent-doc-edit.com/mcp/sse
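
What comes back from that curl command is a Server-Sent Events stream: "event:" and "data:" lines grouped into events by blank lines. The sketch below is a small parser for that wire format, handy when inspecting the raw stream by hand. In the MCP SSE transport the first event is normally an "endpoint" event carrying the URL to POST messages to. The path in the usage note is a placeholder, not this server's actual value.

```python
from typing import Iterable, Iterator, NamedTuple


class SSEEvent(NamedTuple):
    event: str   # event type; "message" when the stream omits it
    data: str    # data lines joined with newlines


def parse_sse(lines: Iterable[str]) -> Iterator[SSEEvent]:
    """Group raw SSE lines into events; a blank line dispatches the event."""
    event_type = "message"
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # End of one event: emit it only if it carried data.
            if data_parts:
                yield SSEEvent(event_type, "\n".join(data_parts))
            event_type, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_type = value
        elif field == "data":
            data_parts.append(value)
```

Feeding it the lines "event: endpoint", "data: /some/path" and an empty line yields one SSEEvent with event "endpoint" and data "/some/path".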

Tool catalogue – Workflow T is applied automatically

External agents (requests authenticated with an ak_* API key) are automatically restricted to the Workflow T tool surface – the Pareto-optimal production default our own voice and text agents use. You don't apply this filter; the MCP server enforces it server-side on both tools/list and tools/call. You get the curated ~35-tool subset (typed primitives, macros, and observation feedback on index shifts), without the scratchpad and FSM-intent tools that don't belong in T and without the atomic "exploded" variants used only by our tool-bloat benchmark.

Each tool has a JSON schema for its arguments and returns a structured response with explicit success/error markers. Index-shifting operations include observation feedback ("observation": "INDEX SHIFT – re-read before next mutation") so the agent stays grounded between turns.
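
An agent loop can act on that observation field mechanically. The sketch below decides whether cached indices should be discarded after a tool call. The "observation" key and the "INDEX SHIFT" marker are taken from the example string above. The rest of the response shape is not specified on this page, so the function treats anything it cannot parse as a reason to re-read. The function name is illustrative.

```python
import json


def needs_reread(tool_response_text: str) -> bool:
    """Return True when cached indices should be dropped before the next edit."""
    try:
        payload = json.loads(tool_response_text)
    except json.JSONDecodeError:
        return True  # unknown shape: re-reading is the safe default
    if not isinstance(payload, dict):
        return True
    # Marker text taken from the documented observation example.
    return "INDEX SHIFT" in str(payload.get("observation", ""))
```

When it returns True, call get_document_context or find again before issuing another index-based mutation.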

get_document_context: Returns raw Markdown + rendered HTML in one call. Primary read tool.
find: Regex-powered search. Returns all matches with [start, end) indices and 150-char context.
insert_string / delete_substring: Index-based text mutations. Header/footer variants for isolated areas.
replace_substring: Atomic delete + insert. Avoids index drift between two separate calls.
format_text: 15 colors, 12 fonts, 7 sizes, bold/italic/underline/strike/sub/sup, alignment, indentation, links.
format_table: Border style/color/width, backgrounds, alignment, column widths, padding, striping.
macro_replace_all / macro_format_all_matches: Atomic bulk operations. Process matches in reverse index order to prevent drift.
insert_page_break / delete_page_break / find_page_breaks: Page break primitives – invisible DOM markers, not character substrings.
generate_table_of_contents: Auto-injects a hyperlinked TOC at a given index based on existing heading structure.
create_document / rename_document / set_active_document / list_documents: Document management. The agent's session is auto-routed to the active document.
navigate_to_page / set_page_layout: Page navigation, margin and page-size adjustment.
trigger_pdf_download: Emits a PDF-export event the user (or downstream agent) can collect.
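
Why the macros process matches in reverse index order is easiest to see on a plain string. The sketch below is a local illustration only, not the server's implementation: it collects [start, end) spans the way find reports them, then applies replacements back to front so earlier spans stay valid. A deliberately naive front-to-back variant is included for contrast.

```python
import re


def find_matches(text: str, pattern: str):
    """Return [start, end) spans for every regex match, in document order."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]


def replace_all_reverse(text: str, pattern: str, replacement: str) -> str:
    """Apply replacements from the last match to the first.

    Editing the tail first never moves the spans that are still pending,
    so indices computed once up front remain correct.
    """
    for start, end in reversed(find_matches(text, pattern)):
        text = text[:start] + replacement + text[end:]
    return text


def replace_all_forward_naive(text: str, pattern: str, replacement: str) -> str:
    """Deliberately wrong: reuses stale spans after earlier edits shift the text."""
    for start, end in find_matches(text, pattern):
        text = text[:start] + replacement + text[end:]
    return text
```

With replacement text longer or shorter than the match, the forward variant corrupts the document after the first edit. That is the drift replace_substring and the macro tools are designed to avoid.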

See the full developer documentation →

Seed a document from an existing DOCX

The MCP tool surface lets your agent build documents from scratch. For workflows that start from a pre-authored Word file – corporate letterhead templates, contract boilerplate, an incoming draft to revise – an additional one-shot HTTP endpoint accepts .docx uploads, creates a fresh Document on the agent's account, and switches it to active so the next MCP call lands on the imported content:

# Upload a .docx; response is the new {id, title, ...}
# (the form field name "file" and the filename letter.docx are placeholders; check the developer docs)
curl -X POST https://agent-doc-edit.com/api/docs/import/docx \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@letter.docx" \
  -F "title=Q3 Customer Letter"

Page breaks, hyperlinks, headers / footers, fonts, colours and line spacing all survive the import. Full technical write-up: DOCX Import – Round-Tripping Word Documents.
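
For agents that do not shell out to curl, the same upload can be sketched with the Python standard library by building the multipart body by hand. The endpoint URL and the "title" field come from the example above. The file field name ("file" by default here) is an assumption and should be checked against the developer documentation. The helper names are illustrative.

```python
import os
import uuid
import urllib.request

IMPORT_URL = "https://agent-doc-edit.com/api/docs/import/docx"
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def encode_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
         f"Content-Type: {DOCX_MIME}\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_import_request(api_key, docx_path, title, file_field="file"):
    """Build the upload request; file_field="file" is an assumed field name."""
    with open(docx_path, "rb") as fh:
        file_bytes = fh.read()
    body, content_type = encode_multipart(
        {"title": title}, file_field, os.path.basename(docx_path), file_bytes
    )
    return urllib.request.Request(
        IMPORT_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

Pass the result to urllib.request.urlopen to send it. The imported document becomes the active one, so the next MCP tool call operates on it.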

Try it now

Open the editor in one tab, run your agent in another. The agent's edits appear in real time on the same screen – useful for debugging, demos, or running a human + agent collaboration.
