agentstep

One API. Any Agent.
Anywhere.

Anthropic built the Managed Agents API for Claude. We made it work with everything else.

5 Harnesses · 11 Sandboxes · 2 min Quickstart · Apache 2.0 License

Why we built this.

Anthropic launched Claude Managed Agents. Define an agent, give it an environment, start a session. The agent loop, sandboxed containers, tool execution, context management, and event streaming are all handled for you.

We read the HN thread the night it dropped. The same concern kept surfacing in different words: the spec is great, the engineering is great — but nobody wanted to bet their production stack on a single vendor's roadmap, pricing page, and uptime SLA.

So we built the version that doesn't ask you to.

Every objection, addressed.

The HN discussion on Managed Agents raised real concerns. We built AgentStep to answer every one.

✓ “Vendor lock-in”
5 harnesses. Switch with one config change.

✓ “Can’t mix models”
Claude to plan, Codex to execute, Gemini for docs. Same API.

✓ “Code leaves my machine”
Self-hosted. Your infra, your data, your kill switch.

✓ “Runaway agent bills”
BYO subscription. $0/session. No surprise invoices.

✓ “Framework churn”
Open source. Fork it. Extend it. Apache 2.0.

✓ “Misaligned incentives”
We don’t sell tokens. You bring your own keys.

We don't compete with the harnesses. We unify them.

Claude Code, Codex CLI, Gemini CLI, OpenCode, and Factory are all coding agent harnesses. Each one has its own output format, session model, and way of handling tool calls.

AgentStep wraps all of them under a single Managed Agents API. One surface to orchestrate, monitor, and manage agents — regardless of which harness does the work.

Harness	CLI	Models
Claude	claude -p	Sonnet 4.6, Opus 4.6, Haiku 4.5
Codex	codex exec	GPT-5.4, GPT-5.4 Mini
Gemini	gemini -p	Gemini 3.1 Pro, 3, 2.5 Pro/Flash
OpenCode	opencode run	Any model via provider prefix
Factory	droid exec	Multi-model (Claude, GPT, Gemini, GLM, Kimi)

It's pluggable. Adding a new harness means implementing one TypeScript interface, about 200 lines. We'll keep adding more.
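To give a feel for what a pluggable harness could look like, here is a minimal sketch. It is not AgentStep's actual interface: `Harness`, `HarnessEvent`, `cliHarness`, and `registry` are hypothetical names, and the line parser is a placeholder that treats every non-empty output line as plain text. Only the CLI entry points (`claude -p`, `codex exec`) come from the table above.

```typescript
// Hypothetical shapes for illustration; not taken from the AgentStep codebase.
interface HarnessEvent {
  type: "text";
  text: string;
}

interface Harness {
  /** Identifier used to select the harness, e.g. "codex". */
  name: string;
  /** Build the argv used to run one prompt through the harness CLI. */
  buildCommand(prompt: string): string[];
  /** Normalize one line of CLI output into a common event, or null to skip it. */
  parseLine(line: string): HarnessEvent | null;
}

// Wrap a CLI entry point as a harness. A real adapter would parse the
// harness-specific output format instead of passing lines through as text.
function cliHarness(name: string, argv: string[]): Harness {
  return {
    name,
    buildCommand: (prompt) => [...argv, prompt],
    parseLine: (line) => (line.trim() === "" ? null : { type: "text", text: line }),
  };
}

// One registry, many harnesses: callers pick by name and never see the CLI details.
const registry = new Map<string, Harness>([
  ["claude", cliHarness("claude", ["claude", "-p"])],
  ["codex", cliHarness("codex", ["codex", "exec"])],
]);
```

In this sketch, `registry.get("codex")!.buildCommand("fix the failing test")` yields the argv for `codex exec`, and switching harnesses is a matter of looking up a different name.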

The harness that harnesses other harnesses.

Pick your sandbox.

Managed Agents runs containers for you. AgentStep lets you choose.

Provider	Cold Start
Local
Docker	~1.5s
Podman	~1s
Apple Container	~1s
Apple Firecracker	~0.3s
Cloud
Daytona	~0.1s
E2B	~0.5s
Vercel Sandbox	~0.4s
Modal	~1.7s
Sprites	~1s
Fly.io	~3s
Anthropic	~2s

New: Apple Firecracker — hardware-isolated microVM sandboxes on Mac. Sub-second cold starts.
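If cold start is the deciding factor, the choice can be made mechanically. The sketch below copies the approximate cold-start figures from the table into a list and picks the fastest provider per category. The `Sandbox` type and `fastest` helper are illustrative names, not part of AgentStep, and the numbers are the rough values quoted above rather than measurements.

```typescript
type Kind = "local" | "cloud";

interface Sandbox {
  name: string;
  kind: Kind;
  coldStartSeconds: number; // approximate, as listed in the table
}

const SANDBOXES: Sandbox[] = [
  { name: "Docker", kind: "local", coldStartSeconds: 1.5 },
  { name: "Podman", kind: "local", coldStartSeconds: 1 },
  { name: "Apple Container", kind: "local", coldStartSeconds: 1 },
  { name: "Apple Firecracker", kind: "local", coldStartSeconds: 0.3 },
  { name: "Daytona", kind: "cloud", coldStartSeconds: 0.1 },
  { name: "E2B", kind: "cloud", coldStartSeconds: 0.5 },
  { name: "Vercel Sandbox", kind: "cloud", coldStartSeconds: 0.4 },
  { name: "Modal", kind: "cloud", coldStartSeconds: 1.7 },
  { name: "Sprites", kind: "cloud", coldStartSeconds: 1 },
  { name: "Fly.io", kind: "cloud", coldStartSeconds: 3 },
  { name: "Anthropic", kind: "cloud", coldStartSeconds: 2 },
];

// Return the provider with the lowest listed cold start in the given category.
function fastest(kind: Kind): Sandbox {
  const candidates = SANDBOXES.filter((s) => s.kind === kind);
  return candidates.reduce((best, s) =>
    s.coldStartSeconds < best.coldStartSeconds ? s : best
  );
}
```

Cold start is only one axis; isolation model, cost, and where your data is allowed to live may matter more for a given workload.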

Two minutes to a running agent.

Option 1: npm

npx @agentstep/gateway serve

Web UI at localhost:4000. API docs at /v1/docs. API key auto-generated.

Option 2: Docker

docker run -p 4000:4000 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/agentstep/gateway

Option 3: Claude Code

git clone https://github.com/agentstep/gateway.git
cd gateway
claude
> /setup-gateway

Claude Code checks prerequisites, configures secrets, starts the server, and runs your first session.

Drop-in replacement.

AgentStep implements the managed-agents-2026-04-01 spec end-to-end. The official Anthropic SDK works unchanged — just point it at your server.

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://localhost:4000",  // ← that's it
  apiKey: "ck_...",
});

const agent = await client.beta.agents.create({
  name: "my-agent",
  model: "claude-sonnet-4-6",
  instructions: "You are a helpful coding assistant.",
});

Same endpoints. Same session model. Same SSE streaming. Different baseURL.
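Because the stream is plain Server-Sent Events, a client that does not use the SDK can still read it with a few lines of parsing. The sketch below handles the standard `event:` and `data:` fields of the SSE wire format. The `parseSse` helper and the event names in the usage note are made up for illustration; they are not taken from the Managed Agents spec, and the sketch assumes the whole payload is already buffered rather than arriving in partial chunks.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event: field is present
  data: string;  // multiple data: lines are joined with "\n"
}

// Parse a fully buffered text/event-stream payload into discrete events.
function parseSse(payload: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Frames are separated by a blank line.
  for (const frame of payload.split(/\r?\n\r?\n/)) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment / keep-alive line
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // Strip the single optional space after the colon.
        dataLines.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    // Frames without any data lines are not dispatched.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return events;
}
```

For example, feeding it `"event: demo.step\ndata: {\"n\":1}\n\n: ping\n\ndata: a\ndata: b\n\n"` produces two events: one named `demo.step` carrying the JSON string, and one default `message` event whose data is the two lines joined by a newline. The keep-alive comment frame is dropped.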

Coming soon.

Outcomes — Rubric-based evaluation with iterative improvement

Multi-agent threads — Agents spawning agents, coordinated work

Memory — Persistent stores that survive across sessions

One API. Any agent.
Your infrastructure.

5 harnesses. 11 sandboxes. Self-hosted. Open-source. From npm install to running agents in 2 minutes.

View on GitHub
