Anthropic built the Managed Agents API for Claude. We made it work with everything else.
Anthropic launched Claude Managed Agents. Define an agent, give it an environment, start a session. The agent loop, sandboxed containers, tool execution, context management, and event streaming are all handled for you.
We read the HN thread the night it dropped. The same concern kept surfacing in different words: the spec is great, the engineering is great — but nobody wanted to bet their production stack on a single vendor's roadmap, pricing page, and uptime SLA.
So we built the version that doesn't ask you to.
The HN discussion on Managed Agents raised real concerns. We built AgentStep to answer every one.
“Vendor lock-in”
5 harnesses. Switch with one config change.
“Can’t mix models”
Claude to plan, Codex to execute, Gemini for docs. Same API.
“Code leaves my machine”
Self-hosted. Your infra, your data, your kill switch.
“$0.08/session-hour adds up”
The Claude backend runs through Claude Code, so it uses whatever auth you already have.
“Framework churn”
Open source. Fork it. Extend it. Apache 2.0.
“Misaligned incentives”
We don’t sell tokens. You bring your own keys.
Claude Code, Codex CLI, Gemini CLI, OpenCode, and Factory are all coding agent harnesses. Each one has its own output format, session model, and way of handling tool calls.
AgentStep wraps all of them under a single Managed Agents API. One surface to orchestrate, monitor, and manage agents — regardless of which harness does the work.
| Harness | Models |
|---|---|
| Claude | Sonnet 4.6, Opus 4.6, Haiku 4.5 |
| Codex | GPT-5.4, GPT-5.4 Mini |
| Gemini | Gemini 3.1 Pro, Gemini 3, Gemini 2.5 Pro/Flash |
| OpenCode | Any model via provider prefix |
| Factory | Multi-model (Claude, GPT, Gemini, GLM, Kimi) |
It's pluggable. Adding a new harness means implementing one TypeScript interface, about 200 lines. We'll keep adding more.
The harness that harnesses other harnesses.
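To make the "one TypeScript interface" idea concrete, here is a minimal sketch of what a harness adapter could look like. Everything in it is an assumption for illustration: the `Harness` interface, the `AgentEvent` shape, the `example-cli` command, and its JSON-lines output format are hypothetical and are not taken from AgentStep's source.

```typescript
// Hypothetical shapes for illustration only; AgentStep's real interface may differ.

// A normalized event that every harness adapter would emit.
type AgentEvent =
  | { type: "message"; text: string }
  | { type: "tool_call"; tool: string; input: unknown }
  | { type: "done" };

// The single interface a new harness adapter would implement.
interface Harness {
  readonly name: string;
  // Build the argv used to start the underlying agent CLI for a prompt.
  buildCommand(model: string, prompt: string): string[];
  // Translate one line of the harness's native output into a normalized event.
  parseLine(line: string): AgentEvent | null;
}

// Parse JSON without throwing; returns undefined for anything unparseable.
function tryParseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// Toy adapter for an imaginary CLI that prints one JSON object per line.
const exampleHarness: Harness = {
  name: "example",
  buildCommand: (model, prompt) => ["example-cli", "--model", model, "--prompt", prompt],
  parseLine(line) {
    const parsed = tryParseJson(line.trim());
    // Ignore noise such as progress output or blank lines.
    if (typeof parsed !== "object" || parsed === null) return null;
    const raw = parsed as { kind?: unknown; text?: unknown; tool?: unknown; args?: unknown };
    if (raw.kind === "say" && typeof raw.text === "string") {
      return { type: "message", text: raw.text };
    }
    if (raw.kind === "tool" && typeof raw.tool === "string") {
      return { type: "tool_call", tool: raw.tool, input: raw.args };
    }
    if (raw.kind === "exit") return { type: "done" };
    return null;
  },
};

// Registry keyed by harness name, so picking a harness is a lookup by config value.
const harnesses = new Map<string, Harness>([[exampleHarness.name, exampleHarness]]);

function getHarness(name: string): Harness {
  const harness = harnesses.get(name);
  if (!harness) throw new Error(`Unknown harness: ${name}`);
  return harness;
}
```

The design point is that each harness keeps its own output format behind `parseLine`, while everything downstream only sees the normalized events. Supporting another harness then means registering one more object in the map.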
Managed Agents runs containers for you. AgentStep lets you choose where they run.
| Provider | Cold Start |
|---|---|
| Docker | ~3s |
| Podman | ~3s |
| Apple Container | ~1s |
| Sprites | ~2s |
| E2B | ~1s |
| Vercel Sandbox | <1s |
| Fly.io | ~3s |
| Daytona | ~5s |
| Modal | ~2s |
Web UI at localhost:4000. API docs at /v1/docs. API key auto-generated.
Claude Code checks prerequisites, configures secrets, starts the server, and runs your first session.


AgentStep implements the managed-agents-2026-04-01 spec end-to-end. The official Anthropic SDK works unchanged — just point it at your server.
Same endpoints. Same session model. Same SSE streaming. Different baseURL.
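As a rough illustration of "different baseURL", the sketch below builds the same request against a hosted endpoint and a self-hosted one. It makes several assumptions: the `/v1/sessions` path, the `agent` body field, the placeholder keys, and passing the spec name as an `anthropic-beta` header are all illustrative guesses. Only `localhost:4000` and the `managed-agents-2026-04-01` name come from the text above.

```typescript
// Sketch: the request is identical; only the base URL (and key) differ.
// The path, body field, and anthropic-beta usage below are assumptions.

interface ClientConfig {
  baseURL: string;
  apiKey: string;
}

interface PreparedRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Build a request description without sending it, so the two targets can be compared.
function prepareRequest(
  config: ClientConfig,
  method: "GET" | "POST",
  path: string,
  body?: unknown
): PreparedRequest {
  const base = config.baseURL.replace(/\/+$/, ""); // drop trailing slashes
  const suffix = path.startsWith("/") ? path : `/${path}`;
  const request: PreparedRequest = {
    url: `${base}${suffix}`,
    method,
    headers: {
      "content-type": "application/json",
      "x-api-key": config.apiKey,
      "anthropic-version": "2023-06-01",
      "anthropic-beta": "managed-agents-2026-04-01", // assumed header usage
    },
  };
  if (body !== undefined) request.body = JSON.stringify(body);
  return request;
}

// Placeholder credentials; substitute your own.
const hosted: ClientConfig = {
  baseURL: "https://api.anthropic.com",
  apiKey: "hosted-key-placeholder",
};
const selfHosted: ClientConfig = {
  baseURL: "http://localhost:4000",
  apiKey: "agentstep-key-placeholder",
};

const hostedRequest = prepareRequest(hosted, "POST", "/v1/sessions", { agent: "agent_placeholder" });
const selfHostedRequest = prepareRequest(selfHosted, "POST", "/v1/sessions", { agent: "agent_placeholder" });
```

With the official TypeScript SDK, the equivalent change is passing a `baseURL` option alongside `apiKey` when constructing the client. The rest of the calling code stays as it was.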
Outcomes — Rubric-based evaluation with iterative improvement
Multi-agent threads — Agents spawning agents, coordinated work
Memory — Persistent stores that survive across sessions
5 harnesses. 10 sandboxes. Self-hosted. Open-source. From npm install to running agents in 2 minutes.