0
Blog

Run OpenCode Zen or Go plan models with the Claude Code CLI

I lost my free OpenCode setup overnight, bought the Go plan, and watched Claude Code throw 401s for two hours. Here is the proxy that fixed it, the exact config I shipped, and the gotchas nobody writes down.

12 min readShravan Bhati
Claude CodeOpenCodeOpenCode ZenOpenCode GoCLIDevBlog
Run OpenCode Zen or Go plan models with the Claude Code CLI

For last few days, my daily driver inside the Claude Code CLI was the MiniMax-m3 free tier. Cheap, fast enough, and the integration was a five-line .env change. I had it wired into my zsh dotfiles. I forgot it was free, which is the most dangerous thing a free tool can do to you.

Then one morning the cursor just hung. Every prompt sat there blinking, and the CLI eventually spat a 401 back at me. I checked the dashboard. The free offer had quietly ended.

The fix looked obvious. Buy the OpenCode Go plan. Five dollars a month, then ten, for access to the same family of open coding models I was already using. I paid before I finished reading the page. Then I went back to my terminal, pasted the new key into ANTHROPIC_AUTH_TOKEN, and ran claude.

It threw a 401. Then it threw another. Then it sat in a retry loop for two full minutes before giving up.

I spent the next two hours digging through Discord threads, half-written GitHub issues, and one extremely angry Reddit comment before I figured out what was actually wrong. The TL;DR: Claude Code speaks the Anthropic Messages API, and the cheap OpenCode endpoint does not. You need a proxy in between that translates one to the other. The proxy I ended up using is a small Go binary called oc-go-cc, and once it was running everything worked on the first try.

This post is the version of that two-hour rabbit hole I wish someone had handed me before I started. I'll walk through what's actually happening under the hood, how to set up oc-go-cc end to end, the config I'm running today, and the three failure modes that ate most of my afternoon.

Why my old setup stopped working

The free MiniMax-m3 tier I was on terminated requests at a Claude-compatible endpoint. The CLI would hit https://...anthropic.com/v1/messages (or the override I had pointed it at), the upstream would accept the Anthropic-shaped JSON body verbatim, stream a response back in the same shape, and Claude Code was happy. No translation. No proxy. Just a base URL swap and a token.

The paid Go plan does not work that way. The Go endpoint exposes an OpenAI-style chat completions surface at /zen/go/v1/chat/completions. The body shape is different. The streaming format is different. Tool calls go through tool_calls instead of tool_use blocks. Claude Code sends an Anthropic request, the Go endpoint reads it as malformed, and you get a 401 (or a 400, depending on how the gateway feels that day) followed by the CLI's exponential retry kicking in.

OpenCode Zen does have an Anthropic-format endpoint at /zen/v1/messages, but it bills pay-as-you-go per token, and the model menu only overlaps partially with the Go plan I had just paid for. If you bought Go specifically because the flat fee was the appeal, pointing Claude Code at Zen defeats the whole point.

So the gap is real, and it is the kind of gap a small piece of glue code can close.

The fix: oc-go-cc as a translation layer

samueltuyizere/oc-go-cc is a Go binary that runs as a local HTTP server on port 3456. Claude Code talks to it in Anthropic Messages format. The proxy rewrites each request into whatever the upstream actually wants (OpenAI Chat Completions for Go, the Responses API for GPT models on Zen, the Gemini format for Google models on Zen), forwards it, then rewrites the streaming response back to the Anthropic SSE shape the CLI expects. From the CLI's perspective, nothing changed. It still thinks it's talking to Anthropic.

A few things make it nice to live with:

  • It routes by context. Background tasks like grep or read file go to a cheap model. Long prompts (more than 80K tokens) get routed to a model with the 1M-token context window. Anything with "think" or "plan" or "architect" in the system prompt goes to a reasoning model. You write the config once and stop thinking about model selection.
  • It has fallback chains. If the primary model fails, the proxy tries the next one in the chain before surfacing an error to the CLI. A circuit breaker skips an unhealthy model for 30 seconds before retrying.
  • It handles both Go and Zen. You can keep Go as the default and override individual model IDs (say, claude-sonnet-4.5) to route through Zen instead, in the same config file.

The whole thing is one binary, one JSON file, and a couple of env vars. Once it's set up, the only command you run is claude, exactly like before.

Installation

The install is the boring part. Pick the one that matches your OS:

macOS
bash
brew tap samueltuyizere/tap && brew install oc-go-cc
Windows (scoop)
bash
scoop bucket add oc-go-cc https://github.com/samueltuyizere/scoop-bucket && scoop install oc-go-cc

For Linux (and anywhere Docker runs), the cleanest path is the container. Clone the repo, copy the example env file, and run it as a detached service:

shell.sh
bash
cp .env.example .env
docker build -t oc-go-cc .
docker run -d --restart unless-stopped --name oc-go-cc \
  --env-file .env -p 3456:3456 oc-go-cc

The --restart unless-stopped flag is the one I forgot the first time. If your machine reboots and the container doesn't come back up, Claude Code will fail with "connection refused" and you'll spend ten minutes blaming the proxy when it just isn't running.

If you're not already on Docker, there's a tiny Makefile in the repo that wraps the same commands behind make docker-up and make docker-down.

Generate the API key

Log into your OpenCode dashboard, find the API keys section, and create one. The key starts with sk-opencode-. Copy it once, paste it somewhere safe, you won't see it again.

If you bought the Go plan, you get a single key that works for both Go endpoints and Zen endpoints. You don't need two separate keys for the two providers. The proxy uses the same one for both.

Configure the proxy

The proxy reads from ~/.config/oc-go-cc/config.json. Generate the default with:

shell.sh
bash
oc-go-cc init

Then export your key. The two options are an env var or putting it in the config directly. I prefer the env var because it keeps the secret out of the file:

shell.sh
bash
export OC_GO_CC_API_KEY=sk-opencode-your-key-here

If you put it in your shell profile (~/.zshrc, ~/.bashrc), make sure to source the file before you start the proxy. I have lost more than one evening to a key that was only set in the current terminal.

This is the config I'm running today, lightly trimmed. The whole shape comes straight from the repo's CONFIGURATION.md, but I'll annotate the parts that matter:

~/.config/oc-go-cc/config.json
json
{
"api_key": "${OC_GO_CC_API_KEY}",
"host": "127.0.0.1",
"port": 3456,
"hot_reload": false,

"models": {
  "default": {
    "provider": "opencode-go",
    "model_id": "kimi-k2.6",
    "temperature": 0.7,
    "max_tokens": 4096
  },
  "background": {
    "provider": "opencode-go",
    "model_id": "qwen3.5-plus",
    "temperature": 0.5,
    "max_tokens": 2048
  },
  "think": {
    "provider": "opencode-go",
    "model_id": "glm-5.1",
    "temperature": 0.7,
    "max_tokens": 8192
  },
  "long_context": {
    "provider": "opencode-go",
    "model_id": "minimax-m2.7",
    "temperature": 0.7,
    "max_tokens": 16384,
    "context_threshold": 80000
  }
},

"fallbacks": {
  "default": [
    { "provider": "opencode-go", "model_id": "glm-5" },
    { "provider": "opencode-go", "model_id": "qwen3.6-plus" }
  ],
  "long_context": [
    { "provider": "opencode-go", "model_id": "minimax-m2.5" }
  ]
}
}

A few notes on what each block actually does:

  • "api_key": "${OC_GO_CC_API_KEY}" is variable interpolation, not the literal string. The proxy reads the env var at startup. If you'd rather paste the key inline, replace the whole string with "sk-opencode-...".
  • models.default is the model used when no other route matches. Kimi K2.6 has been the best quality-to-cost trade for me on the Go plan. If you mostly write small features, swap it for qwen3.6-plus and watch your bill drop.
  • models.background is the cheap path. Anything Claude Code thinks of as a side task (file reads, greps, directory listings) goes here. Qwen3.5 Plus is the cheapest option on the menu.
  • models.long_context only kicks in when the request crosses context_threshold. The default is 80K tokens. If you regularly paste huge files, the MiniMax M2.7 family is the only one with the headroom to hold them.
  • fallbacks are tried in order if the primary fails. The circuit breaker opens after 3 consecutive failures on a single model and skips it for 30 seconds.

The full config supports a lot more (complex, fast, per-model model_overrides that route specific Anthropic model IDs through Zen instead of Go), and the CONFIGURATION.md in the repo is the source of truth. The block above is the minimum that gives you a usable setup.

Start the proxy

Foreground first, so you can see the logs and confirm it actually starts:

shell.sh
bash
oc-go-cc serve

You should see a line that looks roughly like listening on 127.0.0.1:3456. Leave it running and open a second terminal for the next step.

Once you trust it, switch to background mode and let it run from your shell startup:

shell.sh
bash
oc-go-cc serve -b

To check whether it's already up, stop it cleanly, or list the models it can see:

shell.sh
bash
oc-go-cc status
oc-go-cc stop
oc-go-cc models

On macOS, there's also oc-go-cc autostart enable, which registers it with launchd so it comes up on login. On Linux you can wire it into systemd by hand, or just leave the Docker container running with --restart unless-stopped. Same effect.

Point Claude Code at the proxy

This is the part where my old .env was wrong. Claude Code looks at two environment variables to decide where to send its traffic:

shell.sh
bash
export ANTHROPIC_BASE_URL=http://127.0.0.1:3456
export ANTHROPIC_AUTH_TOKEN=unused

A few non-obvious things about this:

  • ANTHROPIC_BASE_URL is the host of the proxy, not the upstream OpenCode URL. The whole point of the proxy is that the CLI no longer talks to OpenCode directly.
  • ANTHROPIC_AUTH_TOKEN is the token Claude Code sends as its bearer credential. The proxy ignores it, because the real auth happens between the proxy and OpenCode using OC_GO_CC_API_KEY. You can set it to literally any non-empty string. unused is convention.
  • If you had ANTHROPIC_API_KEY set from a previous attempt, unset it. The CLI prefers ANTHROPIC_API_KEY over ANTHROPIC_AUTH_TOKEN in some versions, and if it's pointing at the wrong key you'll get the 401 retry loop I lost an hour to.
shell.sh
bash
unset ANTHROPIC_API_KEY

Drop the two exports into your shell profile so they persist, source it, then run:

shell.sh
bash
claude

If it opens to a working prompt and responds to "hello," you're done. Run a real task to confirm the routing works. Ask it to read a file (that should hit Qwen3.5 Plus), then ask it to "think through a refactor" of something (that should route to GLM-5.1).

What broke for me, and what fixed it

For the version of me who lost two hours, here are the three things that mattered.

1. The 401 was almost always the wrong base URL

The first time I hit 401, I assumed it was the API key. It wasn't. The CLI was still pointing at the old free-tier URL I had in my .env, the proxy wasn't running yet, so the CLI was actually talking to the public Anthropic endpoint with an OpenCode key. Anthropic, correctly, refused.

The signal that this is your problem: the 401 comes back almost instantly, with no retry loop, and the error mentions the Anthropic API by name. Fix it by checking what the CLI is actually pointing at:

shell.sh
bash
echo $ANTHROPIC_BASE_URL

If it says anything other than the proxy's address, your shell profile didn't reload. Open a new terminal or source your dotfiles.

2. The retry loop was the proxy not running

The second symptom looks completely different. The CLI hangs for a beat, then prints a "connection refused" or "connection reset" error, then waits a couple of seconds and tries again. It will do this forever if you let it.

This is the proxy not being up. The fix is either:

shell.sh
bash
oc-go-cc status
oc-go-cc serve -b

Or, if you're on Docker, the container has stopped:

shell.sh
bash
docker ps -a
docker start oc-go-cc

I now have oc-go-cc status in my .zshrc as part of my prompt's right-hand info block, just so I always see whether it's alive.

3. "All models failed" means your key is the problem

If the proxy is running and the CLI still errors out, but with a message that mentions models failing, the proxy reached OpenCode and OpenCode said no. The fastest way to confirm it's your key (and not, say, the upstream being down) is to ask OpenCode directly:

shell.sh
bash
curl -H "Authorization: Bearer $OC_GO_CC_API_KEY" \
  https://opencode.ai/zen/go/v1/models

If you get a JSON list of models back, your key is fine and you're hitting a quota or a network issue. If you get a 401 from that curl, the key is wrong or revoked. Regenerate it in the dashboard and re-export.

For everything else, crank the log level on the proxy and watch what's actually going over the wire:

shell.sh
bash
OC_GO_CC_LOG_LEVEL=debug oc-go-cc serve

The debug output prints the raw request, the rewritten request, and the upstream response. About 90% of the time, one of those three has something obviously wrong with it.

A few things I changed after a week of using it

After a few days of real work on this setup, I tweaked the defaults in a couple of ways that have stuck:

  • I dropped temperature to 0.3 on the default route. The 0.7 default is fine for general chat, but for code edits inside Claude Code I want determinism. Lower temperature, fewer "creative" rewrites of code that was already correct.
  • I switched background to qwen3.5-plus and never looked back. It costs almost nothing and it's plenty for the kinds of tasks Claude Code routes there (greps, reads, ls).
  • I added Zen overrides for the times I want Claude itself. When I genuinely want Claude Sonnet for a sensitive refactor, I use the model_overrides block to route claude-sonnet-4-6 through Zen on a pay-as-you-go basis. Most days I don't, but it's there.

Here's the override I'm running. Drop it into the same config file alongside the other top-level keys:

model_overrides (snippet)
json
"model_overrides": {
"claude-sonnet-4-6": {
  "provider": "opencode-zen",
  "model_id": "claude-sonnet-4-6",
  "temperature": 0.3,
  "max_tokens": 8192,
  "vision": true
}
}

You can mix Go and Zen freely. The proxy doesn't care which one a request goes to, as long as both are reachable.

Is it worth it?

For my workflow, yes. Easily. I went from paying nothing (while it lasted) to ten dollars a month for a setup that gives me Kimi K2.6 as a default, the MiniMax M3 family for long-context work, and the option to escalate to Claude on Zen when I need it. The Go plan plus this proxy has cost me less in a month than a single afternoon of Claude API usage at list price.

There are real trade-offs. The open models are not Claude. For a tricky algorithmic refactor or a long, weird debugging session, I still reach for Claude Sonnet through Zen. But for the 80% of work that is "ship this feature, write the test, fix this bug," the Go-plan models are good enough that I stopped noticing the swap within a week.

The whole reason this post exists is that the gap between "I have a Go plan" and "Claude Code is using my Go plan" is bigger than it should be, and the fix isn't documented in any one place. If you hit the same wall I did, the answer is oc-go-cc, the config above, and unset ANTHROPIC_API_KEY so the CLI stops picking the wrong credential.

If something here didn't work for you, send me a note. I'd rather collect a few good answers in one place than have the next person waste an afternoon scrolling through Discord & Reddit threads, like I did.

Get new posts, in your inbox.

No list, no spam, no resale. Pick the categories you actually read, unsubscribe with a single click.

Which posts should I email you about?

Double opt-in · one-click unsubscribe · no tracking pixels

{/}

∑

// signature2026

Shravan Bhati

Built with careThank you for visiting