If your Claude Code quota runs out and you don’t want to wait or pay for more, there’s a way to keep using the same “claude” command but route it through a free model. It takes about three minutes to set up.
How it works
There’s a small open-source proxy called “free-claude-code” that runs on your localhost. It takes Claude Code’s API calls and translates them into a format that NVIDIA’s free hosted inference platform (NIM) can serve.

NIM gives you:
- ~5,000 free credits on signup
- 40 requests/minute

That’s plenty for steady coding.

The model I use is “Kimi-K2” from Moonshot. It’s a coding-tuned model that’s about 80% as good as Sonnet for normal day-to-day work.

So the flow becomes:

    claude command
      ↓
    local proxy
      ↓
    NVIDIA NIM
      ↓
    Kimi-K2
      ↓
    response back into Claude Code UI

Same UI, different model.
My setup
I wrapped the whole thing in a shell command called “claude-free”, so I can swap freely between paid and free without touching anything. Both commands sit on my machine:
- quota out → “claude-free”
- quota back → “claude”

They don’t share env vars and my real config is untouched.
What you need
You need an NVIDIA NIM API key. This is the only part you can’t automate. Go to build.nvidia.com/settings/api-keys and:
- sign in with Google or GitHub
- click “Generate API Key”
- copy the “nvapi-…” string

Takes a minute.

You also need:
- git
- node (for Claude Code itself)
- uv for Python

If you don’t have uv:

    curl -LsSf https://astral.sh/uv/install.sh | sh
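Before going further, it can help to check which of those tools are actually missing. This is just a minimal sketch; `missing_tools` is a helper name I made up, not something the proxy ships with.

```shell
# Print each named command that cannot be found on PATH.
# Prints nothing if everything is installed.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Run it as `missing_tools git node uv`. Empty output means you have everything.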
Setup
Clone the proxy somewhere persistent. Do NOT use “/tmp” because it gets wiped on reboot.

    git clone --depth 1 https://github.com/Alishahryar1/free-claude-code.git ~/.local/share/claude-free
    cd ~/.local/share/claude-free
    uv python install 3.14
    uv sync

Write a “.env” file in the same directory:

    NVIDIA_NIM_API_KEY=nvapi-your-key-here
    MODEL=nvidia_nim/moonshotai/kimi-k2-instruct
    ANTHROPIC_AUTH_TOKEN=freecc
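A quick sanity check for that “.env” file, as a rough sketch. `check_env` is a hypothetical helper, not part of the proxy. It only assumes the three keys listed above, written one `KEY=value` per line.

```shell
# Report any of the three expected keys that are absent or empty in FILE.
# Returns 0 if all are present, 1 otherwise.
check_env() {
  local file="$1" key missing=0
  for key in NVIDIA_NIM_API_KEY MODEL ANTHROPIC_AUTH_TOKEN; do
    if ! grep -q "^${key}=." "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `check_env ~/.local/share/claude-free/.env && echo ok`.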
Important
Do NOT use the model the README ships with. The default is “z-ai/glm4.7”. It hangs forever and never returns. I tried a few others too:
- “deepseek-v4-pro”
- “qwen3-coder-480b”

Both unreachable. The two that actually work on the free tier are:
- “moonshotai/kimi-k2-instruct” ← best for coding
- “meta/llama-3.3-70b-instruct” ← decent backup
About “ANTHROPIC_AUTH_TOKEN”
    ANTHROPIC_AUTH_TOKEN=freecc

This is just a local password between the wrapper and the proxy. The proxy rejects requests without it. It never leaves your machine. You can set it to anything, as long as the wrapper and “.env” stay in sync.
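If you’d rather not hard-code the token in two places, the wrapper could read it straight from “.env”. A minimal sketch: `read_env_value` is a made-up helper, and it assumes plain `KEY=value` lines with no quotes or `export` prefix.

```shell
# Print the value of KEY from a simple KEY=value file.
# Returns 1 if the key is not present.
read_env_value() {
  local file="$1" key="$2" line
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      "$key="*)
        printf '%s\n' "${line#"$key="}"
        return 0
        ;;
    esac
  done < "$file"
  return 1
}
```

In a wrapper script that would look like `export ANTHROPIC_AUTH_TOKEN="$(read_env_value "$PROXY_DIR/.env" ANTHROPIC_AUTH_TOKEN)"`, so changing the token in “.env” is enough.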
Wrapper script
Save this as ~/.local/bin/claude-free:

    #!/usr/bin/env bash
    set -e

    PROXY_DIR="$HOME/.local/share/claude-free"
    PORT=8082

    # Start the proxy only if it is not already answering on the port.
    if ! curl -s -m 1 "http://127.0.0.1:$PORT/v1/models" -H "x-api-key: freecc" >/dev/null 2>&1; then
      echo "claude-free: starting proxy on :$PORT…" >&2
      # Subshell, so the cd does not change the directory claude starts in.
      (
        cd "$PROXY_DIR"
        nohup uv run uvicorn server:app \
          --host 127.0.0.1 \
          --port "$PORT" \
          >> "$PROXY_DIR/proxy.log" 2>&1 &
      )
      # Poll up to 30 times for the proxy to come up.
      for i in {1..30}; do
        curl -s -m 1 "http://127.0.0.1:$PORT/v1/models" \
          -H "x-api-key: freecc" >/dev/null 2>&1 && break
        sleep 0.5
      done
    fi

    export ANTHROPIC_AUTH_TOKEN=freecc
    export ANTHROPIC_BASE_URL="http://127.0.0.1:$PORT"
    export CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1
    exec claude "$@"

Make it executable:

    chmod +x ~/.local/bin/claude-free

Make sure “~/.local/bin” is on your PATH:

    echo "$PATH" | grep -q "$HOME/.local/bin" && echo ok || echo "add to .zshrc"

Open a new terminal. Run:

    claude-free

You’re now running Claude Code on a free model.
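One note on the PATH check: grep matches substrings, so it would also say ok for something like “~/.local/bin-old”. If you want an exact match, here’s a small sketch. `path_contains` is a name I picked for illustration.

```shell
# Succeed only if DIR ($1) is an exact entry in the colon-separated list ($2).
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Use it like `path_contains "$HOME/.local/bin" "$PATH" && echo ok || echo "add to .zshrc"`.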
Verifying it works
If you want to confirm it’s actually routing through the proxy and not silently hitting Anthropic: kill the proxy and run “claude-free” again. The wrapper should restart it automatically. You can also tail the logs:

    tail -f ~/.local/share/claude-free/proxy.log

and watch requests come through.

One thing that confused me

Claude Code UI may still show your old account/model name in the header: Sonnet 4.6 or whatever you used before. That label is cached locally. The model actually serving you is Kimi. Trust the logs, not the UI.
The faster way
I saved this whole setup as a single npad note:

https://npad.run/p/free-claude-code-in-3-minutes-claude-free-wrapper-nvidia-nim-fbh3d9p443

Paste that URL into Claude Code and say: “do this, ask me when you need the NVIDIA key”

Your agent:
- reads the note
- runs every command
- pauses at the human-only step
- continues automatically
- tests the install

One shot. You’ll also save tokens because the agent won’t repeat the same mistakes mine did.
Caveats
“Kimi-K2” is not Opus. It’s good at coding and decent at tool use, but you’ll feel the difference on:
- hard reasoning
- long-context tasks

Use this when:
- your real quota is out
- you’re doing casual coding
- you’re working on side projects

For work that matters, pay for real Claude.
Free tier limits
NVIDIA free tier currently gives roughly:
- 40 requests/minute
- ~5,000 free credits on signup

It’s unclear whether or when the credits refresh.

A single Claude Code turn is usually 5–15 requests, so the rate limit mostly matters during large multi-file refactors.

btw you can rotate keys hehe
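Rough arithmetic on what that means per minute, assuming every request in a turn counts equally against the 40/minute limit (that part is my assumption, and `turns_per_minute` is just an illustrative helper):

```shell
# Whole turns per minute that fit under a request rate limit.
# $1 = requests allowed per minute, $2 = requests per turn.
turns_per_minute() {
  echo $(( $1 / $2 ))
}

turns_per_minute 40 5    # light turns: prints 8
turns_per_minute 40 15   # heavy turns: prints 2 (integer division)
```

So somewhere between 2 and 8 turns a minute before you hit the limit, which you’re unlikely to reach typing by hand.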
Security note
The proxy is open source and runs on localhost. But technically every prompt passes through it on the way to NVIDIA. Don’t run this on a shared machine.
Uninstall
Two lines:

    pkill -f "uvicorn server:app"
    rm -rf ~/.local/share/claude-free ~/.local/bin/claude-free

Nothing left behind. Your real Claude setup stays untouched.
Originally posted by u/Veerbhadra_1 on r/ClaudeCode
