• kingofras@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    3 hours ago

    I routed all my Claude Code traffic through a local proxy for 3 months. Here’s what I found.

    I use Claude Code a lot across multiple projects. A few months ago I got frustrated that I couldn’t see per-session costs in real time, so I set up a local proxy between Claude Code and the API that intercepts every request.

    After 10,000+ requests, three things surprised me:

    1. Session costs vary wildly. My cheapest session this week: $0.19 (quick task, pure Sonnet). Most expensive: $50.68 (long planning sessions with research, code review, and a lot of Opus). Without per-session tracking, these just blur into one weekly number.

    r/ClaudeCode - I routed all my Claude Code traffic through a local proxy for 3 months. Here's what I found.

    2. A meaningful chunk of requests come in bursty patterns I wouldn’t have noticed otherwise. Sub-500ms gaps between requests, often when I wasn’t actively prompting. Whether that’s auto-memory, caching prefills, or something else, it adds up and it’s invisible without intercepting the traffic.

    3. Routing simple tasks to Sonnet saves real money. I classify requests by complexity heuristics and route simple ones to Sonnet instead of Opus. Over 10K requests, that produced a 93% cost reduction under my usage patterns (including cache hits). This doesn’t prove equal quality on every routed call, but for the simple stuff (short context, straightforward tasks), it held up well enough to be worth it for me.

    You could also route simple tasks to Haiku for even more savings, but would need to fund an API account since Haiku isn’t included in the Anthropic Max plan.

    r/ClaudeCode - I routed all my Claude Code traffic through a local proxy for 3 months. Here's what I found.

    I open-sourced it in case it’s useful: @relayplane/proxy. It runs locally and gives you a live dashboard at localhost:4100.

    Not a replacement for ccusage, that’s great for post-hoc analysis. This sits in the request path and shows you costs live, mid-session.

    Happy to answer questions about the setup or what I’ve learned about Claude Code’s request patterns.