Was thinking about this the other day while running Claude Code.

Back in the 60s and 70s, computers were too expensive for most people, so offices had dumb terminals hooked up to a central mainframe. All the heavy lifting happened somewhere else, and the box on your desk was basically a keyboard with a screen. Then PCs got cheap and powerful through the 80s, client-server architecture took over in the 90s, and most workloads migrated off the mainframe and onto kit you actually owned. (Mainframes never fully died: banks and airlines still run them, but they stopped being how most of us worked.)

Fast forward to now. Gigabit fibre, low-latency everything, and what am I doing? Sitting in a terminal, typing prompts that get shipped off to a datacenter so a model bigger than anything I could run at home can write my code for me. My laptop is genuinely more idle running Claude Code than it was compiling C++ in 2005.

The terminal is back. The mainframe is back. We just call it "the cloud" and the green phosphor got replaced with syntax highlighting.

What I've noticed is that the same old mainframe-era trade-offs are creeping back into my actual workflow:

- **No persistent state between sessions.** I've ended up writing increasingly detailed CLAUDE.md files and leaning on MCP servers to give the "mainframe" some context about my project.
- **Connection = productivity.** A flaky link or an Anthropic blip and I'm dead in the water. I never used to care about my uptime this much when everything ran locally.
- **What goes over the wire matters.** I think harder now about what context to send vs. keep local: secrets, proprietary code, big binaries. Same calculus as deciding what to run on the mainframe vs. the terminal back in the day.
- **Billed by usage, not ownership.** Tokens instead of CPU-hours, but the economic model is identical.

The scrappy local alternative is back too, except now it's Ollama and llama.cpp instead of the Altair and the Apple II.
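For reference, the kind of CLAUDE.md I mean is nothing fancy, just the stuff the session would otherwise have to rediscover every time. A rough sketch (project details here are made up for illustration):

```markdown
# CLAUDE.md

## Project
Payments service, Python 3.12 + FastAPI. Entry point: app/main.py.

## Conventions
- Run tests with `pytest -q`; don't touch tests/fixtures/golden/.
- Lint/format with ruff; line length 100.
- Never commit secrets; config comes from env vars (see .env.example).

## Context the model keeps forgetting
- The billing module is legacy; prefer wrapping it over refactoring it.
```

It's basically re-teaching the "mainframe" your local state at the start of every session, which is exactly the trade-off I'm talking about.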
So a few questions for the sub:

- Do any of you keep a local model (Ollama, llama.cpp, etc.) as a fallback for when Claude Code is down or rate-limited, or do you just accept the dependency?
- How are you handling the no-persistent-state problem: pure CLAUDE.md, custom slash commands, MCP servers, something else?
- Anyone running a hybrid setup where a local model handles grunt work (boilerplate, renames, simple refactors) and Claude Code handles the hard stuff?

Curious whether we'll swing back toward local once consumer hardware catches up, or whether centralised AI coding is just the natural endgame given how expensive frontier training is.
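On the fallback question, the shape of what I have in mind is only a few lines: try the remote model, and if the call errors out or times out, hand the same prompt to a local Ollama instance. A minimal sketch, assuming Ollama's default `/api/generate` endpoint on localhost; `remote_fn` stands in for whatever wraps your hosted model, and `llama3` is just a placeholder model name:

```python
import json
import urllib.request
import urllib.error

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask_local(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to a local Ollama instance and return the response text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]


def ask(prompt: str, remote_fn, local_fn=ask_local) -> str:
    """Try the remote (frontier) model first; fall back to the local one
    when the connection or service fails -- the mainframe-outage scenario."""
    try:
        return remote_fn(prompt)
    except (urllib.error.URLError, TimeoutError):
        return local_fn(prompt)
```

The local answer will be worse, but "worse" beats "dead in the water" for renames and boilerplate, and the same `ask` shape extends naturally to routing grunt work locally by default.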
Originally posted by u/CryptoExo on r/ClaudeCode
