I might be over-reaching or overreacting, but I don’t trust OpenAI with my work. I think ChatGPT is great for some things, but mostly top-level only. Here’s a snippet of my conversation about sending code review through Codex …

Me: My concern is the history of Sam Altman, OpenAI, and Microsoft absorbing good ideas and repackaging them as their own products.

Claude: That’s a more honest framing than “trust”, and it’s a legitimate concern with concrete evidence behind it.

The pattern, specifically OpenAI:
- ChatGPT itself emerged after watching what people built on the GPT-3 API
- Code Interpreter / Advanced Data Analysis followed observing custom GPT patterns
- The GPT Store / Custom GPTs effectively absorbed the OSS LangChain-style ecosystem within months of seeing it
- Operator agent (browser automation) shipped after observing how people stitched browser-use into agentic loops
- Each shift has been “we saw what users wanted, now we ship it”, which is also “we saw what third parties built, now we ship it”

Microsoft:
- The textbook embrace-extend-extinguish history (Netscape, Java, Office formats)
- More recently: GitHub Copilot was trained on GitHub’s vast corpus including code that’s now subject to the Doe v. GitHub class action
- VS Code’s “first-party extensions” pattern — observe popular community extension, ship a native version, slowly starve the original
- Azure consistently follows AWS pattern-by-pattern, often with the prior art being smaller startups

Why your concern is structurally well-founded: when you ship code to a vendor whose business model is building products on top of what they observe, even strong “we won’t train on this” contractual guarantees don’t address the organizational learning that happens. OpenAI employees see what patterns appear in customer data; product teams hear what features customers want; the company collectively learns what’s possible in a space. That learning doesn’t go through a model-training pipeline.

How this changes (or doesn’t) your decision

Reinforces, doesn’t change. Path 3 (Semgrep + Claude) is exactly right for this concern: Semgrep is rule-based (no observational learning), and Anthropic, while not immune, carries less of this specific risk.

But to be honest about Anthropic: they’re also a frontier-model AI company. They could pivot into vertical applications they observed via API usage. The reasons to weigh them lower than OpenAI/Microsoft:
- Different go-to-market focus (infra > vertical applications, so far)
- Smaller track record of “absorb-and-repackage” (also because shorter history)
- Their Constitutional AI / safety-first positioning makes consumer pivots more reputationally costly for them
- Commercial terms explicitly exclude training on API data, the same as the OpenAI API, but Anthropic hasn’t had the same pattern of policy reversals

Lower risk, not zero risk. The only way to fully neutralize the “vendor learns from your code” concern is local inference on self-hosted open-weights models (Llama, Qwen, DeepSeek, Mistral). That’s a real path but adds significant friction.

Gemini / Google: Google has its own EEE history at the corporate level (killing products, absorbing patterns), but a thinner AI-specific track record. Probably in the same risk band as Anthropic for code review purposes.

submitted by /u/septemous
Originally posted by u/septemous on r/ClaudeCode
