Normal chatbot stuff is whatever. Most decent models can handle basic Q&A fine. But once I start doing actual multi-step tasks (browser actions, file handling, retries, long context, etc.), the difference gets kinda ridiculous. Same workflow, same prompt, totally different outcomes depending on which model I started with. I’ve had cheaper models get stuck in weird retry loops or completely lose the thread halfway through a task, while stronger ones just… finish it.

Problem is I also don’t want to light money on fire by throwing the strongest model at everything. Lately I’ve been bouncing between models inside accio work depending on the task, and it’s made me realize I still don’t really have a solid instinct for where the “switch point” actually is. Like sometimes a cheaper model handles something perfectly fine, and other times it silently spirals for 15 minutes before failing in a really dumb way.

Right now my approach is basically: simple stuff → cheap model, longer-running stuff → stronger model. But it still feels super inconsistent.

Do most people here start cheap and escalate if things break, or just start strong and avoid the headache altogether?
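To make the "start cheap and escalate if things break" option concrete, here is a rough Python sketch. Everything in it is hypothetical: the `StepResult` type, the idea that a "model" is just a callable, the `is_looping` check, and the step budget are all assumptions for illustration, not how accio work or any real agent framework does it. The idea is only to put a hard cap on how long a cheap model can spiral before the task gets handed to the next tier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    action: str  # what the model tried to do on this step
    done: bool   # True once the task is finished


# Hypothetical interface: a "model" takes the task and the actions so far
# and returns its next step. A real agent loop would call an API here.
Model = Callable[[str, List[str]], StepResult]


def is_looping(history: List[str], window: int = 3) -> bool:
    """Crude retry-loop signal: the last `window` actions are identical."""
    return len(history) >= window and len(set(history[-window:])) == 1


def run_with_escalation(
    task: str,
    tiers: List[Tuple[str, Model]],
    max_steps_per_tier: int = 10,
) -> Optional[Tuple[str, int]]:
    """Try each tier in order, cheapest first.

    Returns (tier_name, steps_used) for the first tier that finishes,
    or None if every tier loops or runs out of steps.
    """
    for name, model in tiers:
        history: List[str] = []  # each tier starts the task fresh
        for step in range(1, max_steps_per_tier + 1):
            result = model(task, history)
            history.append(result.action)
            if result.done:
                return name, step
            if is_looping(history):
                break  # stop paying for a stuck model, move up a tier
    return None


# Stand-in models for the demo (assumptions, not real behavior).
def cheap_model(task: str, history: List[str]) -> StepResult:
    # Repeats the same action forever, like a stuck retry loop.
    return StepResult(action="retry click", done=False)


def strong_model(task: str, history: List[str]) -> StepResult:
    # Finishes on its second step.
    if not history:
        return StepResult(action="open page", done=False)
    return StepResult(action="submit form", done=True)


outcome = run_with_escalation(
    "demo task", [("cheap", cheap_model), ("strong", strong_model)]
)
print(outcome)
```

In this toy setup the cheap tier gets cut off after three identical actions instead of spiraling for 15 minutes, and the strong tier picks the task up from scratch. The "switch point" then becomes two tunable numbers (`window` and `max_steps_per_tier`) rather than a gut call, though picking good values would still depend on the workflow.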
Originally posted by u/Nearby_Worry_4850 on r/ArtificialInteligence
