Fireworks has felt more sluggish over the last few weeks. Both TTFT and overall throughput seem degraded compared to when I first started using them a few months ago. I have a side project running a mix of DeepSeek and Mixtral. The volume is minimal, but latency spikes are frequent enough that I'm wondering if they have capacity issues or if something changed on their end. Their status page is always green, so I'm not sure what the deal is. I like their model selection, but raw speed is non-negotiable for what I'm building: I need sub-second TTFT for it to work properly. What are some alternatives for fast, affordable inference on open-weight models?
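If you want to put numbers on the TTFT regressions before switching providers, a tiny harness helps. Here's a minimal sketch that times any token iterator; the `fake_stream` generator below is a hypothetical stand-in for a real streaming response (e.g. the chunk iterator an OpenAI-compatible client returns), so you'd swap it out for your actual API call:

```python
import time

def measure_ttft(stream):
    """Return (ttft_seconds, chunks) for any iterable of token chunks.

    TTFT = elapsed time from iteration start until the first chunk
    arrives; run the same prompt against each provider to compare.
    """
    start = time.perf_counter()
    ttft = None
    chunks = []
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start
        chunks.append(chunk)
    return ttft, chunks

# Simulated stream standing in for a real streaming API response.
def fake_stream(first_delay=0.05, n=5):
    time.sleep(first_delay)  # simulated time-to-first-token
    for i in range(n):
        yield f"tok{i}"

ttft, toks = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms over {len(toks)} chunks")
```

Run it a few dozen times per provider and look at p95 rather than the mean; latency spikes like the ones described won't show up in averages.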
Originally posted by u/oh_kayeee on r/ArtificialInteligence
