Dramatically decreased performance with Llama-4 Maverick 17B 128E

shanif · October 29, 2025, 5:46pm

Hi there. Since Oct. 22 I have been getting dramatically worse response times from Llama 4, to the point where my code has to time out requests after waiting up to 3 minutes. This has gotten worse in the past few days.

Is anyone else seeing the same issues?

Coby · October 29, 2025, 5:51pm

@shanif can you please provide a specific day and time window for both good and bad performance so that we can pull logs? If you have specific request IDs for a good and a bad performance example, those would be very useful as well.
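For anyone gathering this kind of data, here is a minimal sketch of one way to record a UTC start time, the elapsed time, and the outcome of each request on the client side. It uses only the Python standard library. `timed_call` and `send_request` are hypothetical names, not part of any SDK: `send_request` stands in for whatever function your code already uses to call the model. If your responses include a request ID, it would be worth logging that alongside these fields.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from datetime import datetime, timezone


def timed_call(send_request, timeout_s=180.0):
    """Run send_request() and record its start time, duration, and outcome.

    send_request is a placeholder for your own zero-argument function
    that performs one API call and returns the response.
    """
    started_at = datetime.now(timezone.utc).isoformat()
    t0 = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(send_request)
    try:
        # Stop waiting after timeout_s seconds (3 minutes by default).
        result = future.result(timeout=timeout_s)
        status = "ok"
    except FutureTimeout:
        result, status = None, "timeout"
    except Exception as exc:
        result, status = None, f"error: {exc!r}"
    finally:
        # Do not block on a request that is still in flight.
        pool.shutdown(wait=False)
    return {
        "started_at_utc": started_at,
        "elapsed_s": round(time.monotonic() - t0, 3),
        "status": status,
        "result": result,
    }
```

Writing one such record per request to a log file gives a list of timestamps for fast and slow calls that can be shared when reporting the issue. Note that the timeout only stops the caller from waiting; the underlying request keeps running in its worker thread until it finishes.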

@omkar.gangan please assist on this one.

-Coby

Coby · October 29, 2025, 11:20pm

@shanif can you test again, please? There were some queue modifications to favor enterprise accounts. We have made some slight adjustments, which should make it a bit better for non-enterprise accounts.

-Coby

shanif · October 30, 2025, 2:53pm

Will take a look and report back. Appreciate the reply.

shanif · October 30, 2025, 4:07pm

Unfortunately, this still seems to be a problem, at least on my end. I just had another request time out.

I appreciate your assistance in resolving the issue. SambaNova has been my primary LLM provider for the past 6 months. I’ve had to switch to Groq, but I would love to come back, as I found your models’ answers to be more accurate.

shanif · November 1, 2025, 2:18pm

This seems to be resolved now. Thanks for handling!