Hello everyone, I just tried the inference and I’m really impressed by the speed—it’s nearly instant. However, the lack of detailed rate limits is a bit challenging for me, as it makes it hard to gauge whether the endpoint can scale when analyzing datasets. Would it be possible to get some basic information on the rate limits for the current tiers, as well as the upcoming ones? Terms like “low” or “very low” are a bit unclear. I understand these are new products, but having more precise information, especially for the developer tier (which could be the most relevant for me in the future), would be very helpful. Thanks in advance!
Welcome to our community. We appreciate you using our offerings and taking the time to provide valuable feedback.
We understand your frustration and your requirements. Clarification is coming soon, and I will update this thread with the proper pointers as soon as we publish them.
Thank you for your patience,
-Coby
@coby.adams please update. I have been looking for a month.
“low-rate limits” is very subjective and not quantitative.
I have a workload that summarizes around 50 news items, and I'm still unsure whether Samba will handle it or return 429 errors.
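Until exact numbers are available, one common way to run a small batch like this is to retry any request that gets a 429 with exponential backoff. Below is a minimal sketch of that idea, not anything specific to this service: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status, body)` pair, and `call_with_backoff`, `RateLimited`, and the delay values are illustrative assumptions.

```python
import random
import time


class RateLimited(Exception):
    """Raised when every attempt was answered with HTTP 429."""


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out.

    send is a hypothetical callable returning (status, body); wrap your own
    HTTP client call in it. sleep is injectable so the logic can be tested.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            # Exponential backoff plus jitter: the wait roughly doubles
            # after each consecutive 429 response.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise RateLimited(f"still rate limited after {max_retries} retries")
```

For the 50-item case, each summarization request would be wrapped in its own `send` callable and passed through `call_with_backoff` one at a time. If the 429 response carries a `Retry-After` header, honoring that value instead of the computed delay is usually the better choice.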
@nikhilswami1 we published the rate limits 20 days ago. I am sorry that I did not come back and link them.
-Coby