Hello everyone, I am using SambaNova Fast Api.
Does fast api provide the prompt caching?
If yes, could you let me know how I can implement?
Thank you.
Technically, there’re many ways, most http clients provide option for caching request based on body. I feel, like ideally Sambanova is intended for projects, where you have to utilize fast but very dissimilar prompts.
You can also use proxy, like nginx or manually store responses into KV, memory or Vector storage
@suslovwebhero and @gothesopre Prompt caching is indeed on the road map but we do not have an ETA as of yet.
-Coby
hello, is there any update regarding prompt caching on SambaNova ?
This would be really helpful. Waiting for this
Please add support for this - if you have to pay 100% of for input tokens, sambanova cannot be used for any agentic flows with tool calling. Tool calling is the future (and present)