Thank you for the quick reply! I’m thrilled to hear about the planned context window expansion and potential for prompt caching with the RDU architecture.
Regarding my use case: I extensively use Cline with Claude 3.7 Sonnet and o3-mini for Golang development. Cline adds significant context from my editor, resulting in a 10:1 to 15:1 ratio of cache reads to fresh input tokens with OpenAI. On busy days, this has translated to millions of tokens served as cache reads, making prompt caching extremely valuable for both cost savings and reduced latency - the latter being particularly important on obsolete GPU-based platforms that lack the inherent performance advantages of your RDU-based platform.
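To put rough numbers on that ratio, here is a minimal Go sketch of the input-side savings. It assumes cached reads are billed at some fraction of the fresh input price; the 0.5 used below is a placeholder assumption, not a quoted price from any provider, and `savingsFraction` is just a name I made up for illustration.

```go
package main

import "fmt"

// savingsFraction estimates the share of input spend saved by prompt caching.
// ratio is cache-read tokens per fresh input token (e.g. 10 for 10:1).
// cachedFraction is the ASSUMED price of a cached token relative to a fresh
// one (0.5 means cached reads cost half as much); it is a placeholder here.
func savingsFraction(ratio, cachedFraction float64) float64 {
	// Without caching, every token is billed at full price: cost ∝ 1 + ratio.
	// With caching, cached tokens are discounted: cost ∝ 1 + ratio*cachedFraction.
	return 1 - (1+ratio*cachedFraction)/(1+ratio)
}

func main() {
	for _, r := range []float64{10, 15} {
		fmt.Printf("ratio %.0f:1 -> %.1f%% of input spend saved\n",
			r, 100*savingsFraction(r, 0.5))
	}
}
```

The point of the sketch is that at these ratios the savings sit close to the cached-token discount itself, so the discount level matters far more than whether a given day lands at 10:1 or 15:1.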
My project is sensitive in nature (potential trade secrets), so I’m closely following your privacy thread. How my prompts and responses will be used is critical: if they’re stored or used by SambaNova or SambaNova’s partners for purposes beyond simply ingesting my prompts and serving responses, I’d need to restrict which code components I work on via your API. I understand that in some cases, prompts flagged for potential abuse or AUP violations may be retained for investigative purposes. This is not a dealbreaker for me with Anthropic or OpenAI, and this type of clause would not be a dealbreaker for me with SambaNova either. I’m not planning on doing anything naughty, after all.
If SambaNova can maintain current (or close to current) DeepSeek-V3-0324 pricing while expanding to 128k context, adding prompt caching, and ensuring strong data privacy, your platform’s extreme hardware advantages paired with the inclusion of these important API features would result in an unmatched value proposition for my use case. I’d be eager to become both a customer and advocate if that’s where SambaNova is headed.
P.S. The technical superiority of SambaNova’s RDU is truly remarkable - this brilliant design seems to outclass not only traditional GPUs but also other ‘Post-GPU’ architectures. From what I understand, you’re achieving with a single rack what would take Groq 9 racks or Cerebras 4 racks to match. Kudos once again to the whole SambaNova team for this truly innovative engineering feat!