Hello again, SambaNova community!
I am requesting the addition of a new model: DeepCoder.
Ollama describes it as “DeepCoder is a fully open-source 14B coder model at O3-mini level, with a 1.5B version also available.”
The following chart shows that it does indeed appear to be competitive with o3-mini at low reasoning effort:
I have been testing the 14B model locally on my RTX 4060 Ti 16GB, which fits the model by itself (9.0GB) comfortably enough. The coding capabilities do appear to be on par with o3-mini on low reasoning effort, and it gets even better: this model is FANTASTIC at tool calling with Cline as well! Woohoo! This is the exact kind of model I have been eagerly waiting for.
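In case anyone wants to reproduce my setup, here is a minimal sketch of one way to query the locally served model through Ollama's REST API (assuming Ollama's default port 11434 and the `deepcoder:14b` tag from the Ollama library; the prompt is just a stand-in):

```python
import requests

# Minimal sketch: query the locally served model via Ollama's REST API
# (default port 11434). The prompt below is only a stand-in example.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepcoder:14b",  # tag as published in the Ollama library
        "prompt": "Write a Python function that reverses a linked list.",
        "stream": False,           # one JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```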
Unfortunately, using a larger context window pushes memory requirements far beyond what my GPU can handle: a 100k context window adds 31.0GB of memory usage, resulting in a 63%/37% split between CPU+RAM and GPU+VRAM (Ollama handles this hand-off automatically). With my Ryzen 5 3600X and DDR4 RAM doing much of the heavy lifting, throughput becomes painfully slow, as you can see in this animated GIF (recorded at 5 fps with no speed modifications; this is real-time performance):
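For reference, Ollama exposes the context length as the `num_ctx` option, so a sketch along these lines should reproduce the offload behaviour (the 100k value matches my test above; the prompt is a placeholder):

```python
import requests

# Minimal sketch: raising num_ctx grows the KV cache, which is largely what
# adds the ~31 GB at a 100k window and forces Ollama's automatic CPU+RAM
# offload on a 16 GB GPU. Prompt and value are placeholders for illustration.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepcoder:14b",
        "prompt": "Summarize the design of this codebase ...",  # placeholder
        "options": {"num_ctx": 100_000},  # context window, in tokens
        "stream": False,
    },
    timeout=3600,  # generous timeout: CPU-offloaded decoding is slow
)
resp.raise_for_status()
print(resp.json()["response"])
```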
This appears to be the EXACT kind of model that SambaNova can turn into a true gem for my needs: a tiny open model with huge performance that is simply too slow to run on my hardware. Help me, SambaNova: bring this DeepCoder model to life on your platform!
References: