DeepSeek-V3.1 Is Live on SambaCloud!

Just last week, DeepSeek dropped their latest model: DeepSeek-V3.1. Launching today, DeepSeek-V3.1 is now available for developers on SambaCloud — running over 200 tokens/second!


For those interested in the Python version, you can use the `extra_body` parameter to swap between thinking and non-thinking modes in the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI()  # configure base_url / api_key for SambaCloud as usual

response = client.chat.completions.create(
    model="DeepSeek-V3.1",
    messages=[
        {"role": "system", "content": "You are a helpful code generator. Provide only code without explanation unless requested."},
        {"role": "user", "content": "Hello World"},
    ],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
```
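If you switch modes often, the request kwargs can be built from a flag instead of duplicating the call site. A small sketch — `build_request_kwargs` is a hypothetical helper, not part of the OpenAI SDK:

```python
def build_request_kwargs(messages, thinking: bool, model: str = "DeepSeek-V3.1") -> dict:
    """Assemble kwargs for client.chat.completions.create(),
    toggling DeepSeek-V3.1 thinking via chat_template_kwargs."""
    return {
        "model": model,
        "messages": messages,
        # extra_body is passed through verbatim to the serving backend
        "extra_body": {"chat_template_kwargs": {"enable_thinking": thinking}},
    }
```

Then call it with `client.chat.completions.create(**build_request_kwargs(messages, thinking=True))`.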

Working great for standard chat completions for me, but for tool calling I get a lot of responses that differ from DeepSeek V3. For example, with reasoning set to false, the response format seems different:

DeepSeek V3:

```json
{"choices": [{"delta": {"tool_calls": [{"function": {"name": "greeting", "arguments": "{\"message\": \"Hello!\"}"}}]}}]}
```

DeepSeek V3.1:

```json
{"choices": [{"delta": {"content": "{\n  \"name\": \"greet_user\",\n  \"parameters\": {\n    \"name\": \"David\",\n    \"location\": \"Sydney\"\n  }\n}"}}]}
```

I'm also getting cases where it mixes tool calls with content, and sometimes adds garbage characters after the JSON.

Is anybody else seeing this? It doesn't seem like a drop-in replacement for V3.
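Until a fix lands, one client-side workaround is to normalize both shapes — the structured `tool_calls` field and tool-call JSON emitted as plain `content`. A sketch, assuming the two delta formats shown above (`extract_tool_call` is a hypothetical helper, not an SDK API):

```python
import json

def extract_tool_call(delta: dict):
    """Normalize a streamed delta into (name, arguments), or None.

    Handles both shapes seen in this thread: V3's structured
    tool_calls entry, and V3.1's tool-call JSON embedded in content.
    """
    # V3-style: structured tool_calls entry
    for call in delta.get("tool_calls") or []:
        fn = call.get("function", {})
        if fn.get("name"):
            return fn["name"], json.loads(fn.get("arguments") or "{}")

    # V3.1-style: a JSON object embedded in the content string;
    # slicing to the outermost braces also trims trailing garbage
    content = delta.get("content") or ""
    start, end = content.find("{"), content.rfind("}")
    if start != -1 and end > start:
        try:
            obj = json.loads(content[start : end + 1])
        except json.JSONDecodeError:
            return None
        if "name" in obj:
            return obj["name"], obj.get("parameters") or obj.get("arguments") or {}
    return None
```

This only papers over the mismatch for single complete tool calls per delta; deltas that interleave prose with partial JSON would still need buffering across chunks.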

Looks like the team is working on fixes to the deployment, which should hopefully help here.

Got it working; details here: DeepSeek-V3.1 - An approach for thinking and tool use. Also confirming that it's a 32k context window.
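Given the 32k context window, it's worth guarding request size client-side before sending. A rough sketch, assuming the common ~4 characters-per-token heuristic rather than an exact tokenizer (both helpers below are hypothetical):

```python
MAX_CONTEXT_TOKENS = 32_000  # context window confirmed above

def rough_token_count(messages, chars_per_token: int = 4) -> int:
    """Very rough estimate: total message characters / chars_per_token.
    For an exact count, run the model's own tokenizer instead."""
    chars = sum(len(m.get("content", "")) for m in messages)
    return chars // chars_per_token

def fits_context(messages, reserve_for_output: int = 4_000) -> bool:
    # Leave headroom for the model's reply (and any thinking tokens)
    return rough_token_count(messages) <= MAX_CONTEXT_TOKENS - reserve_for_output
```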
