Just last week, DeepSeek dropped their latest model: DeepSeek-V3.1. Launching today, DeepSeek-V3.1 is now available for developers on SambaCloud, running at over 200 tokens/second!
For those interested in the Python version, you can use the extra_body parameter to swap between thinking and non-thinking mode in the OpenAI SDK:
response = client.chat.completions.create(
    model="DeepSeek-V3.1",
    messages=[
        {"role": "system", "content": "You are a helpful code generator. Provide only code without explanation unless requested."},
        {"role": "user", "content": "Hello World"},
    ],
    # enable_thinking=False requests the non-thinking chat template
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
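For completeness, here's a self-contained version of that snippet with the client setup included. The base URL and environment variable below are my assumptions; use whatever endpoint and key your SambaCloud account gives you.

import os
from openai import OpenAI

# OpenAI-compatible client pointed at SambaCloud
# (base URL is an assumption; copy the endpoint from your SambaCloud dashboard)
client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key=os.environ["SAMBANOVA_API_KEY"],
)

def chat(messages, thinking=False):
    # chat_template_kwargs switches DeepSeek-V3.1 between thinking and non-thinking mode
    return client.chat.completions.create(
        model="DeepSeek-V3.1",
        messages=messages,
        extra_body={"chat_template_kwargs": {"enable_thinking": thinking}},
    )

# Non-thinking mode gives a direct answer; pass thinking=True for the reasoning variant
print(chat([{"role": "user", "content": "Hello World"}]).choices[0].message.content)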
It's working great for standard chat completions for me, but for tool calling I get a lot of responses that differ from DeepSeek-V3. For example, with reasoning set to false, the response format seems different:
DeepSeek-V3:
{"choices": [{"delta": {"tool_calls": [{"function": {"name": "greeting", "arguments": "{\"message\": \"Hello!\"}"}}]}}]}
DeepSeek-V3.1:
{"choices": [{"delta": {"content": "{\n \"name\": \"greet_user\",\n \"parameters\": {\n \"name\": \"David\",\n \"location\": \"Sydney\"\n }\n}"}}]}
I'm also getting cases where it mixes tool calls with content, and it sometimes adds garbage characters after the JSON.
Anybody else seeing this? It doesn't seem like a drop-in replacement for V3.
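In the meantime, here's the rough stopgap I'm using; it's my own sketch, not an official workaround. It accepts either shape from the examples above (a proper delta.tool_calls entry, or a JSON object dumped into content) and ignores trailing garbage after the closing brace. It assumes the delta has been converted to a plain dict (e.g. chunk.choices[0].delta.model_dump()) and that the arguments string is already complete rather than half-streamed.

import json

def extract_tool_call(delta):
    """Best-effort extraction of (name, args) from either response shape, else None."""
    # Shape 1: a proper delta.tool_calls entry (what DeepSeek-V3 returned)
    for call in delta.get("tool_calls") or []:
        fn = call.get("function") or {}
        if fn.get("name"):
            return fn["name"], json.loads(fn.get("arguments") or "{}")
    # Shape 2: a raw JSON object inside delta.content (what V3.1 sometimes returns);
    # raw_decode stops at the end of the first JSON value, so trailing garbage is ignored
    content = delta.get("content") or ""
    start = content.find("{")
    if start != -1:
        try:
            obj, _ = json.JSONDecoder().raw_decode(content[start:])
            # assuming the model emits {"name": ..., "parameters": {...}} as in the example above
            if isinstance(obj, dict) and "name" in obj:
                return obj["name"], obj.get("parameters", {})
        except json.JSONDecodeError:
            pass
    return None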
Looks like the team is working on fixes to the deployment, which should hopefully help here.
Got it working. Details here: DeepSeek-V3.1 - An approach for thinking and tool use. Also confirming that it's a 32k context window.