Since the SambaNova API for OpenAI compatibility doesn't support response_format, I would like to check how to obtain the output in JSON format, as it is critical for our post-processing after performing a chat completion. Could you suggest any best practices for retrieving the output in JSON format in this case?
By instructing the model in the prompt to generate responses in JSON format, I was able to successfully receive the output in the desired JSON structure using the following code.
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key="YOUR_API_KEY"
)

# Define the system prompt that asks for JSON output
prompt = '''Provide the information in valid JSON format with no extra text. Ensure it matches this structure exactly:
{
    "name": "string",
    "age": "integer",
    "location": "string"
}'''

query = "My name is Omkar. I am from India and 26 years old."

response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": query}
    ]
)

print("Query:", query)
# Parse the model's reply text into a Python dict
json_output = json.loads(response.choices[0].message.content)
print("Response:", json_output)
Output:
Query: My name is Omkar. I am from India and 26 years old.
Response: {'name': 'Omkar', 'age': 26, 'location': 'India'}
Query: My name is Jack. I am from USA and 34 years old.
Response: {'name': 'Jack', 'age': 34, 'location': 'USA'}
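For the post-processing mentioned in the question, the parsed result is a regular Python dict, so the fields can be read directly. A small usage sketch continuing from the code above:

# json_output is a plain Python dict after json.loads()
name = json_output["name"]          # e.g. "Omkar"
age = json_output["age"]            # e.g. 26
location = json_output["location"]  # e.g. "India"
print(f"{name} ({age}) lives in {location}")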
Hey @yuzhe! Are you using the Python OpenAI package to do this? If I'm understanding you correctly, you want a way to get the response from the LLM in JSON format, correct? If so, you can do it in the following way:
- Make a call to the specified LLM and save the result to a variable. That variable will be a ChatCompletion object.
- You can then call .to_json() on this ChatCompletion object.
- Lastly, read the JSON payload with json.loads() to save the JSON object itself to a variable.
Here is an example of how I did this:
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key="YOUR API KEY"
)

prompt = "You are a helpful AI assistant"
query = "Write me a script to reverse a python list"

response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": query}
    ]
)

# Convert the whole ChatCompletion object to a JSON string, then parse it
json_object = json.loads(response.to_json())
print(json_object)
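For reference, the json_object produced this way is the whole ChatCompletion envelope rather than the model's text. Assuming the standard OpenAI response schema, the message text itself sits under choices; a short sketch:

# The parsed payload mirrors the ChatCompletion object, with keys such as
# "id", "model", "choices", and "usage" (standard OpenAI schema).
content = json_object["choices"][0]["message"]["content"]
print(content)  # the assistant's reply text, still a plain string here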
Let me know if this helps solve your problem! Otherwise, I am happy to continue helping you out here!
Thank you for your suggestion! Your prompt has been really helpful in preventing the model from generating extra text. However, I still encounter ```json at the beginning and ``` at the end of the content, likely due to the nested JSON structure of my required output format. To handle this, I've been using response.choices[0].message.content.replace("json", "").replace("```", "").strip() to clean it up. After that, I apply json.loads(), and it works now.
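If it helps, that cleanup can be wrapped in a small helper before json.loads() is applied. A sketch of the same idea, continuing from the earlier snippet (strip_json_fences is just an illustrative name; it removes the literal ```json marker rather than every "json" substring, to avoid touching the payload):

import json

def strip_json_fences(text: str) -> str:
    # Remove a leading ```json / trailing ``` fence if the model added one
    return text.replace("```json", "").replace("```", "").strip()

raw = response.choices[0].message.content
json_output = json.loads(strip_json_fences(raw))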
Hi @connor.mccormick, thanks for your advice! In my case, since the content has extra strings, the "content" cannot be converted to JSON directly, so this method doesn't work for me. I hope SambaNova considers integrating response_format for OpenAI compatibility in the future; it would be quite helpful for developers!
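For reference, the standard OpenAI parameter being requested looks like the snippet below; per this thread it is not currently accepted by the SambaNova endpoint, so this only sketches what the hoped-for support would look like:

# Standard OpenAI-style JSON mode (not currently supported on the SambaNova endpoint)
response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply in valid JSON."},
        {"role": "user", "content": query}
    ]
)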
Hi @connor.mccormick, thank you for the suggestion, but it doesn't solve my issue. response.to_json() converts the model's response structure into JSON, but it does not convert the actual content of the message into JSON format. What I need is to transform the message content itself into JSON format. See one output example below.
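In other words, the two calls operate on different layers. A generic sketch of the distinction, continuing from the earlier snippet (values made up for illustration, not the output example referenced above):

# response.to_json() -> the whole API envelope as a JSON string, e.g.
#   {"id": "...", "model": "...", "choices": [{"message": {"content": "..."}}], ...}
envelope = json.loads(response.to_json())

# response.choices[0].message.content -> only the model's text, which is
# a JSON document only if the model was prompted to produce one
content = response.choices[0].message.content
data = json.loads(content)  # fails if the text includes extra prose or fences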
I see! Thank you for providing the clarification. In that case, the suggestion from @omkar.gangan is going to be your best bet. If you need the actual tokens returned by the model to be a JSON string themselves, then I would condition the model with the system prompt to respond in valid JSON format. Here is a link to his response for convenience.
Let me know if this helps! If you are still running into issues, we can continue to work together to solve your problem!
I use JSON output in my prompts a lot. Generally, what works best is if I also add the following sentence to the prompt: output the json object only, without any commentary.
This reduces the errors dramatically.
Smaller models sometimes return invalid structures, but those can also be fixed with a reflection prompt, where one call generates the JSON and another validates the JSON structure.
I hope this helps.
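A minimal sketch of that generate-then-validate loop, assuming the same client as in the earlier snippets (the function name and retry prompt wording are illustrative):

import json

def generate_valid_json(client, model, system_prompt, query, max_retries=2):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": query},
    ]
    for _ in range(max_retries + 1):
        reply = client.chat.completions.create(model=model, messages=messages)
        text = reply.choices[0].message.content.strip()
        try:
            return json.loads(text)  # valid JSON: we are done
        except json.JSONDecodeError as err:
            # Reflection step: feed the invalid output back and ask the model to fix it
            messages.append({"role": "assistant", "content": text})
            messages.append({"role": "user",
                             "content": f"The previous output was not valid JSON ({err}). "
                                        "Output the json object only, without any commentary."})
    raise ValueError("Model did not return valid JSON after retries")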