Hello everyone,
I’m truly enthusiastic about your product; the inference speed and the ability to host an extremely large model like Llama 405B are absolutely stellar!
I’m currently developing a platform that lets users generate stories for children’s books. The platform validates both input and output using fast, lightweight LLMs. Right now I’m running Llama 3.1 70B on Groq, but I’d like to test Llama 3.1 405B on SambaNova. However, it seems that SambaNova doesn’t support JSON mode at this time.
While I can prompt for JSON-structured output, the results aren’t consistent enough for application development. Sometimes the output starts as plain JSON; other times it includes leading text and special characters, or it drifts from the correct JSON structure partway through. For example, I’m requesting a JSON array of 17 objects, each containing two keys, like this:
[
  {
    "id": 0,
    "content": "this content part is usually around 3-4 sentences"
  },
  {
    "id": 1,
    "content": "this content part is usually around 3-4 sentences"
  }
]
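On my side, the expected shape is validated roughly like this (a minimal sketch assuming pydantic v2; the class names are illustrative, not from my actual codebase):

```python
from pydantic import BaseModel, RootModel

class StoryPart(BaseModel):
    id: int
    content: str  # usually around 3-4 sentences

# The payload is a bare JSON array, so a RootModel wraps the list.
class StoryParts(RootModel[list[StoryPart]]):
    pass

# Raises a ValidationError whenever the completion deviates from the schema:
# parts = StoryParts.model_validate_json(raw_completion)
```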
I could implement custom parsing to coerce the output into the desired format, but that would require significant refactoring. Ideally, I’d prefer to swap models for testing without making those changes.
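For context, the workaround I keep reaching for looks roughly like this (a minimal sketch; the function name is just illustrative):

```python
import json

def extract_json_array(raw: str) -> list:
    """Best-effort recovery of a JSON array from noisy model output.

    Slices from the first '[' to the last ']' to drop leading text and
    trailing chatter, then parses the remainder.
    """
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON array found in model output")
    return json.loads(raw[start : end + 1])
```

Even then, a completion that truncates mid-array or emits malformed JSON inside the brackets slips past this, which is why a server-side JSON mode would be much more robust.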
Do you have any suggestions for a solid workaround? The output structure is too unstable for reliable custom parsing. Also, is there a roadmap or a target date for structured output / JSON mode support?
Thank you!