Hey team,
I'm writing an app that uses the Llama vision model (Llama-3.2-90B-Vision-Instruct) to return structured text from an image, but I seem to be hitting a limit on output token length: the response below stops with stop_reason "length" at 4,096 total tokens.
Are there any plans to increase this?
Best,
Matt
{
  "model": "Llama-3.2-90B-Vision-Instruct",
  "object": "chat.completion",
  "system_fingerprint": "fastcoe",
  "usage": {
    "completion_tokens": 4059,
    "completion_tokens_after_first_per_sec": 105.0270045726293,
    "completion_tokens_after_first_per_sec_first_ten": 103.96115165272784,
    "completion_tokens_per_sec": 103.6753340246086,
    "end_time": 1740521727.4034967,
    "is_last_response": true,
    "prompt_tokens": 37,
    "start_time": 1740521688.2524292,
    "stop_reason": "length",
    "time_to_first_token": 0.513385534286499,
    "total_latency": 39.15106749534607,
    "total_tokens": 4096,
    "total_tokens_per_sec": 104.62039126996719
  }
}
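In case it helps, here's roughly what my call looks like, plus the kind of continuation loop I'd fall back on while the cap is in place. This is just a minimal sketch assuming the OpenAI-compatible chat completions API via the openai Python client; the base URL, API key, file name, and prompts are placeholders rather than my actual app code.

import base64
from openai import OpenAI

# Sketch only -- assumes an OpenAI-compatible chat completions endpoint.
# BASE_URL and API_KEY are placeholders.
BASE_URL = "https://example.invalid/v1"
API_KEY = "sk-..."

client = OpenAI(api_key=API_KEY, base_url=BASE_URL)

# Encode the input image for the vision message format.
with open("page.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Return the text in this image as structured JSON."},
        {"type": "image_url",
         "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
    ],
}]

# Accumulate output; if the model stops because it hit the token cap,
# ask it to continue from where it left off.
parts = []
for _ in range(4):  # cap the number of continuation rounds
    resp = client.chat.completions.create(
        model="Llama-3.2-90B-Vision-Instruct",
        messages=messages,
        max_tokens=4096,
    )
    choice = resp.choices[0]
    parts.append(choice.message.content or "")
    if choice.finish_reason != "length":
        break
    # Feed the partial answer back and request the remainder.
    messages.append({"role": "assistant", "content": choice.message.content})
    messages.append({"role": "user", "content": "Continue exactly where you left off."})

full_output = "".join(parts)
print(full_output)

The continuation approach isn't ideal for structured output, since the model doesn't always resume mid-JSON cleanly, so a higher output limit (or guidance on the recommended way to handle long extractions, e.g. splitting the image or prompt) would be much appreciated.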