We also need to support non-streaming completion in the API. Currently, if stream=True is not specified, the API returns error 400:
code:
completion = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "user", "content": prompt}
    ],
    # stream=True,
    **kwargs,
)
stacktrace:
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'code': None, 'message': 'Current API supports stream mode. Add "stream": true in the payload', 'param': None, 'type': 'unsupported_config_error'}
This may break existing agent applications that expect the standard non-streaming OpenAI API behavior, and it degrades the user experience.
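Until non-streaming mode is supported server-side, affected applications could work around the limitation by requesting a stream and assembling the chunks into a single completion string. The sketch below shows the idea; `collect_stream` is a hypothetical helper, not part of any SDK, and the fake chunk objects stand in for what the openai client yields when stream=True.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Concatenate the content deltas of a chat-completion stream."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if delta.content:  # some chunks (e.g. the final one) carry no content
            parts.append(delta.content)
    return "".join(parts)

# Fake chunks standing in for the objects the SDK yields with stream=True:
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ("Hello", ", ", "world", None)
]
print(collect_stream(fake))  # Hello, world
```

In a real application, the same helper would be called on the iterator returned by `client.chat.completions.create(..., stream=True)`. This is only a client-side stopgap; fixing the server to accept stream=False remains the proper solution.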