How can I control these LLM parameters when using the Fast API?
I can only set temperature, top_p, and stop. I would also like to control the parameters below (a minimal sketch of my current call follows the list).
max_tokens
min_p
typical_p
frequency_penalty
presence_penalty
repeat_penalty
top_k
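For reference, here is a minimal sketch of the call that works for me today, assuming the OpenAI-compatible Python client; only temperature, top_p, and stop appear to be accepted. The stop sequence is a hypothetical value for illustration.

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

# Only these sampling parameters seem to take effect right now.
response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.1,
    top_p=0.1,
    stop=["###"],  # hypothetical stop sequence, for illustration only
)
print(response.choices[0].message.content)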
Best Regards
I understand your requirement for those parameters.
You can already set max_tokens:
import os
import openai

# Client pointed at the SambaNova OpenAI-compatible endpoint.
client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    temperature=0.1,
    top_p=0.1,
    max_tokens=5,  # caps the completion at 5 tokens
)
print(response.choices[0].message.content)
Output (truncated at 5 tokens):

Hello! It's nice
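For the parameters that are not first-class keyword arguments, one possible workaround is the OpenAI Python SDK's extra_body argument, which injects additional fields into the request JSON. This is a sketch only: whether the SambaNova endpoint accepts or honors top_k, min_p, typical_p, or repeat_penalty passed this way is an assumption to verify, and unsupported fields may be rejected or silently ignored. frequency_penalty and presence_penalty are standard Chat Completions arguments in the SDK, though the backend still has to support them.

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

# frequency_penalty and presence_penalty are standard SDK arguments;
# top_k, min_p, typical_p, and repeat_penalty are not, so they are
# passed via extra_body. Server-side support is unverified -- check
# the response or the API docs before relying on these.
response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    max_tokens=5,
    frequency_penalty=0.5,  # standard OpenAI parameter
    presence_penalty=0.5,   # standard OpenAI parameter
    extra_body={            # assumed, not confirmed, to be honored
        "top_k": 40,
        "min_p": 0.05,
        "typical_p": 0.9,
        "repeat_penalty": 1.1,
    },
)
print(response.choices[0].message.content)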
I will file an enhancement request for the other parameters and let you know the outcome.
Thank you.
I hope to be able to control the other parameters as soon as possible.