How to control LLM parameters when using the Fast API?

How can I control these LLM parameters when using the Fast API? I can only set temperature, top_p, and stop (a minimal sketch of what currently works follows the list below).

max_tokens
min_p
typical_p
frequency_penalty
presence_penalty
repeat_penalty
top_k
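
For context, this is the kind of request I can make today; the parameters listed above are the ones I cannot pass (the endpoint and model name are just examples):

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,   # accepted
    top_p=0.9,         # accepted
    stop=["\n\n"],     # accepted; example stop sequence
    # top_k=40,        # parameters like this are what I want to set
)

print(response.choices[0].message.content)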

Best Regards


@suslovwebhero

I understand your requirement for those parameters.

You can already set max_tokens:

import os
import openai

# The SambaNova endpoint is OpenAI-compatible, so the standard client works.
client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    temperature=0.1,
    top_p=0.1,
    max_tokens=5,  # caps the completion at 5 tokens
)

print(response.choices[0].message.content)

>>> print(response.choices[0].message.content)
Hello! It's nice

Note that the reply is cut off after 5 tokens, as requested by max_tokens.

I will file an enhancement request for the other parameters and let you know the outcome.
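
In the meantime, one thing you could try: the OpenAI Python client can forward non-standard fields through extra_body (a documented feature of the client itself). Whether the SambaNova backend honors, ignores, or rejects these fields is an assumption you would need to verify, and the field names below are hypothetical:

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

# extra_body merges these fields into the request JSON; the backend may
# ignore or reject names it does not recognize.
response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=5,
    extra_body={
        "top_k": 40,            # hypothetical server-side name
        "repeat_penalty": 1.1,  # hypothetical server-side name
    },
)
print(response.choices[0].message.content)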

Thank you.
I hope to be able to control the other parameters soon.