How to control LLM parameters when using the Fast API?

How can I control these LLM parameters when using the Fast API? I can only set temperature, top_p, and stop (a minimal sketch of what currently works follows the list below).

max_tokens
min_p
typical_p
frequency_penalty
presence_penalty
repeat_penalty
top_k
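
For context, this is the kind of request I can make today; the parameters listed above are the ones I cannot pass (the endpoint and model name are just examples):

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,   # accepted
    top_p=0.9,         # accepted
    stop=["\n\n"],     # accepted; example stop sequence
    # top_k=40,        # parameters like this are what I want to set
)

print(response.choices[0].message.content)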

Best Regards


@suslovwebhero

I understand your requirement for those parameters.

You can already set max_tokens:

import os
import openai

# The SambaNova endpoint is OpenAI-compatible, so the standard client works.
client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    temperature=0.1,
    top_p=0.1,
    max_tokens=5,  # caps the completion at 5 tokens
)

print(response.choices[0].message.content)

>>> print(response.choices[0].message.content)
Hello! It's nice

Note that the reply is cut off after 5 tokens, as requested by max_tokens.

I will file an enhancement request for the other parameters and let you know the outcome.
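
In the meantime, one thing you could try: the OpenAI Python client can forward non-standard fields through extra_body (a documented feature of the client itself). Whether the SambaNova backend honors, ignores, or rejects these fields is an assumption you would need to verify, and the field names below are hypothetical:

import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)

# extra_body merges these fields into the request JSON; the backend may
# ignore or reject names it does not recognize.
response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=5,
    extra_body={
        "top_k": 40,            # hypothetical server-side name
        "repeat_penalty": 1.1,  # hypothetical server-side name
    },
)
print(response.choices[0].message.content)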

Thank you.
I hope to be able to control the other parameters soon.