Meta Llama 3.3 model

Happy Friday!

I recently saw the post about the release of Llama 3.3.
Any news on this version?

I'm curious about the token speed and real-world results compared to the 405B model.

Also, do you know of any benchmarking method that would help compare the results?

Meta says the 70B model matches the results of the 3.1 405B version.
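For token speed specifically, maybe something as simple as timing one request and dividing by the reported completion tokens would be enough. A rough sketch of what I have in mind (OpenAI-compatible client; endpoint and key are placeholders):

```python
# Rough tokens/sec estimate: time a single completion and divide the
# reported completion tokens by the elapsed wall-clock time.
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                  # placeholder key
)

start = time.perf_counter()
resp = client.chat.completions.create(
    model="Meta-Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Explain speculative decoding in one paragraph."}],
)
elapsed = time.perf_counter() - start

print(f"{resp.usage.completion_tokens / elapsed:.1f} tokens/sec")
```

Running the same prompt against the 3.1 405B model ID would give a like-for-like comparison.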

Thanks, Laszlo


@hello1

3.3 is available.


Thanks, Coby and the whole team! I'm starting to test my code updates right now!

Is it possible to get beta access to the 64/128k context?

I faced this error:
Error code: 503 - {'error': {'code': None, 'message': 'Meta-Llama-3.3-70B-Instruct-16k is temporarily unavailable. Please try again later!', 'param': None, 'type': ''}}

I called the Meta-Llama-3.3-70B-Instruct model with my code.
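For reference, the call itself is nothing unusual; roughly this (OpenAI-compatible client, placeholder key):

```python
# Minimal sketch of the call that triggered the 503 above; note the
# model ID is passed without any -16k suffix.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key="YOUR_API_KEY",  # placeholder
)

resp = client.chat.completions.create(
    model="Meta-Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```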

Hi @hello1, the Meta-Llama-3.3-70B model currently supports a maximum context length of 4096 tokens.

Thanks & Regards

Thanks for the information.

I also got the same error as @hello1 just now on Dec 13: Error code: 503 - {'error': {'code': None, 'message': 'Meta-Llama-3.3-70B-Instruct-8k is temporarily unavailable. Please try again later!', 'param': None, 'type': ''}}. I wanted to see if 3.3 would speed up my apps, Stock vs. Stock and Parallel Unime.


@wwmcheung

You should not have -8k in your code. The correct model ID is:

Meta-Llama-3.3-70B-Instruct

In my code I passed the given model ID, but the response returned a message for a different model.

In my code I'm passing Meta-Llama-3.3-70B-Instruct as the model; I'm not using the -8k suffix. That suffix only appears in the error message. Maybe you should check the default on your side?
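To rule out anything on my side, this is essentially what goes over the wire; a sketch with httpx (placeholder key) showing the model field carries no suffix:

```python
# Send the raw request and print exactly what comes back; the payload's
# "model" field has no -8k suffix, yet the 503 message reports one.
import httpx

payload = {
    "model": "Meta-Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "ping"}],
}
r = httpx.post(
    "https://api.sambanova.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder
    json=payload,
)
print(r.status_code, r.text)
```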


Thank you for raising this. I'll write up a bug this evening.


Is there an ETA for when this will be fixed? Thanks

@wwmcheung
I do not have an exact date, but I can say soon. I apologize for not being able to be more specific at this time.

-Coby


@wwmcheung @hello1 I tested this morning and API access seems to be working. Please try again and report back if there are any issues.

Thanks!
Seth


Thanks, Seth,

I tested just now and got the following error message in my code; the error code comes from the API:
2024.12.18 17:23:17.288 [WARNING] - [SambaNova] - Error while generating text: Error code: 503 - {'error': {'code': None, 'message': 'Meta-Llama-3.3-70B-Instruct-64k is temporarily unavailable. Please try again later!', 'param': None, 'type': ''}}

Yes, sorry, I spoke too soon. Context lengths over 4k tokens are still throwing errors. I'll update once this has been corrected.


Thanks!
Until the context window is larger, I will not use the new model in my code.


Hi, any ETA on this? I tried again today and still get: Error code: 429 - {'error': {'code': None, 'message': 'Meta-Llama-3.3-70B-Instruct-8k is temporarily unavailable. Please try again later!', 'param': None, 'type': ''}} httpx.HTTPStatusError: Client error '429 Too Many Requests' for url 'https://api.sambanova.ai/v1/chat/completions'
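In the meantime I've wrapped the call in a simple retry with exponential backoff so these 429/503 "temporarily unavailable" responses don't kill my runs; a sketch:

```python
# Sketch: retry the chat call with exponential backoff whenever the API
# answers 429 or 503, since both surface as "temporarily unavailable".
import time
import httpx

def chat_with_retry(payload, api_key, max_retries=5):
    for attempt in range(max_retries):
        r = httpx.post(
            "https://api.sambanova.ai/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload,
            timeout=60,
        )
        if r.status_code not in (429, 503):
            r.raise_for_status()  # surface any other HTTP error
            return r.json()
        time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
    raise RuntimeError("Model still unavailable after retries")
```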