Limitation questions

Hi, I had a question about LLM token limits. I saw in the docs that certain models like @cf/baai/bge-base-en-v1.5 have a max input of 512 tokens (with 768 output dimensions), but I didn't see anything for the @cf/meta/llama-2-7b-chat-fp16 or @cf/meta/llama-2-7b-chat-int8 models. Is there anywhere I can see that info for those models?
(Link preview: Text Embeddings · Cloudflare Workers AI docs)
userOP · 10mo ago
Please ping me when you reply 🙏
DaniFoldi · 10mo ago
Hi :meowwave:, on the Text Generation page https://developers.cloudflare.com/workers-ai/models/text-generation/ you should see the limits. For -fp16 they are:
Default max (sequence) tokens (stream): 2500
Default max (sequence) tokens: 256
Context tokens limit: 3072
Sequence tokens limit: 2500
and for -int8:
Default max (sequence) tokens (stream): 1800
Default max (sequence) tokens: 256
Context tokens limit: 2048
Sequence tokens limit: 1800
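The "Default max (sequence) tokens: 256" is just what you get when you don't pass anything, so you can override it per request. Here's a minimal sketch of doing that from a Worker, assuming a Workers AI binding named `AI` in your wrangler.toml; the prompt and `max_tokens` value are illustrative:
```ts
// Sketch: calling the -int8 model from a Worker and overriding
// the 256-token default output length. Assumes an `AI` binding
// (types from @cloudflare/workers-types).
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const result = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      prompt: "Summarise the Workers AI token limits.",
      // Default is 256; keep this at or below the sequence tokens
      // limit for the model (1800 for -int8, 2500 for -fp16).
      max_tokens: 512,
    });
    return Response.json(result);
  },
} satisfies ExportedHandler<Env>;
```
The context tokens limit (2048 for -int8, 3072 for -fp16) covers the prompt side, so a long prompt eats into what you can get back.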
userOP · 10mo ago
Ohh it's on there? The "terms" link on the model catalogue points you to the page I sent, lul. Ty very much!