here is the document https://developers.cloudflare.com/workers-ai/platform/limits/
Limits | Cloudflare Workers AI docs
Workers AI is now Generally Available. We’ve updated our rate limits to reflect this.
any plans / date to support Llama 3.2 3B?
what's the ideal model (size) for Cloudflare?
already here https://playground.ai.cloudflare.com/
https://forms.gle/h7FcaTF4Zo5dzNb68 not open yet 🙃
🎉
anyone know why the local example code takes 25 seconds to respond?
oh it might be the model
no, still slow: 20 seconds minimum
this is when I make a request to Groq...
Groq doesn't use GPUs so it's obviously way faster, but 25 seconds is indeed quite long
well, which other API it is isn't really the issue - Cloudflare AI is :D
just saying: a remote API request takes ~500 ms, while Cloudflare AI, no matter which model, takes 20+ seconds
and I'm not sure if that's me or if that's normal
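For what it's worth, the ~500 ms vs. 20+ s comparison above is easiest to settle with a small timing wrapper around both APIs. A minimal sketch: the `/ai/run/{model}` route is Cloudflare's documented REST endpoint, but `ACCOUNT_ID`, `API_TOKEN`, and the `"ping"` prompt are placeholders you'd fill in yourself.

```javascript
// Times any async call in milliseconds and passes its result through.
async function timeRequest(fn) {
  const start = Date.now();
  const result = await fn();
  return { ms: Date.now() - start, result };
}

// Example (placeholders, not runnable as-is): point the same wrapper at
// Workers AI and at any other API to get comparable numbers.
// const { ms } = await timeRequest(() =>
//   fetch(`https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct`, {
//     method: "POST",
//     headers: { Authorization: `Bearer ${API_TOKEN}` },
//     body: JSON.stringify({ prompt: "ping" }),
//   })
// );
// console.log(`${ms} ms`);
```

Running it a few times in a row also separates a cold-start spike from a consistently slow model.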
Is it me or did the LLM pricing increase with the newer pricing model? Some models, like Mixtral, were $0.11 and are now $0.15 in the new bracket. Anyways, I don't think I'm using Workers AI anytime soon—but I did like a lot of the other updates, like no longer being billed for service binding invocations (as it always should've been), so now I'm encouraged to use them rather than package/link every microservice as a library in a single build to avoid the previous anti-microarchitecture model lol.
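To put the $0.11 → $0.15 jump in concrete terms, here's a back-of-envelope sketch. It assumes a flat per-million-token rate taken from the two figures above; actual Workers AI pricing bills input and output tokens at separate rates.

```javascript
// Cost in dollars for a given token count at a per-million-token rate.
function cost(tokens, ratePerMillion) {
  return (tokens / 1e6) * ratePerMillion;
}

// Hypothetical 100M tokens/month at the old vs. new bracket:
const oldBill = cost(100e6, 0.11); // ≈ $11
const newBill = cost(100e6, 0.15); // ≈ $15
console.log(`delta: $${(newBill - oldBill).toFixed(2)}/month`);
```

So the ~36% rate increase only matters in absolute terms at fairly high volume.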
I've looked up llama-3.1-8b-instruct, and they all seem to be beta models. Maybe that's why? Try another model
already said that, doesn't matter which model