Source: https://developers.cloudflare.com/workers-ai/platform/limits/

Limits | Cloudflare Workers AI docs
Workers AI is now Generally Available. We’ve updated our rate limits to reflect this.
12 Replies
scotto
scotto•5mo ago
plans / date to support Llama 3.2 3B? ideal model for Cloudflare (size)
crazyjack12
crazyjack12•5mo ago
(shared a Google Forms link)
akazwz
akazwz•5mo ago
🎉
Keebs
Keebs•5mo ago
const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
  prompt: "What is the origin of the phrase Hello, World",
});
anyone know why the local example code takes 25 seconds to respond? oh, it might be the model... no, still slow, 20 seconds minimum
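One way to narrow down where the 20+ seconds goes is to time the model call itself, separately from the rest of the request. Below is a minimal sketch: `timed` is a hypothetical helper (not part of the Workers AI API), and the mock binding at the bottom only exists so the sketch runs outside a Worker.

```javascript
// Hypothetical timing helper: wraps any async call and reports elapsed
// milliseconds, so you can tell whether the latency is the env.AI.run()
// call itself or something else in the Worker.
async function timed(label, fn) {
  const start = Date.now();
  const result = await fn();
  const elapsed = Date.now() - start;
  console.log(`${label}: ${elapsed}ms`);
  return { result, elapsed };
}

// Inside a Worker fetch handler you would wrap the real call like:
//   const { result } = await timed("workers-ai", () =>
//     env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
//       prompt: "What is the origin of the phrase Hello, World",
//     })
//   );

// Standalone demo with a mocked binding so the sketch runs anywhere:
const mockAI = {
  run: async (_model, _opts) => {
    await new Promise((r) => setTimeout(r, 50)); // simulate model latency
    return { response: "mock answer" };
  },
};

timed("mock-ai", () =>
  mockAI.run("@cf/meta/llama-3.1-8b-instruct", { prompt: "hi" })
).then(({ result, elapsed }) => {
  console.log(result.response, elapsed >= 50);
});
```

If the wrapped `env.AI.run` accounts for nearly all of the 20 seconds, the slowness is the inference call rather than your Worker code.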
Response time: 716 milliseconds
[wrangler:inf] GET / 200 OK (739ms)
this is when I make a request to Groq...
Bibi
Bibi•5mo ago
Groq doesn’t use GPUs, so it’s obviously way faster, but 25 seconds is indeed quite long
Keebs
Keebs•5mo ago
well, which other API isn’t really the issue; Cloudflare AI is :D just saying, a remote API request takes 500ms, while Cloudflare AI, no matter which model, takes 20+ seconds, and I’m not sure if that’s me or if that’s normal
Beyondo
Beyondo•5mo ago
Is it me, or did the LLM pricing increase with the newer pricing model? Some models were $0.11, like Mixtral; now it’s $0.15 with that new bracket. Anyway, I don’t think I’m using Workers AI anytime soon. But I did like a lot of the other updates, like no longer being billed for service binding invocations (as it always should’ve been), so now I’m encouraged to use them rather than packaging/linking every microservice as a library into a single build to avoid the previous anti-microservice billing model lol.
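For context, a service binding lets one Worker call another directly instead of bundling it as a library. A minimal sketch, assuming a binding named `BACKEND` pointing at a hypothetical `my-backend` Worker (the binding name, service name, and mock below are assumptions, not from the thread):

```javascript
// Hypothetical sketch of a service-binding call. Assumes wrangler.toml
// contains:
//   [[services]]
//   binding = "BACKEND"
//   service = "my-backend"
const worker = {
  async fetch(request, env) {
    // One Worker invokes another directly, with no public round-trip;
    // per the pricing update discussed above, the invocation itself is
    // no longer billed separately.
    return env.BACKEND.fetch(request);
  },
};

// In a real Worker this object would be the default export:
//   export default worker;

// Standalone demo with a mocked binding so the sketch runs outside Workers:
const mockEnv = {
  BACKEND: { fetch: async (req) => ({ status: 200, url: req.url }) },
};
worker.fetch({ url: "https://example.com/api" }, mockEnv).then((res) => {
  console.log(res.status, res.url);
});
```

With per-invocation billing gone, this pattern is cheaper than it used to be, which is the change being welcomed above.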
Beyondo
Beyondo•5mo ago
I've looked up llama-3.1-8b-instruct, and they all seem to be beta models. Maybe that's why? Try another model
Keebs
Keebs•5mo ago
already said that, doesn't matter which model
