here is the document https://developers.cloudflare.com/workers-ai/platform/limits/

Cloudflare Docs
Limits | Cloudflare Workers AI docs
Workers AI is now Generally Available. We’ve updated our rate limits to reflect this.
12 Replies
scotto
scotto•3mo ago
plans / date to support Llama 3.2 3B? ideal model for Cloudflare (size)
crazyjack12
crazyjack12•3mo ago
[Google Forms link]
akazwz
akazwz•3mo ago
🎉
Keebs
Keebs•3mo ago
const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
prompt: "What is the origin of the phrase Hello, World",
});
anyone know why the local example code takes 25 seconds to respond? oh, it might be the model... no, still slow, 20 seconds minimum
Response time: 716 milliseconds
[wrangler:inf] GET / 200 OK (739ms)
this is when I make a request to Groq...
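For comparison, a minimal timing sketch that measures just the env.AI.run() call (assuming a Worker with an AI binding named AI, as in the snippet above), so model inference time can be separated from the rest of the request handling:

export default {
  async fetch(request, env) {
    // Time only the model call, not the surrounding request handling.
    const start = Date.now();
    const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "What is the origin of the phrase Hello, World",
    });
    const elapsedMs = Date.now() - start;
    return Response.json({ elapsedMs, response });
  },
};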
Bibi
Bibi•3mo ago
Groq doesn't use GPUs (they run their own LPU chips), so it's obviously way faster, but 25 seconds is indeed quite long
Keebs
Keebs•3mo ago
well, which other API isn't really the issue; Cloudflare AI is :D just saying: a remote API request takes 500ms, while Cloudflare AI, no matter which model, takes 20+ seconds, and I'm not sure if that's me or if that's normal
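One commonly suggested mitigation for long waits is streaming: env.AI.run() accepts a stream: true option and returns a ReadableStream of server-sent events, so tokens arrive as they are generated instead of after the full completion. A sketch, assuming the same model and AI binding as above:

export default {
  async fetch(request, env) {
    // With stream: true the call resolves to a ReadableStream of
    // server-sent events rather than the finished completion.
    const stream = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "What is the origin of the phrase Hello, World",
      stream: true,
    });
    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
};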
Beyondo
Beyondo•3mo ago
Is it me or did the LLM pricing increase with the newer pricing model? Some models were $0.11, like Mixtral; now $0.15 with that new bracket. Anyways, I don't think I'm using Workers AI anytime soon. But I did like a lot of the other updates, like no longer being billed for service binding invocations (as it always should've been), so now I'm encouraged to use them rather than package/link every microservice as a library in a single build, which I only did to avoid the previous billing model that worked against microservice architecture lol
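For context, a service binding lets one Worker call another directly without leaving Cloudflare's network. A minimal sketch, assuming a binding named AUTH declared in wrangler.toml that points at a second Worker (the binding name and URL path here are hypothetical):

export default {
  async fetch(request, env) {
    // env.AUTH is a service binding to another Worker; the hostname in
    // the URL is arbitrary since the request never leaves Cloudflare.
    const authResponse = await env.AUTH.fetch("https://auth/check", {
      headers: request.headers,
    });
    if (!authResponse.ok) {
      return new Response("unauthorized", { status: 401 });
    }
    return new Response("hello from the main worker");
  },
};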
Beyondo
Beyondo•3mo ago
I've looked up llama-3.1-8b-instruct, and it seems they're all beta models. Maybe that's why? Try another model
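Swapping models is just a matter of changing the model ID passed to env.AI.run(); the Mistral ID below is one option from the Workers AI catalog, and any other text-generation model would do:

// Same call as before, with a different catalog model ID.
const response = await env.AI.run("@cf/mistral/mistral-7b-instruct-v0.1", {
  prompt: "What is the origin of the phrase Hello, World",
});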
Keebs
Keebs•3mo ago
already said that, doesn't matter which model