here is the document https://developers.cloudflare.com/workers-ai/platform/limits/
Limits | Cloudflare Workers AI docs
Workers AI is now Generally Available. We’ve updated our rate limits to reflect this.
any plans / date to support Llama 3.2 3B?
what's the ideal model (size) for Cloudflare?
already here https://playground.ai.cloudflare.com/
https://forms.gle/h7FcaTF4Zo5dzNb68 not open yet 🙃
🎉
anyone know why the local example code takes 25 seconds to respond?
oh it might be the model
no, still slow: 20 seconds minimum
this is when I make a request to Groq...
Groq doesn't use GPUs so it's obviously way faster, but 25 seconds is indeed quite long
well, which other API it is isn't really the issue - Cloudflare AI is :D
just saying: a remote API request takes ~500 ms, while Cloudflare AI, no matter which model, takes 20+ seconds
and I'm not sure if that's me or if that's normal
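For what it's worth, the ~500 ms vs. 20+ s comparison above is easiest to settle with a small timing wrapper around both APIs. A minimal sketch: the `/ai/run/{model}` route is Cloudflare's documented REST endpoint, but `ACCOUNT_ID`, `API_TOKEN`, and the `"ping"` prompt are placeholders you'd fill in yourself.

```javascript
// Times any async call in milliseconds and passes its result through.
async function timeRequest(fn) {
  const start = Date.now();
  const result = await fn();
  return { ms: Date.now() - start, result };
}

// Example (placeholders, not runnable as-is): point the same wrapper at
// Workers AI and at any other API to get comparable numbers.
// const { ms } = await timeRequest(() =>
//   fetch(`https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct`, {
//     method: "POST",
//     headers: { Authorization: `Bearer ${API_TOKEN}` },
//     body: JSON.stringify({ prompt: "ping" }),
//   })
// );
// console.log(`${ms} ms`);
```

Running it a few times in a row also separates a cold-start spike from a consistently slow model.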
Is it me or did the LLM pricing increase with the newer pricing model? Some models, like Mixtral, were $0.11 and are now $0.15 in the new bracket. Anyways, I don't think I'm using Workers AI anytime soon—but I did like a lot of the other updates, like no longer being billed for service binding invocations (as it always should've been), so now I'm encouraged to use them rather than package/link every microservice as a library in a single build to avoid the previous anti-microarchitecture model lol.
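To put the $0.11 → $0.15 jump in concrete terms, here's a back-of-envelope sketch. It assumes a flat per-million-token rate taken from the two figures above; actual Workers AI pricing bills input and output tokens at separate rates.

```javascript
// Cost in dollars for a given token count at a per-million-token rate.
function cost(tokens, ratePerMillion) {
  return (tokens / 1e6) * ratePerMillion;
}

// Hypothetical 100M tokens/month at the old vs. new bracket:
const oldBill = cost(100e6, 0.11); // ≈ $11
const newBill = cost(100e6, 0.15); // ≈ $15
console.log(`delta: $${(newBill - oldBill).toFixed(2)}/month`);
```

So the ~36% rate increase only matters in absolute terms at fairly high volume.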
I've looked up llama-3.1-8b-instruct, and they all seem to be beta models. Maybe that's why? Try another model
already said that, doesn't matter which model