any idea?

18 Replies
Peps
Peps•6mo ago
I noticed the reported cost for the new gpt-4o mini model is incorrect
Peps
Peps•6mo ago
[image attachment]
Peps
Peps•6mo ago
the pricing for the model is $0.15/$0.60 per million tokens for input and output respectively
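At those rates, the expected cost of a request is simple arithmetic; a quick sketch with the rates hard-coded from the message above (the function name and example token counts are illustrative, not from any SDK):

```python
# Cost sketch for gpt-4o mini at the quoted rates:
# $0.15 per million input tokens, $0.60 per million output tokens.
INPUT_RATE = 0.15 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.60 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the quoted rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a request with 2,000 input tokens and 500 output tokens:
print(f"{request_cost(2_000, 500):.6f}")  # 0.000600
```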
Kathy
Kathy•6mo ago
ty for callout - we're working on fixing the openai costs (+ adding costs in from other providers)
Kathy
Kathy•6mo ago
this should be fixed - please let us know if you see a problem still
Kathy
Kathy•6mo ago
You can now use Google AI Studio with AI Gateway! https://developers.cloudflare.com/ai-gateway/providers/google-ai-studio/
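Per the linked docs, AI Gateway routes per-provider requests through a URL of the form gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}. A minimal sketch of building that base URL - the account/gateway IDs and the example Gemini path are placeholders, so check the docs page above for the current format:

```python
# Hedged sketch: building the AI Gateway base URL for the
# Google AI Studio provider. ACCOUNT_ID and my-gateway are
# placeholders; consult the linked docs for the exact scheme.

def gateway_url(account_id: str, gateway_id: str,
                provider: str = "google-ai-studio") -> str:
    """Return the per-provider AI Gateway base URL."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

base = gateway_url("ACCOUNT_ID", "my-gateway")
# Requests then go to base + the provider's usual API path,
# e.g. "/v1beta/models/gemini-1.5-flash:generateContent" (assumed path).
print(base)
```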
wayne
wayne•6mo ago
Is anyone else getting 429 Too Many Requests errors when using Cloudflare AI Gateway? Our volume is very low, but we still get that error response when hitting the gateway.ai.cloudflare.com endpoint for Google Vertex AI.
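A common client-side mitigation while a 429 issue is investigated is retrying with exponential backoff. A minimal sketch, assuming your client raises some rate-limit exception on HTTP 429 (RateLimitError here is a stand-in, not a real SDK class):

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for whatever exception your client raises on HTTP 429."""

def with_backoff(call, max_attempts=5, base_delay=1.0):
    """Retry `call` on RateLimitError, sleeping base_delay * 2**attempt
    plus a little jitter between attempts; re-raise after the last one."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```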
Fokke
Fokke•6mo ago
We are playing around with larger models on Hugging Face through the AI Gateway, but we are running into a bit of an issue. From what I can tell from the example (https://developers.cloudflare.com/ai-gateway/providers/huggingface/), the AI Gateway only supports models smaller than 10GB:
"error": "The model meta-llama/Meta-Llama-3.1-70B is too large to be loaded automatically (141GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)."
Based on the documentation I linked above I don't see any way to directly connect to models in spaces or inference endpoints. Is that correct?
Fokke
Fokke•6mo ago
I don't think it's an error so much as a limit of what is available through the HfInferenceEndpoint without setting up a dedicated model endpoint. But adding a dedicated endpoint would, as far as I can tell, replace the Gateway endpoint, so I'm not sure how you can use both. Documentation on the topic is fairly limited.
wastemaster.
wastemaster.•6mo ago
Is there a way to share access to AI Gateway across the team? I think Cloudflare previously had the ability to share full admin access to the whole account; now I have to change domain-specific features, but AI Gateway is not domain-specific - how do I handle that?
Chaika
Chaika•6mo ago
If you select a specific domain or domain group, you're locked into domain-specific roles. You have to select the entire account as the scope to be given account-level roles.
wastemaster.
wastemaster.•6mo ago
yep, found that - I can choose between the "Account Scoped Roles", but there is no AI Gateway option there
Chaika
Chaika•6mo ago
oh ok, you mentioned "share full admin access" - yeah, I don't see any roles specific to AI Gateway, so you'd probably just need admin
scotto
scotto•6mo ago
Question regarding cost for Azure OpenAI
yinkiu602
yinkiu602•6mo ago
Hi, is it normal that only non-streaming responses are being tracked by AI Gateway when using OpenAI? 🤔
kn1ght
kn1ght•6mo ago
hi
