any idea?
18 Replies
Peps
Peps•5mo ago
I noticed the reported cost for the new gpt-4o mini model is incorrect
Peps
Peps•5mo ago
[image attachment]
Peps
Peps•5mo ago
the pricing for the model is $0.15/$0.60 per million tokens for input and output respectively
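For reference, the quoted prices work out like this (a minimal sketch; the helper name and example token counts are just illustrative):

```python
# gpt-4o-mini pricing from the message above:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
def token_cost(input_tokens: int, output_tokens: int,
               input_price_per_m: float = 0.15,
               output_price_per_m: float = 0.60) -> float:
    """Return the USD cost for a single request."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# e.g. a request with 2,000 input tokens and 500 output tokens:
print(round(token_cost(2_000, 500), 6))  # 0.0006
```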
Kathy
Kathy•5mo ago
ty for callout - we're working on fixing the openai costs (+ adding costs in from other providers)
Unknown User
Unknown User•5mo ago
Message Not Public
Kathy
Kathy•5mo ago
this should be fixed - please let us know if you see a problem still
Unknown User
Unknown User•5mo ago
Message Not Public
Kathy
Kathy•5mo ago
You can now use Google AI Studio with AI Gateway! https://developers.cloudflare.com/ai-gateway/providers/google-ai-studio/
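The request URL for Google AI Studio through AI Gateway follows roughly this pattern (a sketch based on the linked provider docs; `ACCOUNT_ID`, `GATEWAY_ID`, and the model name are placeholders to substitute with your own values):

```python
# Build the AI Gateway URL for a Google AI Studio generateContent call.
# All three values below are placeholders, not real identifiers.
ACCOUNT_ID = "your-account-id"
GATEWAY_ID = "your-gateway-id"
MODEL = "gemini-1.5-flash"  # any Gemini model available in AI Studio

url = (
    f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY_ID}"
    f"/google-ai-studio/v1/models/{MODEL}:generateContent"
)
print(url)
```

You would POST your Gemini request body to that URL with your Google AI Studio API key, the same as when calling the provider directly.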
wayne
wayne•5mo ago
is anyone else getting 429 Too Many Requests errors when using Cloudflare AI Gateway? volume is very low, but I'm getting that error response when hitting the gateway.ai.cloudflare.com endpoint for Google Vertex AI
Fokke
Fokke•5mo ago
We are playing around with larger models on Hugging Face through the AI Gateway, but we are running into a bit of an issue. From what I can tell from the example (https://developers.cloudflare.com/ai-gateway/providers/huggingface/), the AI Gateway only supports models smaller than 10GB:
"error": "The model meta-llama/Meta-Llama-3.1-70B is too large to be loaded automatically (141GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)."
"error": "The model meta-llama/Meta-Llama-3.1-70B is too large to be loaded automatically (141GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)."
Based on the documentation I linked above I don't see any way to directly connect to models in spaces or inference endpoints. Is that correct?
Fokke
Fokke•5mo ago
I don't think it's an error so much as a limit of what is available through HfInferenceEndpoint without setting up a dedicated model endpoint. But as far as I can tell, adding a dedicated endpoint would replace the Gateway endpoint, so I'm not sure how you can use both. Documentation is fairly limited on the topic.
wastemaster.
wastemaster.•4mo ago
is there a way to share access to AI Gateway across the team? I think Cloudflare previously had the ability to share full admin access to the whole account. Now I have to grant domain-specific features, but AI Gateway is not domain-specific - how do I handle that?
Chaika
Chaika•4mo ago
if you select a specific domain or domain group, you're locked into domain-specific roles. You have to set the scope to the entire account to be given account-level roles
wastemaster.
wastemaster.•4mo ago
yep, found that. I can select between "Account Scoped Roles", but there is no AI Gateway option there (
Chaika
Chaika•4mo ago
oh ok, you mentioned "share full admin access" - yeah, I don't see any specific roles just for AI Gateway; you'd probably just need admin
scotto
scotto•4mo ago
Question regarding cost for azure open ai
yinkiu602
yinkiu602•4mo ago
Hi, is it normal that only non-streaming responses from OpenAI are being tracked by AI Gateway 🤔
kn1ght
kn1ght•4mo ago
hi