When using AI Gateway with Vertex AI (Gemini 1.5 Pro), streaming seems to be broken: it buffers everything.
@Kathy | AI Gateway PM AI Gateway is forcing v1/ on Vertex AI, even if I pass in v1beta1/
how to use workers-ai
Are you talking about how to use workers ai with ai gateway? Or how to use workers ai in the first place?
idk what that is so i am thinking its some kind of AI service endpoint smth that we can use 😅
Yes, Workers AI is Cloudflare's AI service, and it has its own endpoint as well as a (recently added) OpenAI-compatible one. However, I'd suggest moving further conversation about it to #workers-ai. In that channel, if you click the channel description at the top, there's a link that explains more about what it is and how it works.
what's your pfp 💀
yes hopefully soon
am I doing something wrong, or did Azure OpenAI embeddings through AI Gateway break?
oh wait I am doing something wrong I think
looks like LiteLLM
ya, LiteLLM bug. Sorry! https://github.com/BerriAI/litellm/pull/4629
I discovered that the Vertex AI Claude stream waits for a long time and then returns all the SSE content at once
You have to use ?alt=sse
Thanks! You fixed the problem!
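For anyone hitting the same buffering issue, here's a minimal sketch of the ?alt=sse fix from above. The gateway URL pieces (account ID, gateway ID, the google-vertex-ai provider slug, and the streamRawPredict path for Claude on Vertex) are assumptions based on this thread, not verified documentation:

```javascript
// Sketch: append ?alt=sse so Vertex AI streams Claude responses as SSE
// instead of buffering the whole stream into one chunk.
// The path segments below are assumptions, not confirmed docs.
function vertexClaudeStreamUrl({ accountId, gatewayId, project, region, model }) {
  const base =
    `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}` +
    `/google-vertex-ai/v1/projects/${project}/locations/${region}` +
    `/publishers/anthropic/models/${model}:streamRawPredict`;
  // Without ?alt=sse, Vertex returns all SSE content at once.
  return `${base}?alt=sse`;
}

const url = vertexClaudeStreamUrl({
  accountId: "ACCOUNT_ID",
  gatewayId: "GATEWAY_ID",
  project: "my-project",
  region: "us-east5",
  model: "claude-3-5-sonnet",
});
console.log(url);
```

You'd then pass that URL to fetch and read the response body as a stream.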
Hello, when I use AI Gateway in a Worker, will the real end user's IP-related information be forwarded?
Would anyone happen to have a working code example for using Vertex AI with AI Gateway in a worker? The docs only show a curl request, and it's not clear to me how to set the url or do authentication using
@google-cloud/vertexai
-- or if it's even possible.
How does AI Gateway decide whether to use the cache or not? My prompts have only small differences, but they are often judged as cached.
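Re the Vertex AI + Worker question above: no official Worker example in the docs, but a plain fetch sketch (skipping @google-cloud/vertexai entirely) might look like this. The URL shape and the google-vertex-ai provider slug are assumptions extrapolated from the curl example, and env.GCP_ACCESS_TOKEN is a hypothetical binding; a real Worker would need a valid GCP service-account access token minted somewhere:

```javascript
// Sketch: calling Vertex AI (Gemini) through AI Gateway from a Worker
// with plain fetch. Path segments are assumptions from the curl docs.
function gatewayVertexUrl({ accountId, gatewayId, project, region, model }) {
  return (
    `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}` +
    `/google-vertex-ai/v1/projects/${project}/locations/${region}` +
    `/publishers/google/models/${model}:generateContent`
  );
}

async function generate(env, prompt) {
  const url = gatewayVertexUrl({
    accountId: env.CF_ACCOUNT_ID,
    gatewayId: env.GATEWAY_ID,
    project: env.GCP_PROJECT,
    region: "us-central1",
    model: "gemini-1.5-pro",
  });
  const res = await fetch(url, {
    method: "POST",
    headers: {
      // Hypothetical: a GCP access token supplied via a Worker secret.
      Authorization: `Bearer ${env.GCP_ACCESS_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      contents: [{ role: "user", parts: [{ text: prompt }] }],
    }),
  });
  return res.json();
}

const demoUrl = gatewayVertexUrl({
  accountId: "ACCOUNT_ID",
  gatewayId: "GATEWAY_ID",
  project: "my-project",
  region: "us-central1",
  model: "gemini-1.5-pro",
});
```

Not tested against a live gateway; treat it as a starting point, not a confirmed working example.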
Is it possible to set the cache to hit only when the prompts are fully identical?
Are you saying you're seeing cache hits when you shouldn't be seeing cache hits?