When using AI Gateway with Vertex AI (Gemini 1.5 Pro), it seems like streaming is broken. It buffers everything.
17 Replies
daveOP•6mo ago
@Kathy | AI Gateway PM: AI Gateway is forcing v1/ on Vertex AI, even if I pass in v1beta1/.
ShreshthTiwari•6mo ago
How do I use workers-ai?
Victor•6mo ago
Are you asking how to use Workers AI with AI Gateway, or how to use Workers AI in the first place?
ShreshthTiwari•6mo ago
idk what that is, so I'm thinking it's some kind of AI service endpoint or something that we can use 😅
Victor•6mo ago
Yes, Workers AI is Cloudflare's AI service. It has its own endpoint and a (recently added) OpenAI-compatible endpoint. However, I would suggest moving further conversation about it to #workers-ai. Also, if you click the channel description at the top of that channel, there's a link that explains more about it and how it works.
Zephyr•6mo ago
what's your pfp 💀
rob•6mo ago
yes hopefully soon
daveOP•6mo ago
am I doing something wrong, or did Azure OpenAI embeddings through AI Gateway break?
daveOP•6mo ago
Oh wait, I think I am doing something wrong. Looks like LiteLLM... ya, LiteLLM bug. Sorry! https://github.com/BerriAI/litellm/pull/4629
zi_ji_ren•6mo ago
I discovered that a Vertex AI Claude stream waits for a long time and then returns all of the SSE content at once.
daveOP•6mo ago
You have to use ?alt=sse
zi_ji_ren•6mo ago
Thanks! You fixed the problem!
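For anyone landing here later, the `?alt=sse` fix above can be sketched roughly as follows. This is a minimal sketch, not an official client: `withSse` and `parseSseChunk` are hypothetical helper names, the endpoint in the usage note is a placeholder, and the parser assumes each event arrives as a single-line `data:` JSON payload.

```typescript
// Hypothetical helper: add alt=sse to a streaming endpoint URL so the
// upstream replies with server-sent events instead of one buffered body.
function withSse(endpoint: string): string {
  const url = new URL(endpoint);
  url.searchParams.set("alt", "sse"); // keeps any existing query parameters
  return url.toString();
}

// Hypothetical helper: pull the JSON payloads out of a block of SSE text.
// Assumes every event is one "data: {...}" line (an assumption, not a spec guarantee).
function parseSseChunk(chunk: string): unknown[] {
  return chunk
    .split(/\r?\n/)
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length).trim())
    .filter((payload) => payload.length > 0)
    .map((payload) => JSON.parse(payload));
}
```

Usage would look something like `fetch(withSse(gatewayUrl), { method: "POST", headers, body })`, where `gatewayUrl` is whatever AI Gateway URL you already call, then reading `response.body` incrementally and passing decoded text to `parseSseChunk`.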
daguang•6mo ago
Hello, when I use AI Gateway in a Worker, is the real requesting user's IP-related information forwarded?
Drew Scott•6mo ago
Would anyone happen to have a working code example for using Vertex AI with AI Gateway in a worker? The docs only show a curl request, and it's not clear to me how to set the url or do authentication using @google-cloud/vertexai -- or if it's even possible.
zzjx•6mo ago
How does AI Gateway decide whether to serve a response from cache? My prompts have only small differences, but they are often treated as cache hits. Is it possible to make the cache hit only when the prompts are exactly identical?
daveOP•6mo ago
Are you saying you're seeing cache hits when you shouldn't be seeing cache hits?