Cloudflare Developers•10mo ago

When using AI Gateway with Vertex AI (Gemini 1.5 Pro), it seems like streaming is broken. It buffers everything.

17 Replies

daveOP•10mo ago

@Kathy | AI Gateway PM AI Gateway is forcing v1/ on Vertex AI, even if I pass in v1beta1/

ShreshthTiwari•10mo ago

how to use workers-ai

Victor•10mo ago

Are you talking about how to use workers ai with ai gateway? Or how to use workers ai in the first place?

ShreshthTiwari•10mo ago

idk what that is, so I'm guessing it's some kind of AI service endpoint or something that we can use 😅

Victor•10mo ago

Yes, Workers AI is Cloudflare's AI service. It has its own endpoint and a (recently added) OpenAI-compatible endpoint. However, I would suggest moving further conversation about it to #workers-ai. Also, in that channel, if you click the channel description at the top, there's a link that explains more about it and how it works.

Zephyr•10mo ago

what's your pfp 💀

Unknown User•9mo ago

Message Not Public

rob•9mo ago

yes hopefully soon

daveOP•9mo ago

am I doing something wrong, or did Azure OpenAI embeddings through AI Gateway break?

daveOP•9mo ago

Oh wait, I think I am doing something wrong. Looks like LiteLLM. Ya, LiteLLM bug. Sorry! https://github.com/BerriAI/litellm/pull/4629

zi_ji_ren•9mo ago

I discovered that VertexAI Claude stream waits for a long time and returns all SSE content at once

daveOP•9mo ago

You have to use ?alt=sse
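For anyone landing here later, a minimal sketch of what that looks like in TypeScript. It assumes the gateway URL layout from the curl example in the AI Gateway docs and the Gemini `streamGenerateContent` method; `buildVertexStreamUrl` and `parseSseData` are hypothetical helper names, and the account, gateway, project, and region values are placeholders.

```typescript
// Hypothetical helper types and names; nothing here is an official SDK API.
interface VertexStreamTarget {
  accountId: string; // Cloudflare account ID (placeholder)
  gatewayId: string; // AI Gateway name (placeholder)
  project: string;   // Google Cloud project ID (placeholder)
  region: string;    // e.g. "us-central1"
  model: string;     // e.g. "gemini-1.5-pro"
}

// Builds a streaming URL routed through AI Gateway.
// Assumption: the path layout matches the curl example in the AI Gateway docs.
function buildVertexStreamUrl(t: VertexStreamTarget): string {
  const base = `https://gateway.ai.cloudflare.com/v1/${t.accountId}/${t.gatewayId}/google-vertex-ai`;
  const path = `/v1/projects/${t.project}/locations/${t.region}/publishers/google/models/${t.model}:streamGenerateContent`;
  const url = new URL(base + path);
  // The key part from this thread: ask for server-sent events.
  url.searchParams.set("alt", "sse");
  return url.toString();
}

// Extracts the JSON payloads from the `data:` lines of an SSE text chunk.
// Assumption: the chunk contains only complete lines. A real stream reader
// must buffer partial lines across reads before calling this.
function parseSseData(chunk: string): unknown[] {
  return chunk
    .split(/\r?\n/)
    .filter((line) => line.startsWith("data:"))
    .map((line) => JSON.parse(line.slice("data:".length).trim()));
}
```

The URL from `buildVertexStreamUrl` would then be passed to `fetch` with an `Authorization: Bearer ...` header, and each decoded text chunk from the response body fed through a line buffer into `parseSseData`. How the access token is obtained is outside this sketch.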

zi_ji_ren•9mo ago

Thanks! You fixed the problem!

daguang•9mo ago

Hello, when I use AI Gateway in a Worker, does it forward the real requesting user's IP-related information?

Drew Scott•9mo ago

Would anyone happen to have a working code example for using Vertex AI with AI Gateway in a worker? The docs only show a curl request, and it's not clear to me how to set the url or do authentication using @google-cloud/vertexai -- or if it's even possible.

zzjx•9mo ago

How does AI Gateway decide whether to serve a response from cache? My prompts differ only slightly, but they are often served from cache anyway. Is it possible to make the cache hit only when the prompts are exactly identical?

daveOP•9mo ago

Are you saying you're seeing cache hits when you shouldn't be seeing cache hits?
