RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
...and the build failed because the hardcoded nightly version of pytorch (for the end of life CUDA 12.1.0) is unavailable 😅
I'm rebuilding after updating dependencies, but the PR won't be just the string length change.
7 replies
RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
I finally remembered to start building the docker image (the last one took 2.5 hours...). I'll try it out tomorrow, and if it works, put in that PR
7 replies
RunPod
•Created by Hello on 1/2/2025 in #⚡|serverless
Job response not loading
Open the browser console and check whether there are any errors, then try refreshing the page
5 replies
RunPod
•Created by Nelson on 1/2/2025 in #⚡|serverless
Serverless SGLang - 128 max token limit problem.
I don't know how to change the configuration... I've tried to set the following environment variables with values higher than 128, with no luck ...In the configuration, did you click the save button at the bottom after changing the value? If you did, did you also remove your serverless workers afterward and allow them to reinitialize with your new settings?
19 replies
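The advice above (change a value, save, then recycle the workers so they pick up the new environment) can be sketched as a handler-side check. This is a minimal sketch, not SGLang's or RunPod's actual code: the variable name `MAX_TOTAL_TOKENS` and the 128-token fallback are assumptions taken from the thread.

```python
import os

# Default limit reported in the thread; assumed, not a documented SGLang value.
DEFAULT_MAX_TOKENS = 128


def resolve_max_tokens() -> int:
    """Read a hypothetical token-limit override from the environment.

    Workers only see this value at startup, which is why the thread
    suggests removing workers so they reinitialize with new settings.
    """
    raw = os.environ.get("MAX_TOTAL_TOKENS")
    if raw is None:
        return DEFAULT_MAX_TOKENS
    try:
        return int(raw)
    except ValueError:
        # Malformed override: fall back rather than crash the worker.
        return DEFAULT_MAX_TOKENS
```

If the environment variable is set but the limit does not change, the running workers were likely started before the save, matching the suggestion to recycle them.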
RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
Yep, that's what I started doing, but it's hard to come close to the memory limits of the GPU with a cap of 8192 items
7 replies
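Splitting an oversized embedding request to stay under the 8192-item cap mentioned above can be sketched as simple client-side chunking. This is a hedged sketch: the cap comes from the thread, but the function name and the idea of one request per chunk are assumptions, and the actual endpoint call is omitted.

```python
from typing import Iterator

# Per-request item cap mentioned in the thread.
MAX_ITEMS_PER_REQUEST = 8192


def chunk_inputs(
    texts: list[str], limit: int = MAX_ITEMS_PER_REQUEST
) -> Iterator[list[str]]:
    """Yield successive slices of at most `limit` items.

    Each slice would then be sent as a separate request to the
    embedding endpoint (the request API itself is not shown here).
    """
    for start in range(0, len(texts), limit):
        yield texts[start : start + limit]
```

As the thread notes, chunks capped at 8192 items may still leave GPU memory underutilized, since the limit is on item count rather than total token size.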