RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
...and the build failed because the hardcoded nightly version of pytorch (for the end of life CUDA 12.1.0) is unavailable 😅
I'm rebuilding after updating dependencies, but the PR won't be just the string length change.
7 replies
RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
I finally remembered to start building the docker image (the last one took 2.5 hours...). I'll try it out tomorrow, and if it works, put in that PR
7 replies
RunPod
•Created by Hello on 1/2/2025 in #⚡|serverless
Job response not loading
Open the browser console and check whether there are any errors, then try refreshing the page
5 replies
RunPod
•Created by Nelson on 1/2/2025 in #⚡|serverless
Serverless SGLang - 128 max token limit problem.
I don't know how to change the configuration... I've tried to set the following environment variables with values higher than 128, with no luck ...In the configuration, did you click the save button at the bottom after changing the value? If you did, did you also remove your serverless workers afterward and allow them to reinitialize with your new settings?
19 replies
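The advice above (change a value, save, then recycle the workers so they pick up the new environment) can be sketched as a handler-side check. This is a minimal sketch, not SGLang's or RunPod's actual code: the variable name `MAX_TOTAL_TOKENS` and the 128-token fallback are assumptions taken from the thread.

```python
import os

# Default limit reported in the thread; assumed, not a documented SGLang value.
DEFAULT_MAX_TOKENS = 128


def resolve_max_tokens() -> int:
    """Read a hypothetical token-limit override from the environment.

    Workers only see this value at startup, which is why the thread
    suggests removing workers so they reinitialize with new settings.
    """
    raw = os.environ.get("MAX_TOTAL_TOKENS")
    if raw is None:
        return DEFAULT_MAX_TOKENS
    try:
        return int(raw)
    except ValueError:
        # Malformed override: fall back rather than crash the worker.
        return DEFAULT_MAX_TOKENS
```

If the environment variable is set but the limit does not change, the running workers were likely started before the save, matching the suggestion to recycle them.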
RunPod
•Created by zilli on 1/2/2025 in #⚡|serverless
Too big requests for serverless infinity vector embedding cause errors
Yep, that's what I started doing, but it's hard to come close to the memory limits of the GPU with a cap of 8192 items
7 replies
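Splitting an oversized embedding request to stay under the 8192-item cap mentioned above can be sketched as simple client-side chunking. This is a hedged sketch: the cap comes from the thread, but the function name and the idea of one request per chunk are assumptions, and the actual endpoint call is omitted.

```python
from typing import Iterator

# Per-request item cap mentioned in the thread.
MAX_ITEMS_PER_REQUEST = 8192


def chunk_inputs(
    texts: list[str], limit: int = MAX_ITEMS_PER_REQUEST
) -> Iterator[list[str]]:
    """Yield successive slices of at most `limit` items.

    Each slice would then be sent as a separate request to the
    embedding endpoint (the request API itself is not shown here).
    """
    for start in range(0, len(texts), limit):
        yield texts[start : start + limit]
```

As the thread notes, chunks capped at 8192 items may still leave GPU memory underutilized, since the limit is on item count rather than total token size.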