JorgeG
RunPod
•Created by JorgeG on 3/1/2024 in #⚡|serverless
Worker is very frequently killed and replaced
1hdfqkkbw41swp
Thanks for looking into it
5 replies
•Created by JorgeG on 1/15/2024 in #⚡|serverless
Worker handling multiple requests concurrently
Thanks @flash-singh. I did search, but it didn't return any results.
Tried different keywords, now I got one post that points me towards this: https://github.com/runpod-workers/worker-vllm/blob/main/src/handler.py
So I guess the magic bit is the "concurrency_modifier" arg in serverless start.
FYI, this argument is not documented anywhere in the runpod.io docs, at least I couldn't find it.
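For anyone landing here later, a minimal sketch of how that argument seems to be wired up, based on the linked handler.py. The "concurrency_modifier" key comes from that file; MAX_CONCURRENCY, the echo handler and main() are made-up names for illustration, and the exact semantics of the callback are my assumption since it isn't documented.

```python
import asyncio

# Hypothetical cap on how many jobs one worker should run at once.
MAX_CONCURRENCY = 4


def concurrency_modifier(current_concurrency: int) -> int:
    """Return the concurrency level the worker should allow.

    Assumption: the SDK passes in the current level and uses the
    returned int as the new limit.
    """
    return MAX_CONCURRENCY


async def handler(job: dict) -> dict:
    """Placeholder async handler that echoes the prompt back."""
    prompt = job["input"].get("prompt", "")
    await asyncio.sleep(0)  # stand-in for real async work (e.g. model inference)
    return {"echo": prompt}


def main() -> None:
    # Imported lazily so the functions above can be tried without the SDK.
    import runpod  # requires `pip install runpod`

    runpod.serverless.start(
        {
            "handler": handler,
            "concurrency_modifier": concurrency_modifier,
        }
    )
```

Calling main() from the worker's entrypoint would start the serverless loop; the handler is async so that several jobs can overlap in one worker.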
11 replies
•Created by JorgeG on 1/11/2024 in #⚡|serverless
Use private image from Google Cloud Artifact Registry
thanks for the help!
16 replies
•Created by JorgeG on 1/11/2024 in #⚡|serverless
Use private image from Google Cloud Artifact Registry
the non-base64 version worked 🤷♂️
the number of chars in the JSON file is 2386
the number of chars in the b64-encoded file is 3185
according to wc -m <file>
(I noticed just now that I made a typo in my first message. It's 3185 chars, not 1385.)
It still doesn't make sense if the max length is 4000 according to @zacksparrow
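Quick sanity check on those numbers, as a rough sketch. It assumes the JSON key is ASCII-only (so chars equal bytes) and that the extra character in the 3185 count is a trailing newline written by the base64 tool; the payload below is a dummy stand-in, not the real key.

```python
import base64
import math

JSON_CHARS = 2386   # size of the key file reported by `wc -m`
LIMIT = 4000        # max length mentioned in the thread

# Dummy payload of the same size; base64 length depends only on byte count.
payload = b"x" * JSON_CHARS
encoded = base64.b64encode(payload)

# Base64 emits 4 output chars for every 3 input bytes, rounded up.
assert len(encoded) == 4 * math.ceil(JSON_CHARS / 3)

print(len(encoded))          # 3184
print(len(encoded) + 1)      # 3185 with one trailing newline
print(len(encoded) < LIMIT)  # True
```

So the 3185 figure is consistent with a plain base64 encoding of the 2386-char file, and both sizes are under 4000.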
16 replies
•Created by JorgeG on 1/11/2024 in #⚡|serverless
Use private image from Google Cloud Artifact Registry
that would be great, thanks for the quick response. any timeline on that?
16 replies
•Created by JorgeG on 1/11/2024 in #⚡|serverless
Use private image from Google Cloud Artifact Registry
well, technically this is still docker login. You can use the docker login command to authenticate with Google Cloud, but it requires the huge key.
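To illustrate what I mean, a small sketch of the key-based login, wrapped in Python. The `_json_key` username and the `LOCATION-docker.pkg.dev` host pattern are from Google's Artifact Registry docs; the function names, the region and the key path are placeholders of mine.

```python
import subprocess
from pathlib import Path


def gar_login_command(location: str) -> list[str]:
    """Build the docker login argv for an Artifact Registry host."""
    return [
        "docker", "login",
        "-u", "_json_key",          # literal username for JSON key auth
        "--password-stdin",         # the key itself is read from stdin
        f"https://{location}-docker.pkg.dev",
    ]


def gar_login(location: str, key_file: str) -> None:
    """Run docker login, piping the whole service account key as the password."""
    key_json = Path(key_file).read_text()
    subprocess.run(gar_login_command(location), input=key_json, text=True, check=True)
```

For example, gar_login("us-central1", "key.json") would log in with the full JSON key as the password, which is exactly the "huge key" problem when a password field has a length limit.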
16 replies