RunPod
•Created by JorgeG on 1/15/2024 in #⚡|serverless
Worker handling multiple requests concurrently
It also seems that `concurrency_modifier` doesn't work in this example. Please see this issue: https://github.com/runpod-workers/worker-vllm/issues/36
11 replies
RunPod
•Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
I'm not sure but you should probably change it in the Dockerfile. Setting it as an env variable probably won't work. (I may be wrong.)
76 replies
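A build-time value is baked into the image when it is built, so exporting it as a runtime env variable has no effect; that is likely why a Dockerfile change is suggested here. A minimal sketch of the distinction, using the variable names from later messages in this thread (the exact Dockerfile layout is an assumption):
```
# Build-time value: fixed when the image is built; a runtime env var can't change it.
ARG WORKER_CUDA_VERSION=12.1.0
FROM nvidia/cuda:${WORKER_CUDA_VERSION}-base-ubuntu22.04

# Runtime value: read when the container starts, so it CAN be overridden
# with -e or with the endpoint's environment settings.
ENV GPU_MEMORY_UTILIZATION=0.90
```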
RunPod
•Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
One more thing: it's recommended to use CUDA version 12.1. Try changing it by setting the env variable `WORKER_CUDA_VERSION` to `12.1`.
76 replies
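If `WORKER_CUDA_VERSION` is consumed as a build argument rather than a runtime variable (an assumption consistent with the previous message), it would be passed when building the image:
```
# assumes WORKER_CUDA_VERSION is declared as an ARG in the worker Dockerfile;
# the full 12.1.0 patch version is an assumption (the advice above says 12.1)
docker build --build-arg WORKER_CUDA_VERSION=12.1.0 -t my-vllm-worker .
```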
RunPod
•Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
Hey! I had a similar issue loading AWQ models with this worker. I resolved it by setting the `GPU_MEMORY_UTILIZATION` variable to `0.90`.
76 replies
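`GPU_MEMORY_UTILIZATION` caps the fraction of VRAM vLLM pre-allocates for weights and KV cache, so lowering it leaves headroom for other CUDA allocations. A sketch of setting it for a local run (the image name is a placeholder):
```
# cap vLLM at 90% of VRAM, leaving headroom for other CUDA allocations
docker run --gpus all -e GPU_MEMORY_UTILIZATION=0.90 my-vllm-worker
```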
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Hey @Justin and @Alpay Ariyak! I just tried the latest version of worker-vllm, and there's still an issue with concurrent requests: `MAX_CONCURRENCY` doesn't seem to work. See here: https://github.com/runpod-workers/worker-vllm/issues/36
13 replies
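For reference, a sketch of how `MAX_CONCURRENCY` would be set for a local run; whether the worker actually honors it is exactly what the linked issue disputes, and the value here is purely illustrative:
```
docker run --gpus all -e MAX_CONCURRENCY=10 my-vllm-worker
```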
RunPod
•Created by antoniog on 12/22/2023 in #⚡|serverless
Issue with Request Count Scale Type
got it, thanks
11 replies
RunPod
•Created by antoniog on 12/20/2023 in #⚡|serverless
Issues with building the new `worker-vllm` Docker Image
thanks!
6 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
Yes, it's the same issue with the original vllm-fork-for-sls-worker too. I opened an issue: https://github.com/runpod-workers/worker-vllm/issues/25
9 replies
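For context, the thread is about skipping the model-baking step: the documented worker-vllm build flow downloads the weights during `docker build` via a build argument. A sketch of the two variants (the build-arg name `MODEL_NAME` follows the worker-vllm README; the model and image names are placeholders):
```
# documented flow: bake the model into the image at build time
docker build --build-arg MODEL_NAME="mistralai/Mistral-7B-v0.1" -t my-vllm-worker .

# what this thread asks for: build with no model inside,
# so the weights would have to be supplied some other way at runtime
docker build -t my-vllm-worker .
```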
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
I'm not sure, but it may be related to the small modifications in `vllm-fork-for-sls-worker`. I'll try to build the image with the original `vllm-fork-for-sls-worker`.
9 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
I'm getting an error when building the Docker image. See below.
9 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
there was another issue, so I couldn't build the image...
9 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
thanks!
9 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
I believe changing `https://github.com/runpod/[email protected]#egg=vllm;` in the Dockerfile to `https://github.com/runpod/[email protected]#egg=vllm` should work?
13 replies
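For context, a sketch of the Dockerfile line being discussed: the worker pins the RunPod vLLM fork to a git tag, so swapping the tag is how one would try the `0.2.2` asked about in the next message (whether that tag exists on the fork is not confirmed here):
```
# pin the RunPod vLLM fork by git tag; change v0.2.3 to v0.2.2 to try the older version
RUN pip install git+https://github.com/runpod/[email protected]#egg=vllm
```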
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Is it possible to use a different version of vllm, e.g. `0.2.2`?
13 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Thanks!
13 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
It's probably related to the new worker, right? I asked about the previous one.
13 replies
RunPod
•Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Hey!
Yes, I have opened an issue in the repo: https://github.com/runpod-workers/worker-vllm/issues/22
Nope, it can't 😦
13 replies