antoniog
RunPod
Created by JorgeG on 1/15/2024 in #⚡|serverless
Worker handling multiple requests concurrently
It also seems that concurrency_modifier doesn't work in this example. Please see this issue: https://github.com/runpod-workers/worker-vllm/issues/36
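For reference, this is roughly how I'd expect concurrency_modifier to be wired into a handler with the RunPod Python SDK; just a minimal sketch, not worker-vllm's actual code, and the handler body and fixed value of 4 are placeholders:

import runpod

async def handler(job):
    # Placeholder handler; worker-vllm would stream tokens from the vLLM engine here.
    return {"echo": job["input"]}

def concurrency_modifier(current_concurrency: int) -> int:
    # Called by the SDK to decide how many jobs this worker may run at once.
    # Returning a constant 4 is purely illustrative.
    return 4

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})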
11 replies
RunPod
Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
I'm not sure but you should probably change it in the Dockerfile. Setting it as an env variable probably won't work. (I may be wrong.)
76 replies
RunPod
Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
One more thing: it's recommended to use CUDA version 12.1. Try changing it by setting the env variable WORKER_CUDA_VERSION to 12.1.
76 replies
RunPod
Created by Concept on 1/15/2024 in #⚡|serverless
RunPod vLLM CUDA out of Memory
Hey! I had a similar issue with loading AWQ models with this worker. I resolved it by setting the GPU_MEMORY_UTILIZATION variable to 0.90.
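For context, GPU_MEMORY_UTILIZATION maps to vLLM's gpu_memory_utilization engine argument; it's roughly equivalent to this when constructing the engine directly (the model name here is just an example AWQ checkpoint):

from vllm import LLM, SamplingParams

# Roughly what GPU_MEMORY_UTILIZATION=0.90 translates to inside the worker.
llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.1-AWQ",  # example AWQ model, not the one from this thread
    quantization="awq",
    gpu_memory_utilization=0.90,
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)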
76 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Hey @Justin and @Alpay Ariyak! I just tried the latest version of worker-vllm, and there's still an issue related to concurrent requests. The problem is that MAX_CONCURRENCY doesn't seem to work. See here: https://github.com/runpod-workers/worker-vllm/issues/36
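For what it's worth, this is how I'd expect MAX_CONCURRENCY to feed into the SDK's concurrency_modifier hook; an assumption about the wiring, not the worker's actual code:

import os
import runpod

# Assumption: read MAX_CONCURRENCY from the environment and hand it to the SDK
# via concurrency_modifier, so that many requests can run on one worker.
def concurrency_modifier(current_concurrency: int) -> int:
    return int(os.environ.get("MAX_CONCURRENCY", "1"))

async def handler(job):
    return {"echo": job["input"]}  # placeholder handler body

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})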
13 replies
RunPod
Created by antoniog on 12/22/2023 in #⚡|serverless
Issue with Request Count Scale Type
got it, thanks
11 replies
RunPod
Created by antoniog on 12/20/2023 in #⚡|serverless
Issues with building the new `worker-vllm` Docker Image
thanks!
6 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
Yes, it's the same issue with the original vllm-fork-for-sls-worker too. I opened an issue: https://github.com/runpod-workers/worker-vllm/issues/25
9 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
I'm not sure but it may be related to the small modifications in the vllm-fork-for-sls-worker. I'll try to build the image with the original vllm-fork-for-sls-worker.
9 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
I'm getting an error when building the Docker image. See below.
1.527 Running command git clone --filter=blob:none --quiet https://github.com/antonioglass/vllm-fork-for-sls-worker.git /src/vllm
4.184 Resolved https://github.com/antonioglass/vllm-fork-for-sls-worker.git to commit 9797380bd9a3eef6a70e899d03b81f6967ecd287
4.192 Installing build dependencies: started
68.88 Installing build dependencies: still running...
131.1 Installing build dependencies: still running...
199.6 Installing build dependencies: still running...
964.9 Installing build dependencies: finished with status 'error'
965.0 error: subprocess-exited-with-error
965.0
965.0 × pip subprocess to install build dependencies did not run successfully.
965.0 │ exit code: 2
9 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
there was another issue, so I couldn't build the image...
9 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
How to build worker-vllm Docker Image without a model inside?
thanks!
9 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
I believe changing https://github.com/runpod/[email protected]#egg=vllm; in the Dockerfile to https://github.com/runpod/[email protected]#egg=vllm should work?
13 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Is it possible to use a different version of vLLM, e.g. 0.2.2?
13 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Thanks!
13 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
It's probably related to the new worker, right? I asked about the previous one.
13 replies
RunPod
Created by antoniog on 12/19/2023 in #⚡|serverless
Issue with worker-vllm and multiple workers
Hey! Yes, I have opened an issue in the repo: https://github.com/runpod-workers/worker-vllm/issues/22 Nope, it can't 😦
13 replies