norefreshing Posts - Answer Overflow

•Created by norefreshing on 2/22/2024 in #⛅｜pods

Too many failed requests

Hello. I've tried to run casperhansen/mixtral-instruct-awq (https://huggingface.co/casperhansen/mixtral-instruct-awq) on A100 80GB and A100 SXM 80GB GPUs, sending 10 requests per second with this script: https://github.com/vllm-project/vllm/blob/main/benchmarks/benchmark_serving.py. However, most of the requests failed, and vLLM logged "Aborted request" for them. The issue didn't occur on another platform with the same GPU and the same code, so I'm not sure whether the problem is with vLLM or with RunPod's internal processing. Could anyone provide guidance on what the cause might be?
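One way to narrow this down is to reproduce the load with a small client that only counts successes and failures, so that client-side pacing can be separated from server-side aborts. The sketch below is not the linked benchmark script: `run_load` and `fake_request` are hypothetical names, the exponentially distributed gaps between requests are an assumption about how the load is paced, and `fake_request` is a stand-in that would need to be replaced with a real HTTP call to the endpoint under test.

```python
import asyncio
import random
from collections import Counter


async def fake_request(i: int) -> bool:
    """Hypothetical stand-in for one HTTP call; returns True on success."""
    await asyncio.sleep(0.01)  # pretend the server takes 10 ms
    return True


async def run_load(send, num_requests: int, rate: float, seed: int = 0) -> Counter:
    """Start num_requests calls with random gaps averaging 1/rate seconds.

    Returns a Counter with "ok" and "failed" totals. A call counts as
    failed if it raises or returns anything other than True.
    """
    rng = random.Random(seed)
    tasks = []
    for i in range(num_requests):
        tasks.append(asyncio.create_task(send(i)))
        # Exponential gaps (assumed pacing); mean gap is 1/rate seconds.
        await asyncio.sleep(rng.expovariate(rate))
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return Counter("ok" if r is True else "failed" for r in results)


if __name__ == "__main__":
    # 10 requests per second, as in the post; 50 requests in total.
    print(asyncio.run(run_load(fake_request, num_requests=50, rate=10.0)))
```

If a client like this shows the same failure ratio on both platforms, the pacing is probably not the cause, and the difference is more likely somewhere between the client and the vLLM process.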

7 replies

RunPod

•Created by norefreshing on 2/6/2024 in #⛅｜pods

How can I use ollama Docker image?

Hello. I've been trying to serve ollama on RunPod using the ollama Docker image (https://hub.docker.com/r/ollama/ollama), but I haven't found a way to run it. I tried putting the docker run ... command in the Container Start Command input, but I got an error: unknown command "docker" for "ollama". Does anyone know the correct way to use ollama on RunPod?
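The error message suggests that the start command is handed to the ollama binary as arguments instead of being run in a shell. The sketch below illustrates that reading under two assumptions: that the image's entrypoint is the ollama executable (the path `/bin/ollama` is a guess), and that the Container Start Command is appended to the entrypoint the way Docker appends a command to an exec-form ENTRYPOINT. The helper `effective_argv` is hypothetical and only models the argument list; it does not start anything.

```python
import shlex


def effective_argv(entrypoint: list[str], start_command: str) -> list[str]:
    """Model the final argv: entrypoint followed by the split start command."""
    return entrypoint + shlex.split(start_command)


if __name__ == "__main__":
    entrypoint = ["/bin/ollama"]  # assumed entrypoint of the image

    # "docker" becomes the first argument to ollama, which would explain
    # the 'unknown command "docker" for "ollama"' error.
    print(effective_argv(entrypoint, "docker run ollama/ollama"))

    # Passing only an ollama subcommand gives a valid invocation.
    print(effective_argv(entrypoint, "serve"))
```

If these assumptions hold, the image name would go in the pod's container image field, and the start command would contain only an ollama subcommand such as serve, or be left empty.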

12 replies