no compatible serverless GPUs found while following tutorial steps

hi, i'm trying to run orca-mini on serverless by following this tutorial [https://docs.runpod.io/tutorials/serverless/cpu/run-ollama-inference]. whenever the download finishes, i get the error messages below and then the checkpoint download restarts.
2025-01-07 22:02:53.719[1vt59v6j5ku3yh][info][GIN] 2025/01/07 - 22:02:45 | 200 | 4.060412ms | 127.0.0.1 | HEAD "/"
2025-01-07 22:02:53.719[1vt59v6j5ku3yh][info]time=2025-01-07T22:02:45.001Z level=INFO source=types.go:105 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="4.4 GiB" available="4.2 GiB"
2025-01-07 22:02:53.719[1vt59v6j5ku3yh][info]time=2025-01-07T22:02:45.001Z level=INFO source=gpu.go:346 msg="no compatible GPUs were discovered"
2025-01-07 22:02:53.719[1vt59v6j5ku3yh][info]time=2025-01-07T22:02:44.975Z level=INFO source=gpu.go:205 msg="looking for compatible GPUs"
2025-01-07 22:02:53.719[1vt59v6j5ku3yh][info]time=2025-01-07T22:02:44.975Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 rocm_v60102]"
3 Replies
tzar_impersonator
i'm using 2 vCPUs with 5 GB of storage space.
update - after moving to 4 vCPUs, i instead get this error message in the logs when i send a request:
2025-01-07T22:27:28.942725265Z verifying sha256 digest
2025-01-07T22:27:28.942728975Z writing manifest
2025-01-07T22:27:28.942730835Z removing any unused layers
2025-01-07T22:27:28.942733695Z success
2025-01-07T22:27:31.218137093Z --- Starting Serverless Worker | Version 1.6.2 ---
2025-01-07T22:27:31.511124984Z {"requestId": "2e61a451-6bb4-4fbc-b83b-415bfa5423e6-e1", "message": "Started.", "level": "INFO"}
2025-01-07T22:27:31.512377468Z {"requestId": "2e61a451-6bb4-4fbc-b83b-415bfa5423e6-e1", "message": "Captured Handler Exception", "level": "ERROR"}
2025-01-07T22:27:31.512648227Z {"requestId": null, "message": "{\n \"error_type\": \"<class 'KeyError'>\",\n \"error_message\": \"'input'\",\n \"error_traceback\": \"Traceback (most recent call last):\\n File \\\"/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py\\\", line 134, in run_job\\n handler_return = handler(job)\\n File \\\"//runpod_wrapper.py\\\", line 26, in handler\\n input[\\\"input\\\"][\\\"stream\\\"] = False\\nKeyError: 'input'\\n\",\n \"hostname\": \"j2fos5d965rol9-6441166f\",\n \"worker_id\": \"j2fos5d965rol9\",\n \"runpod_version\": \"1.6.2\"\n}", "level": "ERROR"}
2025-01-07T22:27:31.669738576Z {"requestId": "2e61a451-6bb4-4fbc-b83b-415bfa5423e6-e1", "message": "Finished.", "level": "INFO"}
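(for anyone else hitting this KeyError: the traceback shows the wrapper doing input["input"]["stream"] = False, so the request body needs a top-level "input" key. a request shaped roughly like this avoided it for me - the inner method_name/prompt fields are just my reading of the tutorial's schema, so double-check there)
{
  "input": {
    "method_name": "generate",
    "input": {
      "prompt": "Why is the sky blue?"
    }
  }
}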
update 2 - this was caused by my not following the tutorial properly; i wasn't sending requests in the format the tutorial expects. the latest weird thing i've seen is this response to a request:
{
"delayTime": 7579,
"error": "model \"orca-mini:3b\" not found, try pulling it first",
"executionTime": 2749,
"id": "2858b743-fc92-43df-bc93-4ad6cb67ca25-e1",
"status": "FAILED",
"workerId": "j2fos5d965rol9"
}
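(for context: as far as i can tell, ollama matches models by exact tag, so orca-mini and orca-mini:3b count as different tags. on a regular ollama install you'd fix this with
ollama pull orca-mini:3b
but on this serverless setup the worker pulls whatever tag is in the Container Start Command at startup, so the tag in the request has to match that one.)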
nerdylive
did you input any model? i'm not sure how it works / how it should work. did you do this step (step 2 of the tutorial)?
"In the Container Start Command field, specify the Ollama-supported model, such as orca-mini or llama3.1. Allocate sufficient container disk space for your model. Typically, 20 GB should suffice for most models. (optional) In Environment Variables, set a new key OLLAMA_MODELS with its value set to /runpod-volume. This will allow the model to be stored on your attached volume."
maybe the model name is wrong
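for reference, the endpoint settings would look something like this (orca-mini is just an example tag, and 20 GB is a ballpark, not a hard requirement):
Container Start Command: orca-mini
Container Disk: 20 GB
Environment Variables: OLLAMA_MODELS=/runpod-volume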
tzar_impersonator
hey, thanks for your help. my best guess is that the pod hadn't initialised properly or something the first time i ran it. it started working perfectly on the second try
