Llama 3.1 via Ollama
You can now use the tutorial on running Ollama on serverless environments (https://docs.runpod.io/tutorials/serverless/cpu/run-ollama-inference) in combination with Llama 3.1.
We have tested this with Llama 3.1 8B, using a network volume and a 24 GB GPU PRO. Please let us know if this setup also works with other weights and GPUs.
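In case it helps anyone following along, here is a minimal sketch of querying such a serverless endpoint from Python, assuming the standard RunPod serverless request format. The endpoint ID and the prompt field inside input are placeholders/assumptions; check the tutorial and your worker's handler for the exact schema it expects.
```python
# Minimal sketch: query a RunPod serverless endpoint that runs Ollama.
# ENDPOINT_ID and the "prompt" field are placeholders/assumptions; the exact
# input schema depends on the worker image you deploy.
import os
import requests

ENDPOINT_ID = "your-endpoint-id"          # hypothetical endpoint ID
API_KEY = os.environ["RUNPOD_API_KEY"]    # your RunPod API key

response = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Why is the sky blue?"}},
    timeout=300,
)
response.raise_for_status()
print(response.json())
```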
Docs on that Docker image are now updated. Thanks for the ping!
@PatrickR thank you very much!
Better Ollama - CUDA12 works with GPU.
When you say "In the Container Start Command field, specify the Ollama supported model", do you mean literally just pasting the ollama model ID into that field?
Yes. Like orca-mini or llama3.1.
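So the Container Start Command field contains nothing but the model name, for example:
```
llama3.1
```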
Also, the Docker image was just updated to version 0.0.9:
pooyaharatian/runpod-ollama:0.0.9
I keep getting JSON decoding errors trying to run queries on it...
Are you passing this?
Yeah:
request:
Downgrade the Docker image to 0.0.7.
I also see this error for 0.0.9, so please use 0.0.8, as that one is working.
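Concretely, that means pointing the template's Container Image field at the working tag:
```
pooyaharatian/runpod-ollama:0.0.8
```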
I opened https://github.com/pooyahrtn/RunpodOllama/issues/11 to get this fixed.
Yes, like this:
That works, thanks!
Perfect, have fun!
Would you mind updating the version in the tutorial as well? And go back to 0.0.8? 🙏
Reverted in the docs.
Thank you very much!