This user has a great template https://github.com/SvenBrnn/runpod-worker-ollama :)
https://hub.docker.com/r/svenbrnn/runpod-ollama
is it as good as the runpod vllm template? in terms of performance and concurrency stuff
I haven't tested it personally but I can only assume so? I can give it a try for you in the morning if you don't get to test it out tonight.
no, it's most likely not. There is no cache implemented like the one vLLM uses, so startup will take a bit longer than with vLLM.
the wrapper just starts an ollama instance inside the container and translates RunPod requests into requests ollama can understand and answer. It's also nice for gguf models or models from the official ollama repository, but it's just a small project.
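To make the "wrapper" idea concrete, here is a minimal sketch of what such a worker might look like, assuming the RunPod Python SDK's handler convention and Ollama's default HTTP API on `localhost:11434`. The function names and the job-input schema here are illustrative assumptions, not taken from the actual project.

```python
import json
import urllib.request

# Ollama's default local generate endpoint (assumption: default port, no auth)
OLLAMA_URL = "http://localhost:11434/api/generate"


def job_to_ollama(job):
    """Translate a RunPod serverless job payload into an Ollama request body.

    The "input" keys used here are an illustrative schema, not the
    project's actual one.
    """
    inp = job.get("input", {})
    return {
        "model": inp.get("model", "llama3"),
        "prompt": inp.get("prompt", ""),
        "stream": False,  # ask Ollama for a single JSON object, not a stream
    }


def handler(job):
    """RunPod handler: forward the job to the local Ollama server."""
    body = json.dumps(job_to_ollama(job)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# With the real RunPod SDK the worker would then be started with:
# import runpod
# runpod.serverless.start({"handler": handler})
```

Because no inference cache or batching layer sits in between, each cold start pays the full cost of launching Ollama and loading the model, which is where the gap to the vLLM template comes from.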
I just added automatic container updates today, so there should always be a new container ready for new ollama versions within at most 24h now.
It will, however, fully work with all endpoints, including the OpenAI-compatible ones.
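For the OpenAI-compatible side, a client would just build a standard chat-completions request against whatever base URL the deployed worker exposes. This is a hedged sketch using only the standard library; the base URL and API key handling are assumptions about the deployment, and only the request path follows the OpenAI convention.

```python
import json
import urllib.request


def build_chat_request(base_url, model, messages):
    """Build an OpenAI-style chat completion request (URL + JSON body).

    base_url is whatever the deployed worker exposes (assumption);
    the path follows the OpenAI chat-completions convention.
    """
    url = base_url.rstrip("/") + "/v1/chat/completions"
    payload = {"model": model, "messages": messages}
    return url, payload


def send_chat(base_url, model, messages, api_key=""):
    """POST the request and return the decoded JSON response."""
    url, payload = build_chat_request(base_url, model, messages)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Since the request shape is the standard OpenAI one, existing OpenAI client libraries should also work by pointing their base URL at the worker.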