ammar · 3w ago

Ollama serverless?

Is there any easy way to run Ollama over serverless?
4 Replies
Dj · 3w ago
GitHub - SvenBrnn/runpod-worker-ollama: A serverless ollama worker for runpod.io.
Yebs · 3w ago
Is it as good as the RunPod vLLM template in terms of performance and concurrency?
Dj · 3w ago
I haven't tested it personally but I can only assume so? I can give it a try for you in the morning if you don't get to test it out tonight.
SvenBrnn · 3w ago
No, most likely not. There is no cache implemented like the one vLLM can use, so startup will take a bit longer: the wrapper just starts Ollama inside the container and translates RunPod requests into something Ollama can understand and answer. It's nice for GGUF models or models from the official Ollama repository, but it's just a small project. I added automatic container updates today, so there should always be a new container ready for new Ollama versions within 24 hours at most. It will, however, fully work with all endpoints, including the OpenAI-compatible ones.
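For context, calling such a worker generally follows RunPod's standard serverless request format, where the payload is wrapped in an `{"input": ...}` envelope. The sketch below shows what a request might look like; the endpoint ID and the exact fields inside `"input"` (model name, prompt) are assumptions about what the worker expects, so check the repo's README before relying on them.

```python
import json
import urllib.request

# RunPod's synchronous serverless endpoint URL pattern.
RUNPOD_API_URL = "https://api.runpod.ai/v2/{endpoint_id}/runsync"


def build_request(endpoint_id: str, api_key: str, prompt: str):
    """Build the URL, headers, and JSON body for a RunPod /runsync call.

    The {"input": ...} envelope is RunPod's standard serverless format;
    the keys inside "input" (model/prompt) are a hypothetical schema for
    the ollama worker, not confirmed against its README.
    """
    url = RUNPOD_API_URL.format(endpoint_id=endpoint_id)
    payload = {"input": {"model": "llama3", "prompt": prompt}}  # hypothetical schema
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload)


# To actually send it (requires a real endpoint ID and API key):
# url, headers, body = build_request("your-endpoint-id", "your-api-key", "Hello!")
# req = urllib.request.Request(url, data=body.encode(), headers=headers)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The request-sending part is left commented out since it needs real credentials; only the payload construction is shown runnable.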