Ollama API
Hello, I am trying to host LLMs on RunPod GPU Cloud using Ollama (https://ollama.com/download). I want to set it up as an endpoint so I can access it from my local laptop using Python libraries like LangChain. I'm having trouble setting up the API endpoint; has anyone worked with this before?
Yeah, just run the install script they tell you to, then you can do ollama serve
in one terminal
and ollama run (model name) in another.
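For reference, once ollama serve is running on the pod, a quick sanity check from the same machine could look like the sketch below. It assumes Ollama's default local address (127.0.0.1:11434) and uses "llama3" as a placeholder model name; swap in whichever model you actually pulled.
```python
# Sketch only: query the local Ollama HTTP API from the same pod.
# Assumes the default bind address 127.0.0.1:11434 and a pulled model
# named "llama3" (placeholder -- use whatever you ran with ollama run).
import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={"model": "llama3", "prompt": "Say hi", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```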
How can I make API calls to it, though? Is there a template that exposes it on a port?
You have to create your own; you can start with the PyTorch template as a base.
Do you know how I could expose it to an IP and port? I'm stuck on that part.
Either add an HTTP port to your pod, or ensure your pod has a public IP, add a TCP port, and then use the public IP and public port mapping under the Connect button.
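To make that concrete, here is a rough sketch of calling the pod from a local laptop once a TCP port is exposed. The IP and port are placeholders for whatever RunPod shows under the Connect button, and it assumes Ollama on the pod was started listening on all interfaces rather than only on localhost.
```python
# Sketch: call the exposed pod from a local machine with plain requests.
# POD_IP / POD_PORT are placeholders -- copy the public IP and external
# port mapping from the pod's Connect button. Also assumes Ollama on the
# pod is bound to 0.0.0.0 so it accepts non-local connections.
import requests

POD_IP = "203.0.113.10"   # placeholder public IP
POD_PORT = 11434          # placeholder external port mapping

resp = requests.post(
    f"http://{POD_IP}:{POD_PORT}/api/generate",
    json={"model": "llama3", "prompt": "Hello from my laptop", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```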
I see okay, do you know the commands to add this ip to Ollama?
Expose ports | RunPod Documentation: Learn to expose your ports.
Follow this tutorial
https://discord.com/channels/912829806415085598/1207214335605088266
Launch a Flask app
and process your incoming API requests through Flask.
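A minimal sketch of that Flask idea, assuming Flask and requests are installed on the pod, Ollama is on its default local port 11434, and port 8000 is the pod port you expose (both port numbers and the "llama3" default model are placeholders):
```python
# Hedged sketch of the Flask-proxy approach: a small app on an exposed pod
# port that forwards prompts to the Ollama backend on the same machine.
from flask import Flask, jsonify, request
import requests

app = Flask(__name__)
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # Ollama's default local port

@app.route("/generate", methods=["POST"])
def generate():
    body = request.get_json(force=True)
    upstream = requests.post(
        OLLAMA_URL,
        json={
            "model": body.get("model", "llama3"),  # placeholder default model
            "prompt": body["prompt"],
            "stream": False,
        },
        timeout=300,
    )
    upstream.raise_for_status()
    return jsonify(upstream.json())

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the exposed pod port is reachable from outside.
    app.run(host="0.0.0.0", port=8000)
```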
Actually, I'm guessing Ollama launches a backend locally,
so wherever that port is, you can just bind to that port directly
and then send the API requests there since Ollama supports API requests
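Since the original goal was LangChain, a client-side sketch along those lines might look like the following. The import path assumes the langchain_community integration (newer LangChain versions ship a separate langchain-ollama package instead), and the base URL and model name are placeholders for your pod's exposed address and pulled model.
```python
# Sketch: point LangChain's Ollama integration at the exposed pod endpoint.
# base_url and model are placeholders; the import path assumes the
# langchain_community package (newer setups may use langchain_ollama instead).
from langchain_community.llms import Ollama

llm = Ollama(
    base_url="http://203.0.113.10:11434",  # placeholder public IP:port
    model="llama3",                        # placeholder model name
)
print(llm.invoke("Summarize what RunPod is in one sentence."))
```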
Thank you so much, I'll have a look 😄
@manan4884 https://discord.com/channels/912829806415085598/1207848538629742623
Enough people seem to be getting into Ollama, so I wrote some simple setup instructions
on binding it, etc.
Thanks a lot for the help!