RunPod•7mo ago
manan4884

Ollama API

Hello, I am trying to host LLMs on RunPod GPU Cloud using Ollama (https://ollama.com/download). I want to set it up as an endpoint so I can access it from my local laptop using Python libraries like LangChain. I'm having trouble setting up the API endpoint; has anyone worked with this before?
11 Replies
justin
justin•7mo ago
Yeah, just run the install script they tell you to, then you can do `ollama serve` in one terminal and `ollama run <model name>` in another
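For reference, once `ollama serve` is running inside the pod, you can sanity-check the API from a second terminal. A minimal sketch, assuming Ollama's default port 11434 and the `requests` package:

```python
import requests

# Ollama serves an HTTP API on 127.0.0.1:11434 by default.
# GET /api/tags lists the models that have been pulled locally.
resp = requests.get("http://127.0.0.1:11434/api/tags")
resp.raise_for_status()
print(resp.json())
```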
manan4884
manan4884•7mo ago
How can I make API calls to it, though? Is there a template that exposes it on a port?
ashleyk
ashleyk•7mo ago
You have to create your own; you can start with the PyTorch template as a base.
manan4884
manan4884•7mo ago
Do you know how I could expose it to an IP and port? I'm stuck on that part.
ashleyk
ashleyk•7mo ago
Either add an HTTP port to your pod, or ensure your pod has a public IP, add a TCP port, and then use the public IP and public port mapping shown under the Connect button.
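For the HTTP-port route, RunPod proxies the exposed port at a URL of the form `https://<pod-id>-<port>.proxy.runpod.net`. A sketch of calling Ollama through that proxy, assuming Ollama is listening on the exposed port; the pod ID below is a placeholder:

```python
import requests

# Hypothetical pod ID; replace with your own from the RunPod console.
POD_ID = "abc123xyz"
BASE_URL = f"https://{POD_ID}-11434.proxy.runpod.net"

# Same Ollama API as on localhost, just reached through RunPod's HTTP proxy.
resp = requests.get(f"{BASE_URL}/api/tags")
print(resp.json())
```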
manan4884
manan4884•7mo ago
I see, okay. Do you know the commands to make Ollama listen on that IP?
justin
justin•7mo ago
Follow this tutorial https://discord.com/channels/912829806415085598/1207214335605088266 and launch a Flask app to process your incoming API requests through Flask.
Actually, I'm guessing Ollama launches a backend locally, so wherever that port is, you can just bind to it directly and send the API requests there, since Ollama supports API requests.
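That backend is Ollama's built-in HTTP API, which listens on 127.0.0.1:11434 by default; setting `OLLAMA_HOST=0.0.0.0` before `ollama serve` makes it listen on all interfaces. A minimal sketch of calling it directly, assuming a model such as `llama2` has already been pulled on the pod:

```python
import requests

# POST /api/generate is Ollama's completion endpoint.
resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "llama2",        # assumes this model was pulled via `ollama pull`
        "prompt": "Why is the sky blue?",
        "stream": False,          # return one JSON object instead of a token stream
    },
)
print(resp.json()["response"])
```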
manan4884
manan4884•7mo ago
Thank you so much, I'll have a look 😄
justin
justin•7mo ago
@manan4884 https://discord.com/channels/912829806415085598/1207848538629742623 Enough people seem to be getting into Ollama, so I wrote up some simple setup instructions on binding it, etc.
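To close the loop on the original question, here is a sketch of pointing LangChain at the exposed endpoint from a local laptop. The import path varies by LangChain version; this assumes the `langchain_community` package and uses a placeholder URL:

```python
from langchain_community.llms import Ollama

# Placeholder endpoint; substitute your pod's public IP:port or RunPod proxy URL.
llm = Ollama(
    base_url="https://<pod-id>-11434.proxy.runpod.net",
    model="llama2",  # assumes this model is pulled on the pod
)

print(llm.invoke("Summarize what Ollama does in one sentence."))
```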
manan4884
manan4884•7mo ago
Thanks a lot for the help!