RAG on serverless LLM

I am running a serverless LLM. I want to augment the model with a series of PDF files. I can do this in WebUI on a dedicated GPU by adding Knowledge.
2 Replies
Jason (6d ago)
Okay, any questions?
ericmsilver (OP, 4d ago)
Is it possible to do? Here is where I am at: I am looking to run WebUI in the cloud and do RAG via WebUI. This works now using an existing RunPod template; Knowledge works great, so RAG works as well. Ultimately I want to run WebUI on a dedicated machine, but the end goal is to use a serverless endpoint and have the WebUI server make the calls to it. I can connect to the endpoint in Python and it works well. Does this concept make sense?

More progress: adding the serverless connection to WebUI worked. WebUI is now able to query the model through the serverless endpoint.
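The "connect to the endpoint in Python" step can be sketched roughly as below, assuming a RunPod serverless endpoint with the standard `/runsync` route; the endpoint ID, API key, and `input` payload shape are placeholders and depend on your worker's handler:

```python
# Sketch: calling a RunPod serverless endpoint from Python (stdlib only).
# Endpoint ID, API key, and the "input" payload shape are placeholders.
import json
import urllib.request

def build_request(endpoint_id: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for a RunPod serverless /runsync call."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    payload = {"input": {"prompt": prompt}}  # shape depends on your handler
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("my-endpoint-id", "MY_API_KEY", "Summarize the uploaded PDFs.")
# With real credentials, send it and read the JSON response:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

In WebUI itself, the same endpoint is typically added as an OpenAI-compatible connection (base URL plus API key) rather than called by hand like this; the snippet only mirrors the direct Python test described above.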
