jackson hole
RunPod
•Created by jackson hole on 1/8/2025 in #⚡|serverless
Some basic confusion about the `handlers`
So basically this FastAPI app is already implemented in an async way, and I guess that should do the job!
9 replies
RunPod
•Created by jackson hole on 1/8/2025 in #⚡|serverless
Some basic confusion about the `handlers`
Actually my code structure looks like this:
And that calls the appropriate async function:
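A minimal sketch of that kind of structure, for illustration only: one async handler that reads the job input and dispatches to the matching async function. The names `generate_text`, `summarize_text`, `ROUTES` and the `task` field are hypothetical, not the code from this thread. It assumes the RunPod job dict carries the request under `"input"`.

```python
import asyncio


async def generate_text(payload):
    # Hypothetical worker: stands in for awaiting a real inference call.
    await asyncio.sleep(0)
    return {"task": "generate", "echo": payload.get("prompt", "")}


async def summarize_text(payload):
    # Hypothetical worker: stands in for awaiting a real summarization call.
    await asyncio.sleep(0)
    return {"task": "summarize", "length": len(payload.get("text", ""))}


# Map a task name (assumed field in the job input) to its async function.
ROUTES = {"generate": generate_text, "summarize": summarize_text}


async def handler(job):
    # The request payload is assumed to sit under job["input"].
    payload = job.get("input", {})
    task = payload.get("task")
    func = ROUTES.get(task)
    if func is None:
        return {"error": f"unknown task: {task!r}"}
    return await func(payload)


# With the RunPod SDK installed, the handler would typically be registered with
# runpod.serverless.start({"handler": handler}); left out so the sketch runs standalone.

if __name__ == "__main__":
    print(asyncio.run(handler({"input": {"task": "generate", "prompt": "hi"}})))
```

Calling `handler` directly with a fake job, as in the last lines, is a cheap way to check the dispatch logic before deploying.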
9 replies
RunPod
•Created by jackson hole on 1/8/2025 in #⚡|serverless
Some basic confusion about the `handlers`
Yes, that handler. So I am basically utilizing that already!? Wow.
9 replies
RunPod
•Created by jackson hole on 1/7/2025 in #⚡|serverless
How to monitor the LLM inference speed (generation token/s) with vLLM serverless endpoint?
Absolutely, but I found Discord (and nerdylive support) faster 😉
7 replies
RunPod
•Created by jackson hole on 1/7/2025 in #⚡|serverless
How to monitor the LLM inference speed (generation token/s) with vLLM serverless endpoint?
Oh yeah, I thought RunPod had built-in support for this. Thanks.
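Without built-in support, one rough way to estimate generation speed is to time the request on the client side and divide the generated token count by the elapsed time. The sketch below assumes the response is an OpenAI-style dict with `usage.completion_tokens`; `timed_generation` and `fake_request` are hypothetical helpers, and the measured time also includes network, queue and cold-start delay, so it understates pure decoding speed.

```python
import time


def tokens_per_second(completion_tokens, elapsed_seconds):
    # Plain throughput: generated tokens divided by wall-clock seconds.
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def timed_generation(send_request):
    # send_request is any zero-argument callable that performs the request
    # and returns a dict with usage.completion_tokens (assumed response shape).
    start = time.perf_counter()
    response = send_request()
    elapsed = time.perf_counter() - start
    tokens = response["usage"]["completion_tokens"]
    return tokens, elapsed, tokens_per_second(tokens, elapsed)


def fake_request():
    # Stand-in for a real call to the endpoint.
    time.sleep(0.01)
    return {"usage": {"completion_tokens": 50}}


if __name__ == "__main__":
    tokens, elapsed, rate = timed_generation(fake_request)
    print(f"{tokens} tokens in {elapsed:.3f}s = {rate:.1f} tokens/s")
```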
7 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
Absolutely mate
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
Thanks a lot -- looking forward to implementing these soon ✌🏻
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
I see, that will basically replace our "authentication server layer".
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
Fabulous. Thanks. ✨
One thing...
Generally, security is on our end, so we need to decide how we want to handle authentication.
There are several options like:
- Basic authentication (sending username and password in the header -- least secure)
- Some dynamic token -- signed or hashed with SHA (e.g. HMAC-SHA256) and that sort of stuff
- Create an API key per user account (just like OpenAI) and use that, etc.
Let's say we have selected one of these techniques. Is there any predefined framework that we can use, or do we need to code this logic from scratch?
I have heard of "AWS API Gateway" but am not sure about its relevance.
For context, we are using FastAPI as our HTTP request handler, and it will send the request to RunPod. So, the question: should we write the authentication logic ourselves, or are there libraries/services that can do this for us?
Thanks mate
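To make the "API key per user" option concrete, here is a minimal standard-library sketch: issue a random key, store only its SHA-256 hash, and compare in constant time on each request. `ApiKeyStore` and its methods are hypothetical names, and the in-memory dict stands in for a real database. In a FastAPI app, something like `fastapi.security.APIKeyHeader` could read the key from a header and pass it to `authenticate`, but that wiring is an assumption and is not shown.

```python
import hashlib
import hmac
import secrets


def hash_key(api_key: str) -> str:
    # Store only a hash so a leaked table does not expose usable keys.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


class ApiKeyStore:
    """In-memory stand-in for a database table of hashed API keys."""

    def __init__(self):
        self._hash_to_user = {}

    def issue_key(self, user_id: str) -> str:
        # Generate a random key, keep its hash, return the plaintext once.
        api_key = secrets.token_urlsafe(32)
        self._hash_to_user[hash_key(api_key)] = user_id
        return api_key

    def authenticate(self, presented_key: str):
        # Return the owning user id, or None if the key is unknown.
        presented_hash = hash_key(presented_key)
        for stored_hash, user_id in self._hash_to_user.items():
            # Constant-time comparison of the two hex digests.
            if hmac.compare_digest(stored_hash, presented_hash):
                return user_id
        return None


if __name__ == "__main__":
    store = ApiKeyStore()
    key = store.issue_key("alice")
    print(store.authenticate(key))       # the id the key was issued for
    print(store.authenticate("wrong"))   # None for an unknown key
```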
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
Alrighty, then I guess I should go ahead with that visualization.
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
If the basic setup is good enough, then okay; otherwise you may guide me further, thanks.
20 replies
RunPod
•Created by jackson hole on 1/3/2025 in #⚡|serverless
How is the architecture set up in the serverless (please give me a minute to explain myself)
Damn, the visualization got removed (the attached image, if that's what you meant, was just there to grab attention 😅). The question is really a request for guidance on the standard architecture design for deploying LLMs with authentication.
20 replies