!x.com/dominicfrei
RunPod
•Created by !x.com/dominicfrei on 5/2/2024 in #⛅|pods-clusters
Maintenance - only a Community Cloud issue?
Or would you know of any?
408 replies
Like any decent uncensored, RP-capable model, it seems.
I wish someone out there offered a service running https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2 or https://huggingface.co/Undi95/Llama3-Unholy-8B-OAS, but it seems like I have to host them myself. 😄
That's what I'm currently considering: building the image myself using tabbyAPI and https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2
So, I got vLLM running locally now to test it out and see if that's an option. The results are really cool, but not the right approach for me. 😄
100 completions in about 100s is amazing total throughput, but each individual completion is too slow. I wonder if tabbyAPI (with sequential requests) and multiple parallel workers might actually be better. I expect no more than 2-3 requests at the same time for now, and no more than 5 for a while (assumptions can be wrong, of course).
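To put rough numbers on that trade-off, here is a back-of-envelope sketch. Only the 100 completions in about 100s figure and the 2-3 (up to 5) concurrent requests come from my test and my expectations. The 4-second sequential completion time and the 3 workers are made-up placeholders, and the worst-case batch latency assumes all 100 requests were submitted at once.

```python
import math


def batch_stats(num_requests: int, wall_seconds: float) -> dict:
    """Aggregate view of a batched run where all requests start together."""
    return {
        "throughput_rps": num_requests / wall_seconds,
        # Assumption: every request was submitted at t=0, so the slowest
        # request waited for the whole run to finish.
        "worst_latency_s": wall_seconds,
    }


def sequential_worst_latency(concurrent: int, workers: int, per_request_s: float) -> float:
    """Worst-case wait when each worker serves one request at a time."""
    rounds = math.ceil(concurrent / workers)  # requests queue up in rounds
    return rounds * per_request_s


# Measured in the local vLLM test: about 100 completions in about 100 s.
print(batch_stats(100, 100.0))

# Hypothetical: 3 workers, each needing 4 s per completion (placeholder values).
for n in (2, 3, 5):
    print(n, "concurrent ->", sequential_worst_latency(n, workers=3, per_request_s=4.0), "s worst case")
```

With these placeholder numbers, sequential workers only look better while the number of simultaneous requests stays close to the worker count. A real comparison needs measured per-request latencies from both setups.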
I guess I have to try that next to get to my goal. haha
Thank you!
Great. Wasn't obvious to me that it's that easy. 😄
Ah, cool! Can you point me to the documentation about how that works on runpod and how to get started? 🙂
@Papa Madiator Maybe you can clarify? 🙂
But I don't have access to the underlying pod in serverless; I can only deploy templates provided by RunPod (if I understand correctly).
You always know everything, @Wolfsauge 😍
Unfortunately, the serverless config (https://github.com/runpod-workers/worker-vllm) is still on vLLM 0.3, so what can we do about that? 🤔
Also, should I create new threads for those questions? We've been drifting away from the original topic quite a bit. 😄
Not quite there yet, but I'm working on it. Maybe you can help me with this question, though: I sometimes get 2-minute cold boots. What's your advice on workers (active, etc.) and on making sure that cold boots never take long? 🙂
Do I need to keep active workers at 1? How does the serverless vLLM template handle parallel requests?
cc @drycoco
Well, we're not quite there yet. 😉
And how to use the model I want, which actually isn't compatible with vLLM. 😄
And more importantly, how it handles parallel requests.