!x.com/dominicfrei
RunPod
Created by !x.com/dominicfrei on 5/2/2024 in #⛅|pods
Maintenance - only a Community Cloud issue?
Or would you know of any?
408 replies
Like any decent uncensored, RP-capable model, it seems.
I wish there was anyone out there, offering a service that's running https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2 or https://huggingface.co/Undi95/Llama3-Unholy-8B-OAS but it seems like I have to host them myself. 😄
That's what I'm currently considering. Building the image myself using tabbyAPI and https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2
So, I got vLLM running locally now to test it out and see if that's an option. The results are really cool, but not the right approach for me. 😄
============================= test session starts =============================
collecting ... collected 1 item

test_model_wrapper.py::test_complete

======================== 1 passed in 100.25s (0:01:40) ========================
PASSED [100%]

Time to set up: 0.1114599 seconds


Runs: [31.6107654, 31.6132964, 34.2333528, 34.2130582, 34.2042625, 34.2340227, 34.2474756, 34.2105221, 34.2394911, 34.2595923, 34.2379803, 34.2343132, 34.2297294, 34.2636501, 34.223144, 34.2007176, 34.2261261, 34.1895347, 34.2410051, 34.2694363, 34.2138979, 34.219749, 34.2380913, 34.2763124, 34.2519246, 34.24931, 34.2471614, 34.2769107, 42.0093234, 42.0015118, 42.0404314, 42.0552968, 42.0382447, 42.0084274, 42.0031958, 42.054302, 42.002341, 42.0075319, 42.0172066, 47.9992168, 47.953934, 47.9645648, 48.0265022, 48.0249268, 47.951652, 48.0254267, 47.9690977, 49.3527739, 49.3451308, 50.7696567, 50.767339, 50.7534204, 50.775435, 50.759712, 50.7630917, 50.7589791, 50.7768211, 53.4804957, 53.4305316, 56.1890333, 56.1694433, 56.1409132, 56.1523421, 56.1788844, 56.1737424, 56.1204063, 56.1047533, 56.1247003, 56.1293441, 56.1116698, 56.1414604, 57.3100886, 57.2900195, 57.3443251, 58.5790726, 59.776073, 59.6932115, 59.7322922, 59.7565171, 59.7780871, 59.7480857, 60.7950622, 63.1008595, 62.9985082, 63.0248589, 63.0549046, 64.3303572, 64.3385164, 64.3284183, 68.3025694, 68.2875625, 69.6675595, 69.6902716, 70.5738979, 70.6244101, 70.5820796, 71.4233821, 78.2565978, 81.2923646, 99.9321581]


Average run time: 49.985901578 seconds
100 completions in about 100s is amazing for the total result. But the individual completion is too slow. I wonder if tabbyAPI (with sequential requests) and multiple parallel workers might actually be better. I expect no more than 2-3 requests at the same time for now. And no more than 5 for a while (assumptions can be wrong of course).
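To make the throughput-versus-latency trade-off above concrete, here is a minimal sketch of how such a batch could be summarized. The helper name `summarize_runs` and the sample values are hypothetical and chosen only for illustration; they are not the benchmark figures, and this is not part of vLLM or tabbyAPI.

```python
import statistics


def summarize_runs(latencies, wall_clock_seconds):
    """Summarize a batch of completions that were issued concurrently.

    latencies: per-request completion times in seconds (hypothetical input).
    wall_clock_seconds: total elapsed time for the whole batch.
    """
    ordered = sorted(latencies)
    return {
        # Aggregate view: how many completions finish per second overall.
        "throughput_per_s": len(ordered) / wall_clock_seconds,
        # Per-request view: how long a single caller actually waits.
        "mean_latency_s": statistics.fmean(ordered),
        "median_latency_s": statistics.median(ordered),
        "max_latency_s": ordered[-1],
    }


# Illustrative values only, not taken from the test run above.
summary = summarize_runs([2.0, 4.0, 6.0, 8.0], wall_clock_seconds=8.0)
print(summary)
```

Feeding the `Runs` list and the total test time into a helper like this would show the same picture as the log: high aggregate throughput, but each individual caller waiting far longer than a single sequential completion would take.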
I guess I have to try that next to get to my goal. haha
Thank you!
Great. Wasn't obvious to me that it's that easy. 😄
Ah, cool! Can you point me to the documentation about how that works on runpod and how to get started? 🙂
@Papa Madiator Maybe you can clarify? 🙂
But I don't have access to the underlying pod in serverless, I can only deploy templates provided in runpod (if I understand correctly).
You always know everything, @Wolfsauge 😍 Unfortunately, the serverless config (https://github.com/runpod-workers/worker-vllm) is vLLM 0.3, so what can we do about that? 🤔
Also, should I create new threads for those questions? We've been drifting away from the original topic quite a bit. 😄
Not quite there yet, but I'm working on it. Maybe you can help me with a question, though. I sometimes get 2-minute cold boots. What's your advice on workers (active, etc.) and on making sure that cold boots never take long? 🙂 Do I need to keep active workers at 1? How does the serverless vLLM template handle parallel requests?
cc @drycoco
Well, we're not quite there yet. 😉
And how to use the model I want to use, which is not compatible with vLLM actually. 😄
And more importantly, how it handles parallel requests.