JM
RunPod
Created by Jas on 5/21/2024 in #⛅|pods
"The port is not up yet"
Nope, not even a single GPU card! Nice milestone^^ 🔥
94 replies
RunPod
Created by Jas on 5/21/2024 in #⛅|pods
"The port is not up yet"
@digigoblin
- Correction: I said we are moving towards being 12.1+ for all GPUs; it's not fully done yet.
- At the moment, we are completely done sunsetting CUDA 11.8 and older.
- Currently working on sunsetting CUDA 12.0 too; it will take a couple of weeks to finish 🙂 (we now have less than 4% of GPUs on 12.0^^)
94 replies
RunPod
Created by flowtyone on 3/17/2024 in #⚡|serverless
Didn't get response via email, trying my luck here
Sure
13 replies
RunPod
Created by flowtyone on 3/17/2024 in #⚡|serverless
Didn't get response via email, trying my luck here
Happy to connect about all the considerations you mentioned 🙂
13 replies
RunPod
Created by flowtyone on 3/17/2024 in #⚡|serverless
Didn't get response via email, trying my luck here
hey @flowtyone
13 replies
RunPod
Created by Dhruv Mullick on 3/4/2024 in #⛅|pods
Frequent GPU problem with H100
I believe the problem is largely solved for H100s. We will now be looking to automate the script and expand it to all servers on RunPod. In the meantime, do not hesitate to reach out if you have any questions 🙂
23 replies
RunPod
Created by Dhruv Mullick on 3/4/2024 in #⛅|pods
Frequent GPU problem with H100
So, we got a very good detection tool in place now, but it's manual
23 replies
RunPod
Created by Dhruv Mullick on 3/4/2024 in #⛅|pods
Frequent GPU problem with H100
@Dhruv Mullick I remembered you sir! 😉
23 replies
RunPod
Created by Bryan on 3/9/2024 in #⛅|pods
GPU speed getting slower and slower
As I understand it, all 4090 servers there are high quality, but if not, we need to know which ones so we can solve this
14 replies
RunPod
Created by Bryan on 3/9/2024 in #⛅|pods
GPU speed getting slower and slower
Community Cloud definitely has variability. For Secure Cloud I am surprised; could you provide the pod IDs of 2 GPUs where you observe this?
14 replies
RunPod
Created by Bryan on 3/9/2024 in #⛅|pods
GPU speed getting slower and slower
Hey @Bryan @kopyl
14 replies
RunPod
Created by Dhruv Mullick on 3/4/2024 in #⛅|pods
Frequent GPU problem with H100
@Dhruv Mullick H100 PCIe cards have caused us lots of headaches lately. We are soon releasing a very powerful detection tool for all RunPod servers, which will help us fix these non-trivial issues. It seems the cause is always some specific kernel version that might not be compatible even though it's supposed to be. That being said, expect a strong resolution in the near term!
23 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Yep, engineering has been working very hard to help Justin and me lately; new admin features like this one always help so much! Take care sir, let me know if you need anything. Need to go to bed now
46 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
@ashleyk Credited the account! Thanks for helping everyone
46 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Apologies for the delay in responding
46 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Btw, I was literally buried in work; I found more hardware for everyone
46 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
That's no good, thanks for explaining
46 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Uh
46 replies
RunPod
Created by Jidovenok on 2/21/2024 in #⚡|serverless
All 27 workers throttled
Sure, thanks both!
239 replies
RunPod
Created by Jidovenok on 2/21/2024 in #⚡|serverless
All 27 workers throttled
Hey @marshall @HyS | The World of Ylvera @ashleyk I onboarded a huge load of hardware. However, the minimum RunPod should be able to do is provide high-quality communication, which I see wasn't ideal. Zhen, Pardeep, Justin and I have been pushing hard on at least 5 different features to make Serverless much better at managing huge loads. Secondly, we hired 3 support staff and 2 cloud engineers, and are looking for more support engineers as well. Communication must improve, and it will, trust me. That being said, we value relationships above all else. All else. Hit me up in private and we will provide compensation for you.
239 replies