RunPod · 4mo ago
Laikh

Unable to start pod with MI300X

Observing a "hang" when starting a pod with 8x MI300X (screenshot attached). Any ideas on how to fix this?
3 Replies
yhlong00000 · 4mo ago
I am able to run with 8x MI300X using the official templates. I am wondering if it is something related to your image?
nerdylive · 4mo ago
What's your Dockerfile like? How do you call the CMD / ENTRYPOINT?
Laikh · 4mo ago
Gotcha. This was the image that was used: https://hub.docker.com/r/eliovp/rocm6.1.2_py3.10_torch2.5_vllm0.5_bkc
Using the official ROCm PyTorch images from RunPod seems to work. Thanks!