Assistance Requested for Pod Initialization Issue
Hello @SupportTeam and RunPod community,
I'm reaching out regarding an issue with starting pods that seems to be a common problem for several users here. Like others, my pod is experiencing prolonged "image pull pending" and initiation loops, making it impossible to use the service effectively.
I contacted chat support, but unfortunately the support agent abandoned the chat.
To abide by the post guidelines and to keep certain details private, I have not included my Pod ID in this message. However, I'm ready to provide this information through a more secure method if needed.
I would greatly appreciate any updates from the support team on this matter or further instructions on how to proceed. If more detailed information is necessary for troubleshooting, I'm available to continue this discussion in DMs with a support representative.
Thank you for your help, and I look forward to resolving this with your guidance.
Kind regards,
Satya Dae (Username: Satya Dae)
Secure Cloud or Community Cloud?
Which region?
Which GPU type?
Secure Cloud, and I chose "Any" for the region, so I'm not sure which one it landed in. Nothing else was available: whenever I tried to select a specific region, it switched to unavailable unless I used "Any". It is two H100 cards.
I also launched another pod with two A6000 cards and hit the same issue, unfortunately.
Did you use the H100 PyTorch template?
No, I used the official Ubuntu template. Is that not compatible with the card for some reason?
Solution
Official Ubuntu template or Docker image? You should use the RunPod H100 PyTorch image; it's the only one that has Torch custom-built for H100. Others will not work properly.
I just tried it, and yes, you're right: the setup was wrong, and this template works. Thank you!!! That was an easy detail to miss for a newb like me.
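If you want to check for yourself whether the PyTorch build in an image was compiled for your card, something like the sketch below might help. It is only a rough check under a few assumptions: PyTorch is installed in the image, GPU 0 is the card you care about, and an exact `sm_XY` match is what matters (builds that ship PTX can sometimes run on newer cards without one). The helper name `arch_is_supported` is made up for this example.

```python
def arch_is_supported(capability, arch_list):
    """Return True if arch_list contains an exact sm_XY entry for this capability.

    Simplification: PTX forward compatibility (compute_XY entries) is ignored.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this image.")
        return
    if not torch.cuda.is_available():
        print("PyTorch cannot see a CUDA device.")
        return
    capability = torch.cuda.get_device_capability(0)  # an H100 reports (9, 0)
    arch_list = torch.cuda.get_arch_list()            # architectures compiled into this build
    print("GPU:", torch.cuda.get_device_name(0))
    print("Compute capability:", capability)
    print("Architectures in this build:", arch_list)
    print("Exact match found:", arch_is_supported(capability, arch_list))


if __name__ == "__main__":
    main()
```

Run it inside the pod once it has started. If the last line prints `False`, the image's Torch build may not include kernels for your GPU.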
I am also looking to run Docker within this environment, specifically to work with the LocalAI Docker containers. Could you please clarify whether the RunPod H100 PyTorch image supports running Docker? (I mean actually using Docker. I know it is installed there, but it would not let me create anything with it. I ended up in a DinD situation: Docker in Docker, which is not permitted.)
If so, are there any specific steps I should follow to enable Docker functionality on this image?
If Docker is not supported on the H100 PyTorch image, I would greatly appreciate any alternative suggestions or solutions to run LocalAI on RunPod.
Thank you for your help!!
Also being able to run a desktop like Ubuntu would be amazing if it's available, then I can install everything.
You can't run Docker in Docker on RunPod. A pod is already a Docker container. You can create a RunPod template to specify Docker configuration or use something like https://depot.dev to build Docker images.
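To see this from inside a pod, a quick check along these lines might be useful. It is a heuristic sketch, not an official RunPod tool: it assumes the container runtime creates the usual `/.dockerenv` marker file and that a usable Docker daemon would expose its default socket at `/var/run/docker.sock`. The function name `docker_status` is made up for this example.

```python
import os


def docker_status(exists=os.path.exists):
    """Report whether we appear to be inside a container and whether a daemon socket is present.

    `exists` is injectable so the check can be exercised without touching the real filesystem.
    """
    return {
        # Marker file Docker normally creates at the container root (heuristic only).
        "in_container": exists("/.dockerenv"),
        # Default Unix socket of the Docker daemon; without it the docker CLI has nothing to talk to.
        "daemon_socket": exists("/var/run/docker.sock"),
    }


if __name__ == "__main__":
    status = docker_status()
    print(status)
    if status["in_container"] and not status["daemon_socket"]:
        print("Inside a container with no Docker daemon socket: docker commands will fail here.")
```

Having the `docker` binary installed is not enough on its own; the CLI needs a daemon to talk to, which is why building the image elsewhere and pointing a template at it is the suggested route.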