RunPod · 10mo ago
otakuhero

Container fails to start randomly

Container fails to start randomly.
Pod IDs: 840b98harmlsgb, wqaz2xufma32pt, eqyabu82t6l3y9
System log:
2024-02-29T02:38:20Z start container
2024-02-29T02:38:21Z error starting container: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 534: elf_machine_rela_relative: Assertion `ELFW(R_TYPE) (reloc->r_info) == R_X86_64_RELATIVE' failed!
nvidia-container-cli: detection error: driver rpc error: failed to process request: unknown
15 Replies
otakuhero (OP) · 10mo ago
Hi, I've encountered a new issue. When I create an RTX 4090 pod, it fails to launch. 😔 The system log is below. When I delete the pod and rebuild it, everything returns to normal. What could be the reason for this? I use Secure Cloud and a custom image.
2024-02-27T08:13:42Z start container
2024-02-27T08:13:43Z error starting container: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 534: elf_machine_rela_relative: Assertion `ELFW(R_TYPE) (reloc->r_info) == R_X86_64_RELATIVE' failed!
nvidia-container-cli: detection error: driver rpc error: failed to process request: unknown
2024-02-27T08:13:58Z start container
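Editor's sketch, not part of the original thread: since the same failure recurs across several pods, it can help to check saved system logs for this exact signature before filing a ticket. The snippet below is a minimal illustration in plain Python. The names `SIGNATURES`, `split_entries`, and `classify` are hypothetical, and the substrings are copied from the log quoted above. It only recognizes the error text; it does not diagnose the cause.

```python
import re

# Substrings copied from the system log quoted in this thread.
# The labels on the left are made up for illustration.
SIGNATURES = {
    "ld.so assertion": "Inconsistency detected by ld.so",
    "nvidia-container-cli detection error": "nvidia-container-cli: detection error",
    "OCI runtime create failed": "OCI runtime create failed",
}

# Timestamps in the log look like 2024-02-27T08:13:42Z.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def split_entries(log: str) -> list[tuple[str, str]]:
    """Split a flattened log into (timestamp, message) pairs."""
    stamps = list(TIMESTAMP.finditer(log))
    entries = []
    for i, match in enumerate(stamps):
        # Each message runs until the next timestamp (or the end of the log).
        end = stamps[i + 1].start() if i + 1 < len(stamps) else len(log)
        entries.append((match.group(), log[match.end():end].strip()))
    return entries


def classify(message: str) -> list[str]:
    """Return the labels of all known signatures found in one message."""
    return [label for label, needle in SIGNATURES.items() if needle in message]


if __name__ == "__main__":
    sample = (
        "2024-02-27T08:13:42Z start container "
        "2024-02-27T08:13:43Z error starting container: OCI runtime create failed: "
        "stderr: Inconsistency detected by ld.so: ... "
        "nvidia-container-cli: detection error: driver rpc error "
        "2024-02-27T08:13:58Z start container"
    )
    for stamp, message in split_entries(sample):
        print(stamp, classify(message))
```

Recording the pod ID alongside any matching entry makes it easier to pass the details to support later.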
ashleyk · 10mo ago
Please provide the pod ID so RunPod can investigate.
otakuhero (OP) · 10mo ago
Emmm, sorry, I forgot to record the ID. If this happens again, I will make sure to record it.
Hi @ashleyk, I have found another pod with the same issue, pod ID: 840b98harmlsgb. I haven't deleted it yet.
ashleyk · 10mo ago
Great, RunPod can check when they come online. I don't work for RunPod. It may be a few hours because they are in the US.
otakuhero (OP) · 10mo ago
Thanks for your help 👍. I'll wait until the RunPod developers are online 😢
ashleyk · 10mo ago
Maybe @Papa Madiator can help escalate for you when they come online.
Madiator2011 · 10mo ago
Could you submit a ticket on the website? It's better for hardware errors: https://www.runpod.io/
otakuhero (OP) · 10mo ago
OK, let me find out how to submit a ticket.
Madiator2011 · 10mo ago
when you press login
[attached image, no description]
otakuhero (OP) · 10mo ago
thanks!
Madiator2011 · 10mo ago
@otakuhero Could you also include your RunPod email in the ticket on the website? I'm going to grab that machine ID so you can turn off the pod.
@otakuhero Is the pod issue happening on a team account?
otakuhero (OP) · 10mo ago
Yes, on a team account.
I can shut down this pod now, right?
Madiator2011 · 10mo ago
Nvm, I do not have access to view the machine ID. You can turn off the pod, but don't delete it yet.
otakuhero (OP) · 10mo ago
Emmm, any results? Can I delete this pod? 😢
@Papa Madiator Hi, I encountered the same problem again: two pods failed to start, pod IDs wqaz2xufma32pt and eqyabu82t6l3y9. I'm not sure whether it's caused by my custom images or by infrastructure such as the physical machines.
otakuhero (OP) · 10mo ago
[attached image, no description]