RunPod · 11mo ago
jamesk

Unable to connect to Jupyter lab

It seems Jupyter Lab crashed on my pod after a job had been running for around 2 days, which is unfortunate. Is there any way I can restart Jupyter Lab so that I can resume training? Is it also possible that my process is still running despite Jupyter Lab having crashed?
3 Replies
jamesk (OP) · 11mo ago
From container logs:
2024-04-10T15:28:37.183318778Z /start.sh: line 74: 72 Killed nohup jupyter lab --allow-root --no-browser --port=8888 --ip=* --FileContentsManager.delete_to_trash=False --ServerApp.terminado_settings='{"shell_command":["/bin/bash"]}' --ServerApp.token=$JUPYTER_PASSWORD --ServerApp.allow_origin=* --ServerApp.preferred_dir=/workspace &> /jupyter.log
digigoblin · 11mo ago
The Linux kernel killed the process because the pod ran out of system memory (not VRAM). Your training process would probably have stopped as well when Jupyter was killed. It's probably better to SSH into the pod and start a screen or tmux session for your training rather than running it within Jupyter. I would just reset the pod to get Jupyter working again.
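For anyone landing here later, a rough sketch of the tmux approach, run over SSH on the pod. The session name `train`, the command `python /workspace/train.py` and the log path are hypothetical placeholders, not anything from the original setup, so swap in your own. The small `describe_exit` helper is only there to show how a "Killed" process shows up as an exit status (137 = 128 + signal 9, SIGKILL, which is what the OOM killer sends).

```shell
#!/usr/bin/env bash
# Sketch only: SESSION, TRAIN_CMD and LOG are hypothetical placeholders.
SESSION="train"
TRAIN_CMD="python /workspace/train.py"
LOG="/workspace/train.log"

# Explain an exit status: values above 128 mean the process was
# terminated by signal (status - 128). 137 therefore means SIGKILL (9).
describe_exit() {
    if [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "exited with status $1"
    fi
}

# Start training in a detached tmux session so it keeps running even if
# Jupyter crashes or the SSH connection drops. Output is appended to LOG.
start_training() {
    tmux new-session -d -s "$SESSION" "$TRAIN_CMD 2>&1 | tee -a $LOG"
}

# Check whether the session still exists (it ends when the command ends).
training_alive() {
    tmux has-session -t "$SESSION" 2>/dev/null
}

# Typical usage after sourcing this file:
#   start_training            # launch in the background
#   tmux attach -t train      # watch it; detach again with Ctrl-b then d
#   training_alive && echo "still running"
```

Note that tmux only protects the job from Jupyter or SSH going away. If the training process itself exhausts system memory it can still be killed, so it is worth saving checkpoints regularly.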
jamesk (OP) · 10mo ago
Nice one, thanks a lot
