RunPod · 8mo ago
jamesk

Unable to connect to Jupyter lab

It seems JupyterLab crashed on my pod after a job had been running for around 2 days, which is unfortunate. Is there any way I can restart JupyterLab so that I can resume training? Is it also possible that my process is still running even though JupyterLab has crashed?
3 Replies
jamesk (OP) · 8mo ago
From container logs:
2024-04-10T15:28:37.183318778Z /start.sh: line 74: 72 Killed nohup jupyter lab --allow-root --no-browser --port=8888 --ip=* --FileContentsManager.delete_to_trash=False --ServerApp.terminado_settings='{"shell_command":["/bin/bash"]}' --ServerApp.token=$JUPYTER_PASSWORD --ServerApp.allow_origin=* --ServerApp.preferred_dir=/workspace &> /jupyter.log
digigoblin · 8mo ago
The Linux kernel killed the Jupyter process because the pod ran out of system memory (RAM, not VRAM). Your training process probably stopped as well when Jupyter was killed. It's probably better to SSH into the pod and run your training in a screen or tmux session rather than inside Jupyter. I would just reset the pod to get Jupyter working again.
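If SSH plus screen/tmux is not an option, a rough alternative is to start the training command in its own session so it is not a child of the Jupyter kernel's terminal. The sketch below is not the tmux approach suggested above; it uses Python's `subprocess.Popen` with `start_new_session=True` instead. The helper names `launch_detached` and `is_running` are made up for illustration, and the code assumes a POSIX system. Note that detaching does not protect the job from the out-of-memory killer if the job itself is what exhausts RAM.

```python
import os
import subprocess
import sys


def launch_detached(cmd, log_path):
    """Start cmd in a new session, appending stdout/stderr to log_path.

    Hypothetical helper: the process is detached from the launching
    terminal, so it is not tied to the Jupyter server's lifetime.
    """
    log = open(log_path, "ab")
    proc = subprocess.Popen(
        cmd,
        stdout=log,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # run in its own session (POSIX setsid)
    )
    log.close()  # the child keeps its own copy of the file descriptor
    return proc.pid


def is_running(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

For example, `pid = launch_detached([sys.executable, "train.py"], "/workspace/train.log")` would start a hypothetical `train.py` and return its PID; after reconnecting, `is_running(pid)` gives a quick answer to whether the job survived. An exited child that has not been waited on still shows up as existing, so treat a `True` result as a hint and check the log file as well.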
jamesk (OP) · 8mo ago
Nice one, thanks a lot