Serverless doesn't work properly when the Docker image is created with docker commit
I built the image locally with the following command, and it works fine when deployed to serverless.
sudo docker build -t xsjiang/rp-comfyui:t1 --platform linux/amd64 .
I then ran this image locally and built a second image using the following command.
sudo docker run --runtime=nvidia -it -v runpod-model:/runpod-volume -p 8188:8188 -p 8000:8000 --network host --name comfyui xsjiang/rp-comfyui:t1 /bin/bash
sudo docker commit comfyui xsjiang/rp-comfyui:t2
After I submit the image xsjiang/rp-comfyui:t2 to serverless, it doesn't work; the log just keeps repeating "start container".
2024-01-12T06:10:22Z Status: Downloaded newer image for xsjiang/rp-comfyui:1.4-1
2024-01-12T06:10:22Z worker is ready
2024-01-12T06:19:37Z create pod network
2024-01-12T06:19:37Z create container xsjiang/rp-comfyui:t2
2024-01-12T06:19:38Z 1.4-1 Pulling from xsjiang/rp-comfyui
2024-01-12T06:19:38Z Digest: sha256:a89561bc7e6f5fd89cbffc3a8e8b444135e129a245027f24049690a62804b12a
2024-01-12T06:19:38Z Status: Image is up to date for xsjiang/rp-comfyui:t2
2024-01-12T06:19:38Z worker is ready
2024-01-12T06:19:38Z start container
2024-01-12T06:19:54Z start container
2024-01-12T06:20:10Z start container
What is the startup command set to?
The Dockerfile has CMD start.sh, and start.sh runs:
python3 -u rp_handler.py
Could you share more? Like your Dockerfile, plus the docker start command or a screenshot of how you set it up in RunPod?
Images built using docker build are fine for runpod, but images made using docker commit are not.
You can make a base image, run that image locally, and then use docker commit to make a new image to submit to runpod, which should be fine.
Sorry, what does this mean?
So docker build works, but docker commit doesn't?
docker commit might be doing something weird, because you're capturing the container in an odd state.
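One thing that could explain it (an assumption, nobody in this thread confirmed it): docker commit saves the container's configuration along with its filesystem, so a container that was started with /bin/bash as its command gives an image whose default command is /bin/bash instead of start.sh. With no terminal attached, bash exits right away, which would match the repeated "start container" lines. Here is a rough sketch for checking that. It reads the JSON that docker inspect prints and reports what the image runs by default. The helper names are made up for illustration.

```python
import json
import subprocess


def default_command(inspect_output: str) -> list[str]:
    """Return ENTRYPOINT + CMD from the JSON printed by `docker inspect <image>`."""
    config = json.loads(inspect_output)[0]["Config"]
    # Either field may be null in the JSON, so fall back to an empty list.
    return (config.get("Entrypoint") or []) + (config.get("Cmd") or [])


def inspect_image(image: str) -> str:
    """Run `docker inspect` for an image and return its raw JSON output."""
    result = subprocess.run(
        ["docker", "inspect", image],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Hand-written sample in the shape `docker inspect` prints, for illustration only.
sample = '[{"Config": {"Entrypoint": null, "Cmd": ["/bin/bash"]}}]'
print(default_command(sample))  # ['/bin/bash']
```

Running default_command(inspect_image("xsjiang/rp-comfyui:t1")) and the same for t2 would show whether the two images start differently. If t2 does come back with /bin/bash, the --change option of docker commit can set CMD back to the value from the Dockerfile.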
myimage1.0 works, but myimage1.1 does not.
Is there a reason why you are using docker commit?
Vs. just building it properly?
Your code looks similar to mine:
https://github.com/justinwlin/AudioCraft-Runpod-Serverless-and-GPU-Pod-NOT-A-RUNNING-EXAMPLE
Like, you seem to be splitting it up: one image for the GPU pod, then one for the handler?
1) Maybe just as a side step, consider using the PyTorch base image from RunPod.
2) Maybe stay away from docker commit, because it captures intermediate changes into an image, and it's hard to debug whether that is working.
3) If you split it up like my GitHub repo, similar to what you're already doing (one image for the GPU pod, then one where all you do is add the handler.py and start it), it will be much easier and simplify your process.
why are you using start.sh?
Thanks, I'll try to build the full image using Dockerfile first
Yup yup! I think you should first validate that you can get it working in a GPU pod, because once you do, all that's left is adding the handler.py to call it 🙂 with runpod.serverless.start().
It also makes it easier to debug your logic and iterations in the future.
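For reference, a minimal sketch of what that handler file could look like, assuming the runpod Python package is installed in the image. The body of handler is only a placeholder, and main is an illustrative name; the real file would call whatever logic was already validated in the GPU pod.

```python
def handler(job):
    """Handle one serverless job; job["input"] holds the request payload."""
    job_input = job.get("input", {})
    # Placeholder: call the logic already validated in the GPU pod here.
    return {"received": job_input}


def main():
    # Imported here so the handler can be tried locally without the SDK installed.
    import runpod

    runpod.serverless.start({"handler": handler})


# Quick local check of the handler, with no RunPod worker involved.
print(handler({"input": {"prompt": "test"}}))  # {'received': {'prompt': 'test'}}
```

In rp_handler.py the local check would be dropped and the file would end by calling main(), so that python3 -u rp_handler.py starts the worker loop.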