Runpod GPU use when using a docker image built on mac

I am building serverless applications that are supposed to use the GPU. For local testing, the code that kicks off GPU-bound functions picks its device with the common idiom: device: str = "cuda" if th.cuda.is_available() else "cpu". This is required so that the CPU device is used when running locally on a Mac. I would expect that a Docker image built on a Mac, but with an amd64 platform specified in the build command, would use the CUDA GPU once it is deployed on a server with a CUDA base image. That does not seem to be the case, and for the longest time I have not been able to understand why. My Runpod serverless pods only show CPU usage when tested. Any advice?
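For reference, the idiom in the question can be sketched as below. This is a minimal sketch, assuming `th` is an alias for PyTorch (`import torch as th`), which the thread never confirms; `pick_device` is a hypothetical helper name. It also prints the machine architecture so the worker log shows which platform the image actually runs on.

```python
import platform


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch as th  # assumption: `th` in the question is PyTorch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment
    return "cuda" if th.cuda.is_available() else "cpu"


device = pick_device()
# On an amd64 worker platform.machine() reports x86_64; on Apple silicon, arm64.
print(f"machine={platform.machine()} device={device}")
```

The check is evaluated at run time inside the container, so the machine the image was built on should not decide the outcome by itself; what is installed in the image and what the host exposes to the container are the more likely factors.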
13 Replies
nerdylive, 5w ago
Try testing with CPU-only pods to see whether that's the case (that it's only using the CPU). If the job takes longer on CPU pods (10 cores or so), it probably isn't running on the CPU only. If it takes the same time on GPU pods as on CPU pods, then yes, there might be a problem. The stats aren't updated that often, so they may not be accurate.
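One way to make that comparison concrete is to time the same workload on both pod types. The following is a rough sketch only: `time_call` and `sample_workload` are hypothetical names, and the workload is a stand-in for whatever the real handler does.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def sample_workload() -> int:
    # Hypothetical stand-in; replace with the real handler call.
    return sum(i * i for i in range(200_000))


elapsed = time_call(sample_workload)
print(f"best of 3 runs: {elapsed:.4f}s")
```

If the workload is PyTorch code running on a GPU, note that CUDA kernels launch asynchronously, so calling `torch.cuda.synchronize()` before reading the timer gives a more honest number.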
zfmoodydub (OP), 5w ago
Good advice, thank you. I've tried to deploy the same image to CPU-only pods using a heavy-duty CPU, but the image fails to initialize in a CPU pod, probably because I'm using this base image: runpod/base:0.4.0-cuda11.8.0
nerdylive, 5w ago
Is it only the stat from the website, or have you actually tested with a print inside that logic (the if)? Maybe try that too, to confirm whether it detects the Nvidia GPU. Make sure CUDA is inside your image, or use one of Nvidia's base images from NGC (search Google for "Nvidia NGC").
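A print inside that logic could look something like the sketch below, again assuming `th` is PyTorch. `cuda_report` is a hypothetical helper; the PyTorch attributes it reads (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`) are standard. It separates two different failure modes: a PyTorch build without CUDA support, and a container that cannot see the GPU.

```python
import shutil


def cuda_report() -> dict:
    """Collect basic facts about GPU visibility from inside the container."""
    report = {"nvidia_smi_on_path": shutil.which("nvidia-smi") is not None}
    try:
        import torch as th  # assumption: `th` in the thread is PyTorch
    except ImportError:
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["torch_version"] = th.__version__
    # None here means the installed PyTorch build has no CUDA support at all.
    report["torch_built_with_cuda"] = th.version.cuda
    report["cuda_available"] = th.cuda.is_available()
    report["gpu_count"] = th.cuda.device_count()
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `torch_built_with_cuda` prints `None`, the base image is probably not the issue: a CPU-only PyTorch build was installed on top of it. If it prints a version but `cuda_available` is `False`, the container likely is not being given access to a GPU.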
zfmoodydub (OP), 5w ago
I do have a device-type print log after most of those declarations, and it always says it is using the CPU. Sorry mate, this one goes over my head a bit: "Make sure to have Cuda inside your image, or use nvidia's base image from (ngc) search in Google Nvidia ngc". I thought I would have CUDA inside my image via the base image... I need to study up on what you mean by that. Thanks for the direction.
nerdylive, 5w ago
What's your base image?
zfmoodydub (OP), 5w ago
runpod/base:0.4.0-cuda11.8.0
nerdylive, 5w ago
I think it has CUDA already, yep. What is the th object here, from th.cuda.is_...?
zfmoodydub
zfmoodydubOP5w ago
sorry @nerdylive i am not sure the answer to your question. that declaration is littered throughout some open source code multiple times.
nerdylive, 5w ago
I'm not sure what's causing the problem here. Can you check the logs of the worker that ran? Any errors, maybe, showing that the GPU or CUDA isn't available?
zfmoodydub (OP), 5w ago
When I run a process in the particular pod where I'm seeing the issue, it does say "cuda is not available using cpu". But another serverless pod (the one I'm talking to you about in a different thread) uses the same base image and does not have this problem. So I believe this is an internal code issue in my repository. After noticing that, I do not think this is a Runpod problem. I can dig deeper there. Thanks for your responses!
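When two workers share a base image but only one sees the GPU, one repository-level thing worth ruling out is the `CUDA_VISIBLE_DEVICES` environment variable, which hides GPUs from CUDA when it is set to an empty string or to `-1`. Nothing in the thread says this is the cause here; the sketch below (with the hypothetical helper `visible_devices_status`) is only a quick check.

```python
import os


def visible_devices_status(env=None) -> str:
    """Describe how CUDA_VISIBLE_DEVICES affects GPU visibility."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return "unset: all GPUs the container exposes are visible"
    if value.strip() in ("", "-1"):
        return "hidden: CUDA will report no devices"
    return f"restricted to: {value}"


print("CUDA_VISIBLE_DEVICES ->", visible_devices_status())
```

The variable can be set in a Dockerfile, in the endpoint's environment settings, or in code via `os.environ` before PyTorch initializes CUDA, so searching the repository for its name is a cheap first step.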
nerdylive, 5w ago
Hmm, weird. Maybe try another pod. Have you?
zfmoodydub (OP), 5w ago
I have not. I'm not the most seasoned engineer, and I haven't had much luck deploying my apps with anything other than this base image: runpod/base:0.4.0-cuda11.8.0. So I've really only been using that base image for my apps, for about a year.
nerdylive, 5w ago
Ooh yeah, I mean redeploy it in another pod; maybe it works on another machine.