NGC containers
Has anyone gotten NGC containers running on runpod? I see it as an option but I think it doesn't work because you need to install the ssh libraries on top.
I need this to use FP8 on H100s, since the PyTorch NGC container includes Transformer Engine for FP8. Building Transformer Engine manually takes a long time (it requires downloading a cuDNN tarball from the NVIDIA website).
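For context (this is background, not from the thread): the FP8 format Transformer Engine uses for forward-pass tensors, E4M3, keeps only a 4-bit exponent and 3-bit mantissa, which is why it needs Hopper-class hardware and careful scaling. A toy pure-Python sketch of that rounding, ignoring NaN and subnormal edge cases (the helper name is mine):

```python
import math

# Hypothetical helper: simulates FP8 E4M3 rounding (sign, 4 exponent
# bits, 3 mantissa bits, bias 7) in pure Python. Ignores NaN and
# treats subnormals approximately by clamping the exponent.
def quantize_e4m3(x: float) -> float:
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), 448.0)                 # saturate at the E4M3 max normal value
    e = max(math.floor(math.log2(a)), -6)  # clamp to the min normal exponent
    step = 2.0 ** (e - 3)                  # 3 mantissa bits => spacing of 2**(e-3)
    return sign * round(a / step) * step

# Only a handful of representable values per power of two, so rounding is coarse:
print(quantize_e4m3(1.05))   # -> 1.0
print(quantize_e4m3(3.1))    # -> 3.0
print(quantize_e4m3(500.0))  # -> 448.0 (saturates)
```

This is only to show why FP8 training relies on per-tensor scaling factors: the dynamic range and precision are tiny compared to BF16.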
40 Replies
Yes
Is there any docs or quick example on how to use it?
Any update on this?
CC @mmoy
Not in the runpod docs, but I'm sure the way to use them on runpod is by creating a template (dockerize your needed apps) then running them on pods or serverless
And creating runpod templates is documented (in general, though not specifically for NGC containers)
Use them as base images and just do what you need to fill in the image and use them as a template
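As a sketch of what "use them as base images" could look like, a template Dockerfile on top of the NGC base might just add the SSH server that RunPod's Connect button expects. The start script and package choices here are assumptions, not RunPod's official setup:

```dockerfile
# Hypothetical RunPod template: NGC PyTorch base + an SSH server.
# Paths and the start script are illustrative, not official.
FROM nvcr.io/nvidia/pytorch:24.04-py3

RUN apt-get update && \
    apt-get install -y --no-install-recommends openssh-server && \
    mkdir -p /run/sshd && \
    rm -rf /var/lib/apt/lists/*

# RunPod passes your public key via an environment variable; a start
# script would write it to ~/.ssh/authorized_keys, then run sshd in
# the foreground so the container stays alive.
COPY start.sh /start.sh
CMD ["/start.sh"]
```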
yeah I got that, I guess I was just too lazy to add the required ssh libs and create that template
I also didn't understand why RunPod PyTorch NGC containers are available in the dropdown selection if the limitations are known. Maybe I'm just not using it correctly?
I can always take commissions
What limitations?
how do you use a container and "Connect" if there's no SSH access? There's also no option to SSH into the host and use the container interactively
So I'm not sure what you can do after deploying a "RunPod PyTorch NGC" template
if you run the bare image you might need to set the container command to
but how would I SSH into the container or is the SSH command for host machine with Docker access?
anyways I think I can create a template to fix it with my remaining few dollars of credits 😅
there is an SSH command under the Connect button
after you press it you press SSH
oh wait this is not pods
why would you want to ssh into serverless containers
this is for pods, a pod still runs a container
a pod doesn't give you access to the host machine
Give me like 1h, I'll build the container for you
wow rare moment
any specific docker image as base?
@sbhavani
latest container from a few days ago:
nvcr.io/nvidia/pytorch:24.04-py3
I think it should work; note volume storage won't be /workspace
@sbhavani btw you wanted a template for pods?
Note image requires host with CUDA 12.4
@sbhavani so do you have any code for testing?
yes template for pods, I guess it depends on the driver version for the host too
I got the template done, just need to run some tests, and if you have any small code to test 8-bit quant let me know
https://github.com/NVIDIA/TransformerEngine?tab=readme-ov-file#pytorch - sample code here!
what kinda output shall I get from it?
hmm actually that code is more functional testing, I don't have anything readily available to test perf/speed up
I can clean up this repo and add a HF LLama-2/3 example comparing BF16 and FP8 throughput: https://github.com/sbhavani/h100-performance-tests
I kinda ran it and I'm not getting anything, no output or error
then sounds like it works! if you publish to the community I'll test it out as well
It should be cached on H100 PCIe CA region
on secure cloud at least
@sbhavani https://runpod.io/console/deploy?template=lc5dch2fuv&ref=vfker49t
template name pytorch-ngc-runpod
password for Jupyter is RunPod
volume storage is being mounted at /vol btw
@sbhavani let me know if it worked for you
not that rare, I'm happy to help build templates, but not if you ask me to add 50 models from Civitai
thanks! I'll test it out on friday!
How about 20
I can build you a container that would block access to Civitai
Sure please share the dockerfile for me
lol
I'm looking for a pytorch docker container without runpod
can I just do a docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.10-py3?
I want to use pytorch with sentence transformers from huggingface (https://github.com/huggingface/setfit) and do a torch.compile and run predictions
Sure, use the right docker command
Well, add your code to use the models
has someone tried torch.compile?
Not me, I haven't tried it
where can I find which torch-tensorrt version is compatible with CUDA, torch, etc.?
is it expected that pip install torch-tensorrt==2.2.0 installs both nvidia-cuda-runtime-cu11 and nvidia-cuda-runtime-cu12, and likewise nvidia-cudnn-cu11 and nvidia-cudnn-cu12, plus some other nvidia packages?
and does torch-tensorrt work with an older gpu like a g4dn.xlarge?
I don't know about that
sure, try it, just select the right driver
But AWS has its own support; it's best to ask there for help with their products
@Geri Take a look at the versions used in https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html. That should give you an idea of compatibility across torch and nvidia packages
Oh there's that matrix thanks for sharing it
hi, does someone know how to configure a config.pbtxt for ONNX or PyTorch?
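For reference (this is a generic sketch, not an answer from the thread): a minimal Triton Inference Server config.pbtxt for an ONNX model tends to look like the fragment below. The model name, tensor names, and dims are placeholders you'd replace with your own model's:

```
name: "my_onnx_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ 128 ]
  }
]
output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
```

For a TorchScript model the platform would be "pytorch_libtorch" instead, and Triton expects positional tensor names like "INPUT__0" / "OUTPUT__0" unless the model exposes named I/O.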