Performance A100-SXM4-40GB vs A100-SXM4-80GB
Hello!
I have one GPU: NVIDIA A100-SXM4-40GB on Google Colab Pro.
I have one GPU: NVIDIA A100-SXM4-80GB on RunPod.
My notebook successfully fine-tunes Whisper-Small on Google Colab (40GB) with batch size 32.
However, when I run the same notebook on RunPod (80GB), I get a GPU out-of-memory error; it only works with batch size 16.
Is there any explanation for why the A100-SXM4-80GB cannot handle the same batch size as the A100-SXM4-40GB, and how can I fix it?
Thanks!
2 Replies
Solution
It could be different things, like the CUDA version, Python version, etc.
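To compare the two environments, a quick diagnostic sketch like the one below (assuming PyTorch is installed) prints the library and CUDA versions plus the GPU's total and currently allocated memory, which usually makes a version mismatch obvious:

```python
import torch

# Versions of PyTorch and the CUDA toolkit it was built against.
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)

if torch.cuda.is_available():
    # Properties of the first visible GPU.
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    print("Total memory (GiB):", round(props.total_memory / 1024**3, 1))
    print("Allocated (GiB):", round(torch.cuda.memory_allocated(0) / 1024**3, 1))
else:
    print("No CUDA device visible")
```

Running this in both notebooks and diffing the output is a fast way to spot whether the template's PyTorch/CUDA combination differs between providers.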
Thanks! Moving from a template that uses PyTorch 2.0.1 with CUDA 11.8.0 to one that uses PyTorch 2.2.0 with CUDA 12.1.1 solved the issue.