AI Toolkit LoRA Training torch.OutOfMemoryError
I've tried different pods for Flux LoRA training with AI Toolkit and had no luck at all.
I even used 2x RTX 4090 (24 vCPU, 62 GB RAM) and it still reported "torch.OutOfMemoryError". How can that be???
The RTX 6000 Ada (48 GB VRAM, 188 GB RAM, 24 vCPU) could start the training process, but it took more than 10 minutes (!!) to generate a sample image, and the result was practically invisible (denoise < 0.2). How is that possible?
Template: runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04 (officially recommended)
Optimizer: adamw
Model: Flux.Dev
low_vram: false
quantize: false
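For scale: FLUX.1-dev is roughly a 12B-parameter transformer, so the bf16 weights alone are about 12e9 × 2 bytes ≈ 24 GB. With quantize: false that already fills a 24 GB RTX 4090 before activations, gradients, and the adamw optimizer state are counted, and AI Toolkit trains on a single GPU, so the second 4090 adds no usable VRAM. Assuming this is the ostris/ai-toolkit trainer, here is a minimal sketch of the memory-relevant settings, modeled on its train_lora_flux_24gb.yaml example (key names and placement may differ in your version, so check your config against the bundled examples):

```yaml
# Sketch of the memory-relevant part of an ai-toolkit FLUX LoRA config.
# Assumes the ostris/ai-toolkit layout (as in train_lora_flux_24gb.yaml);
# verify key names and nesting against your installed version.
config:
  process:
    - type: "sd_trainer"
      model:
        name_or_path: "black-forest-labs/FLUX.1-dev"
        is_flux: true
        quantize: true      # 8-bit weights: roughly half the ~24 GB bf16 footprint
        low_vram: true      # offloads parts of the model when VRAM is tight (slower)
      train:
        batch_size: 1
        gradient_checkpointing: true  # trades extra compute for a big activation-memory saving
        optimizer: "adamw8bit"        # 8-bit optimizer states instead of full-precision adamw
        dtype: bf16
```

With the model quantized and an 8-bit optimizer instead of plain adamw, this kind of setup is what the 24 GB example config targets; the 48 GB RTX 6000 Ada only got past loading because it had room for the unquantized weights.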
It could be that your training app doesn't support multi-GPU.
Indeed. It doesn't support multi-GPU. Hah.
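Since the trainer only ever uses one device, it's worth pinning it explicitly so the run doesn't depend on which GPU PyTorch picks first; setting CUDA_VISIBLE_DEVICES=0 in the shell achieves the same thing. A sketch, assuming the same ai-toolkit config layout as above:

```yaml
# Pin the trainer to a single GPU; the second card is simply left unused.
# Key placement assumes the ai-toolkit example configs; verify against your version.
config:
  process:
    - type: "sd_trainer"
      device: cuda:0
```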