Andrew_Rocket
24 GB VRAM is not enough for simple kohya_ss LoRA training.
How come 24 GB of VRAM is not enough to train a simple LoRA in kohya_ss?
I've tried running it with the simplest configuration: 32 pictures, fp16, AdamW8bit, no batching or other demanding features, but CUDA constantly runs out of memory. I've tried launching it 4-5 times, clearing the CUDA cache, setting the memory limit in PyTorch, and making sure nothing else is using VRAM. It still runs out of memory every time.
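To be concrete, by "clearing the cache" and "setting the limit" I mean roughly the following (a minimal sketch; the 0.95 fraction and the allocator setting are just example values, not exact numbers from my run):

```python
import os

# try to reduce fragmentation from repeated failed launches
# (must be set before CUDA is initialized)
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch

# drop cached blocks left over from earlier OOM'd attempts
torch.cuda.empty_cache()

# cap this process so it errors out before grabbing the whole card
torch.cuda.set_per_process_memory_fraction(0.95, device=0)

# confirm nothing else is holding VRAM before training starts
print(f"allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved(0) / 1024**3:.2f} GiB")
```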
The funniest part is that I've successfully run the same configuration on my old PC with a GTX 960 4GB. It's slow, but it doesn't run out of VRAM. Why can't the pods here handle it?
I ended up running it on a 48 GB VRAM instance, and it uses around 33 GB. Why can my 4 GB card run it with pretty much the same config? Is it possible to achieve the same result here?
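For what it's worth, the ~33 GB is peak usage during training; a quick way to read it from inside the process looks like this (just a sketch, watching nvidia-smi works too):

```python
import torch

# peak VRAM this process has allocated/reserved so far, in GiB
gib = 1024 ** 3
print(f"peak allocated: {torch.cuda.max_memory_allocated(0) / gib:.1f} GiB")
print(f"peak reserved:  {torch.cuda.max_memory_reserved(0) / gib:.1f} GiB")
```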