RunPod•2mo ago

Trouble training sdxl lora with kohya

It seems as if it's getting stuck in the process. Anyone else having the same issue?

7 Replies

FluxzOP•2mo ago

I am using a 4090

Jason•2mo ago

any errors? any other details? maybe share the logs

FluxzOP•2mo ago

0250319-160845.toml
16:08:45-110488 INFO Command executed.
2025-03-19 16:08:50.583194: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-03-19 16:08:50.583238: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-03-19 16:08:50.584396: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-03-19 16:08:50.590100: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

this is where it gets stuck

RollyPolly•2mo ago

Seen similar errors with a CUDA version mismatch between your template or docker image and the GPU. Check if your script can recognize the GPU as a CUDA device; it may be defaulting to CPU.

FluxzOP•2mo ago

I'm kind of green in this area, how would I do that?

RollyPolly•2mo ago

I can’t give concrete instructions, idk if you’re on a template or image or what. Google is your friend, it’s a well-documented and common problem.

Jason•2mo ago

ya, maybe your docker image needs to be built from NVIDIA's CUDA base image, and the tensorflow package in it should be the GPU version
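For anyone landing here with the same question, the check suggested above (whether the script sees the GPU as a CUDA device) can be sketched roughly as below. This is only a sketch under some assumptions: that you can open a terminal on the pod and run Python in the same environment kohya uses, and that the training itself runs on PyTorch. The helper name `gpu_report` is made up for illustration; the library calls it wraps are the standard ones.

```python
import importlib.util
import shutil
import subprocess


def gpu_report():
    """Collect what the driver, PyTorch, and TensorFlow each report about the GPU.

    gpu_report is a hypothetical helper name, not part of kohya or RunPod.
    A value of None means that tool or library was not found in this environment.
    """
    report = {"nvidia_smi": None, "torch_cuda": None, "tensorflow_gpus": None}

    # Driver-level view: nvidia-smi is installed alongside the NVIDIA driver.
    if shutil.which("nvidia-smi"):
        try:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=name,driver_version",
                 "--format=csv,noheader"],
                capture_output=True, text=True, timeout=30,
            )
            report["nvidia_smi"] = result.stdout.strip() or result.stderr.strip()
        except (OSError, subprocess.SubprocessError) as exc:
            report["nvidia_smi"] = f"failed: {exc}"

    # PyTorch view: is CUDA usable, and which CUDA version was torch built for?
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            available = torch.cuda.is_available()
            report["torch_cuda"] = {
                "available": available,
                "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
                "device": torch.cuda.get_device_name(0) if available else None,
            }
        except Exception as exc:
            report["torch_cuda"] = f"import failed: {exc}"

    # TensorFlow view: an empty list means TensorFlow only sees the CPU.
    if importlib.util.find_spec("tensorflow") is not None:
        try:
            import tensorflow as tf
            gpus = tf.config.list_physical_devices("GPU")
            report["tensorflow_gpus"] = [gpu.name for gpu in gpus]
        except Exception as exc:
            report["tensorflow_gpus"] = f"import failed: {exc}"

    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `nvidia_smi` shows the 4090 but `torch_cuda` reports `available: False` or `built_for_cuda: None`, that would point toward the image or installed packages rather than the GPU itself, which is the kind of mismatch described above.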
