Training flux-schnell model
How do you manage to train a flux-schnell model using serverless? I have loaded the images from an S3 bucket, but what about the waiting time of the training process? Won't I get a timeout while waiting 20 minutes for the training to finish?
9 Replies
Serverless is designed for inference; it's simpler to just use a pod.
The thing is, I want the scale-up feature and to pay only for usage.
Are you planning to train the model or just run inference? You can start and stop the pod to save on costs as well.
I want to do both, but the inference part is easy; the training one is tricky. I tested the training using a pod, but now I want it to scale up when I want to run more than one training at once.
And if I stop it, when I start it again I may have lost my GPU. That's happening all the time, and it's a pain in the ass.
Try to use the network storage and save your stuff there; that way you won't be tied to a specific machine.
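To make the network storage suggestion a bit more concrete, here is a minimal sketch of keeping training progress on a persistent volume so that a run can resume on a different machine. Everything named here is an assumption for illustration: the `VOLUME_DIR` environment variable, the `training_runs` folder, and the `progress.json` file are hypothetical, and the default path is only a placeholder for wherever your volume is actually mounted.

```python
import json
import os
from pathlib import Path


def run_dir(job_name: str) -> Path:
    """Return (and create) the folder for one training run on the volume.

    Assumption: VOLUME_DIR is a hypothetical variable pointing at the
    mount point of the network storage; the default is a placeholder.
    """
    base = Path(os.environ.get("VOLUME_DIR", "volume"))
    path = base / "training_runs" / job_name
    path.mkdir(parents=True, exist_ok=True)
    return path


def save_progress(job_name: str, step: int) -> Path:
    """Record the last finished step so another machine can pick it up."""
    target = run_dir(job_name) / "progress.json"
    tmp = target.with_suffix(".tmp")
    tmp.write_text(json.dumps({"step": step}))
    tmp.replace(target)  # rename over the old file so readers never see a half-written one
    return target


def load_progress(job_name: str) -> int:
    """Return the last saved step, or 0 if this run has never saved."""
    target = run_dir(job_name) / "progress.json"
    if not target.exists():
        return 0
    return json.loads(target.read_text())["step"]
```

The same idea applies to checkpoints and the final weights: if the trainer writes its output folder under the volume instead of the local disk, losing a specific GPU machine only costs you the steps since the last save.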
But I still want the ability to scale autonomously.
Have you managed to use the network storage? I can't really find any documentation on that (for serverless).
Hi, how do you train flux on a pod? Is there any Docker image? I mean, I am looking for a no-GUI solution.
Me too, I am using ai-toolkit.
With serverless you do not have access to many of the higher-end GPUs like you do with pods. Serverless was designed with inference in mind, not training. I'm not saying you cannot train on serverless, but you are blazing your own path in doing so.
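On the timeout worry from the original question: one common way around a 20-minute wait is to not hold a request open at all, but to submit the training as a background job, get a job ID back right away, and poll for its status. The sketch below only shows that pattern. The `submit` and `get_status` callables and the status names are hypothetical stand-ins for whatever your platform's API actually provides, not a real client.

```python
import time
from typing import Callable

# Assumption: these status names are placeholders; use the ones your API returns.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(
    submit: Callable[[], str],
    get_status: Callable[[str], str],
    poll_seconds: float = 15.0,
    max_wait_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit a job, then poll until it reaches a terminal status.

    `submit` returns a job ID immediately; `get_status` takes that ID and
    returns the current status string. Both are hypothetical callables that
    you would implement on top of your platform's HTTP API.
    """
    job_id = submit()
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        if waited >= max_wait_seconds:
            raise TimeoutError(f"job {job_id} still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Because each status check is a short request, no single call has to stay open for the whole training run, and you can submit several jobs and poll them independently if you want more than one training at once.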