dudicious
RunPod
Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
If you are using ENFORCE_EAGER, you should be able to increase GPU_MEMORY_UTILIZATION and MAX_MODEL_LENGTH on a 48 GB endpoint.
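A minimal sketch of what that suggestion could look like if those endpoint environment variables map onto the usual vLLM engine arguments (enforce_eager, gpu_memory_utilization, max_model_len); the exact variable names, defaults, and model below are assumptions, not taken from the thread:

```python
# Sketch only: how ENFORCE_EAGER, GPU_MEMORY_UTILIZATION and MAX_MODEL_LENGTH
# might be read by a vLLM-based worker. Variable names and defaults are assumed.
import os

from vllm import LLM

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # illustrative model name
    enforce_eager=os.getenv("ENFORCE_EAGER", "0") == "1",  # skip CUDA graph capture
    gpu_memory_utilization=float(os.getenv("GPU_MEMORY_UTILIZATION", "0.90")),
    max_model_len=int(os.getenv("MAX_MODEL_LENGTH", "4096")),
)
```

With CUDA graph capture disabled, the memory the graphs would have reserved stays available, which is what allows GPU_MEMORY_UTILIZATION and the context length to be pushed higher on a 48 GB card.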
I wonder if there is some kind of bug with CUDA graphs on quantized models? They always take up way more memory than I'm expecting.
I've had trouble with some quantized models if I don't use eager mode.
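To make "eager mode" concrete, a hedged vLLM sketch for a quantized model: setting enforce_eager=True skips CUDA graph capture, trading some throughput for the memory the captured graphs would otherwise consume. The checkpoint name and quantization method are examples only, not from the thread:

```python
# Sketch only: quantized model with CUDA graph capture disabled (eager mode).
# The AWQ checkpoint name is illustrative.
from vllm import LLM

llm = LLM(
    model="TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",  # example quantized build
    quantization="awq",
    enforce_eager=True,  # eager mode: no CUDA graphs, lower memory overhead
    max_model_len=4096,
)
```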
ENFORCE_EAGER?
48 replies
RunPod
Created by Jidovenok on 2/21/2024 in #⚡|serverless
All 27 workers throttled
Still getting throttled constantly. Serverless doesn't seem viable in its current state. Bummer. The tech is cool.
The 48 GB GPUs were all throttled in CA today too.
239 replies
RunPod
Created by dudicious on 2/15/2024 in #⚡|serverless
Started getting a lot of these "Failed to return job results" errors. Outage?
1.6.0
Nothing changed in my configurations. I just went from getting good results to this error on 100% of jobs. Status shows COMPLETED in the logs, which doesn't seem right either.
5 replies