RunPod
Created by Rodka on 5/27/2024 in #⚡|serverless
How to schedule active workers?
e.g., I want 0 active workers from 8pm to 3am, and 1 active worker from 3am to 8pm.
11 replies
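One way to get the schedule described in the question (0 active workers from 8pm to 3am, 1 from 3am to 8pm) is a small script run hourly from cron that sets the count through RunPod's API. The `set_active_workers` function below is a hypothetical placeholder for that API call, and `"my-endpoint-id"` is an invented id; only the hour-to-count mapping comes from the question.

```python
from datetime import datetime, timezone

def desired_active_workers(hour: int) -> int:
    """Active-worker count for a given UTC hour (0-23).

    Schedule from the question: 0 workers from 8pm (20:00) to 3am,
    1 worker from 3am (03:00) to 8pm (20:00).
    """
    return 1 if 3 <= hour < 20 else 0

def set_active_workers(endpoint_id: str, count: int) -> None:
    """Hypothetical placeholder: a real script would call RunPod's API
    here to update the endpoint's active/min worker setting."""
    print(f"endpoint {endpoint_id}: setting active workers to {count}")

if __name__ == "__main__":
    # Run from cron at the top of every hour, e.g.: 0 * * * * python sched.py
    now = datetime.now(timezone.utc)
    set_active_workers("my-endpoint-id", desired_active_workers(now.hour))
```

The mapping function is deliberately separate from the API call so the schedule logic can be tested without touching the endpoint.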
Created by Rodka on 5/22/2024 in #⚡|serverless
Question on Flash Boot
Hello. I'm aware Flash Boot is more or less a "caching system" that keeps a worker on standby for some time, preventing large delay times. For example, my first request of the day takes 8s to 15s of delay time, and subsequent requests have much faster delay times even with a 5s idle timeout, which I assume comes from Flash Boot being enabled. I know Flash Boot stops working after some amount of time X has passed with no requests to the "cached" worker; in that case, my first request after X has elapsed takes the 8s-to-15s delay again. My question is: what exactly is this amount of time X? Is it an exact value?
4 replies
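If no documented value for X turns up, it can be estimated empirically: send a request, wait an idle gap, send another, and note the gap at which delay jumps back to cold-start levels. The sketch below assumes a 5s cold-start threshold (based on the delays in the question) and a doubling probe schedule; both are illustration choices, and the endpoint URL would be your own.

```python
import time
import urllib.request

COLD_THRESHOLD_S = 5.0  # assumed: delays above this count as cold starts

def probe_gaps(start_min: float = 1, limit_min: float = 64) -> list:
    """Idle gaps to test, in minutes, doubling each round: 1, 2, 4, ..."""
    gaps, g = [], start_min
    while g <= limit_min:
        gaps.append(g)
        g *= 2
    return gaps

def timed_request(url: str) -> float:
    """Wall-clock seconds for one request to the endpoint."""
    t0 = time.monotonic()
    urllib.request.urlopen(url).read()
    return time.monotonic() - t0

def estimate_expiry(url: str) -> float:
    """Return the first idle gap (minutes) after which a request is cold."""
    timed_request(url)  # ensure the worker is warm before probing
    for gap in probe_gaps():
        time.sleep(gap * 60)
        if timed_request(url) > COLD_THRESHOLD_S:
            return gap  # cache expired somewhere within this gap
    return float("inf")
```

The doubling schedule brackets X quickly; a follow-up binary search between the last warm gap and the first cold one would tighten the estimate.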
Created by Rodka on 5/17/2024 in #⚡|serverless
Warming up workers
Hi. I've been noticing some substantial delay times and I'd like to know if RunPod has a built-in tool that lets me "warm up" my workers before users hit them. I'm aware of idle timeout, which can help in some situations, but if possible I'd like to keep costs to a minimum. If there isn't a built-in solution, I can implement warm-up logic myself; I just wanted to check first, since I'm running faster-whisper models and sometimes see more than 10s of delay time, which is too much.
3 replies
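Absent a built-in feature, one do-it-yourself pattern is to send a lightweight "warm-up" job ahead of real traffic so a worker spins up and loads the model, with the handler fast-pathing that job. The `{"warmup": true}` input key is an invented convention for this sketch, not a RunPod feature, and model loading is stubbed out.

```python
# Sketch of a serverless handler with a warm-up fast path.
# The "warmup" input key is an invented convention, not a RunPod feature.

MODEL = None  # loaded once per worker process, reused across requests

def load_model():
    """Stub: a real handler would load e.g. a faster-whisper model here."""
    return object()

def handler(job: dict) -> dict:
    global MODEL
    if MODEL is None:
        MODEL = load_model()  # the cold start pays this cost once
    if job.get("input", {}).get("warmup"):
        # Warm-up ping: the model is now in memory; return immediately.
        return {"warmed": True}
    # ... real transcription work would happen here ...
    return {"ok": True}
```

A scheduler (or the client, just before a user session starts) would submit `{"input": {"warmup": true}}` so the expensive model load happens before the first real request arrives.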
Created by Rodka on 5/16/2024 in #⚡|serverless
Serverless GPU Pricing
Hello. I chose a 24 GiB configuration with the following GPUs: L4, RTX A5000, and RTX 3090. I ran some benchmarks and noticed that using only the RTX 3090 is better for my use case (faster execution times and so on). Is the base pricing the same for all three GPUs? That is, supposing for a moment that delay times and execution times were identical across GPUs, would the billing come out the same regardless of which one I choose?
7 replies
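Whatever the actual per-second rates are (they are set per GPU type; check the console for current numbers), the per-request bill is essentially rate times billed seconds, so two GPUs only cost the same if both their rates and their times match. The rates below are made-up placeholders, not real RunPod prices.

```python
def request_cost(rate_per_sec: float, billed_seconds: float) -> float:
    """Cost of one serverless request: per-second GPU rate x billed time."""
    return rate_per_sec * billed_seconds

# Made-up example rates (NOT real RunPod prices): a GPU that is cheaper
# per second can still cost more per request if it runs slower.
cost_slow_cheap = request_cost(rate_per_sec=0.00040, billed_seconds=10.0)
cost_fast_dear = request_cost(rate_per_sec=0.00050, billed_seconds=6.0)
```

With these placeholder numbers the faster, nominally pricier GPU is cheaper per request, which is why benchmarking per GPU type (as the question describes) is the right approach.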
Created by Rodka on 5/9/2024 in #⚡|serverless
Running fine-tuned faster-whisper model
Hello. Is it possible to run a fine-tuned faster-whisper model using RunPod's faster-whisper endpoint? Furthermore, does it hold up at the scale of hundreds of users using it at the same time?
6 replies
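Even if the managed faster-whisper endpoint only serves stock models, a custom serverless worker can load fine-tuned weights directly: `faster_whisper.WhisperModel` accepts a local directory of converted weights or a Hugging Face repo id. The handler below is a sketch; the `"audio_url"` input key is an assumed convention, and a real worker would download the audio to a local file before transcribing. Concurrency then comes from the endpoint's max-worker setting rather than the handler itself.

```python
# Sketch of a custom worker for a fine-tuned faster-whisper model.
# The "audio_url" input key is an assumed convention for this sketch.

def get_audio_url(job: dict):
    """Pull the audio URL out of a job payload, or None if missing."""
    return job.get("input", {}).get("audio_url")

def make_handler(model_path: str):
    """Build a handler that lazily loads the fine-tuned model once per worker."""
    model = None

    def handler(job: dict) -> dict:
        nonlocal model
        url = get_audio_url(job)
        if url is None:
            return {"error": "missing input.audio_url"}
        if model is None:
            # Lazy import so the pure parts run without the library installed.
            from faster_whisper import WhisperModel
            # model_path: local dir of fine-tuned CTranslate2 weights,
            # or a Hugging Face repo id.
            model = WhisperModel(model_path, device="cuda",
                                 compute_type="float16")
        # A real worker would download `url` to a local file here and pass
        # that path to transcribe(); shown elided to keep the sketch short.
        local_path = url
        segments, _info = model.transcribe(local_path)
        return {"text": " ".join(s.text for s in segments)}

    return handler
```

This handler would be registered with the RunPod serverless runtime in the worker image; the model loads once per worker, so scaling to many users is a matter of allowing enough concurrent workers.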