blistick Comments - Answer Overflow

blistick


RunPod

•Created by blistick on 1/5/2024 in #⚡｜serverless

What does "throttled" mean?

Yes, official guidance would be good. Like you, I don't want to incur the cost of an always-active worker.

16 replies

RunPod

•Created by blistick on 1/5/2024 in #⚡｜serverless

What does "throttled" mean?

Constant throttling is rather scary from a production standpoint.

16 replies

RunPod

•Created by blistick on 1/5/2024 in #⚡｜serverless

What does "throttled" mean?

@justin @ashleyk Thank you both very much for this advice. To summarize, it seems I should (a) set max workers to at least 2, and (b) enable as many regions as possible for my endpoint. @justin, I followed your previous advice about improving worker startup time by NOT using a network drive (which really helped, btw), but I forgot to edit my endpoint to allow more regions.

16 replies

RunPod

•Created by blistick on 12/26/2023 in #⚡｜serverless

Slow model loading

Haha. Well, if you have ZERO clue that must mean I have NEGATIVE clue. I'll check out those resources.

22 replies

RunPod

•Created by blistick on 12/26/2023 in #⚡｜serverless

Slow model loading

@justin This all makes sense. I've got some work to do, but at least there's a path forward to better performance. Thanks again! (And Happy Holidays!)

22 replies

RunPod

•Created by blistick on 12/26/2023 in #⚡｜serverless

Slow model loading

@justin Thanks very much for this useful information. Lots to try! I'm confused about your suggestion to keep the models in memory. I already load them outside of my handler function, but even for an "active" worker, that code isn't executed until a request comes in, at least from what I can tell. Maybe active workers only execute that code once, so the models do stay in memory. I'll do more tests to see. Of course, for cost reasons I'd rather not need to keep an active worker.
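
For anyone reading along, here is a rough sketch of the "load outside the handler" pattern I'm describing. It is not my real code: `load_model`, `get_model`, and the `"demo"` model name are made-up stand-ins, and the job shape (`job["input"]["prompt"]`) is just an assumption for illustration. The idea is that anything cached at module level is loaded once per Python process, so later requests handled by the same process reuse it. Whether a given worker process actually stays alive between requests is up to the platform, and this sketch doesn't prove anything about that.

```python
import time

# Module-level cache: lives as long as the worker's Python process does.
_MODEL_CACHE = {}
LOAD_COUNT = 0  # only here to make the number of loads observable


def load_model(name):
    """Hypothetical stand-in for an expensive model load (e.g. reading weights)."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.01)  # pretend this is slow
    return {"name": name}


def get_model(name):
    """Load the model on first use, then reuse the cached copy."""
    if name not in _MODEL_CACHE:
        _MODEL_CACHE[name] = load_model(name)
    return _MODEL_CACHE[name]


def handler(job):
    """Per-request entry point; assumes a job dict with an "input" key."""
    model = get_model("demo")
    prompt = job["input"]["prompt"]
    return {"model": model["name"], "echo": prompt, "loads_so_far": LOAD_COUNT}


# In a real worker the handler would be registered with the serverless SDK
# rather than called directly; calling it twice here just shows the reuse.
if __name__ == "__main__":
    print(handler({"input": {"prompt": "first"}}))
    print(handler({"input": {"prompt": "second"}}))
```

Logging something like `loads_so_far` from a real endpoint would be one way to check whether consecutive requests hit a warm process or trigger a fresh load.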

22 replies
