Are Pods good for batch inference of encoders?

Hello all, I want to deploy an encoder: think of a BERT model (like "bert-base-uncased" on Hugging Face) with some aggregation head, e.g. one that predicts class probabilities. However, I do not want to serve that model in real time but use it for batch inference. Typical scenario: I need predictions for 1 million records within 10 minutes. My GPU nodes should scale up from 0 to n, process those million records stored on cloud storage, write the predictions back to cloud storage, and scale down from n to 0. I accomplished this with Azure ML using their batch inference endpoints. It was a horrible experience for many reasons, including time. My question: would RunPod be a great fit for such a use case? Thanks, Paul
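For concreteness, the per-node scoring step I have in mind is nothing special, roughly the sketch below (minimal and untested; the two-class head, batch size, and the fact that the head is loaded from a fine-tuned checkpoint are placeholders, not fixed requirements):

```python
# Sketch of the per-node scoring step. The 2-class head and batch size
# are illustrative; in practice you'd load your own fine-tuned checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 initializes a random classification head; replace
# "bert-base-uncased" with your fine-tuned model path for real predictions.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
).to(device).eval()

def predict(texts, batch_size=256):
    """Score a list of strings, return class probabilities (N x 2 tensor)."""
    probs = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(
            texts[i : i + batch_size],
            padding=True, truncation=True, max_length=128,
            return_tensors="pt",
        ).to(device)
        with torch.inference_mode():
            logits = model(**batch).logits
        probs.append(torch.softmax(logits, dim=-1).cpu())
    return torch.cat(probs)
```

The open part for me is purely the orchestration around this: sharding the records, scaling the GPU nodes 0 to n and back.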
Solution
nerdylive · 6mo ago
I'd suggest reading about SkyPilot, since RunPod doesn't provide a platform for batch inference.
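Untested sketch of what a SkyPilot task for this could look like (assumes SkyPilot's RunPod integration; the bucket paths and predict.py are placeholders):

```yaml
# task.yaml -- sketch only; predict.py and the s3:// paths are placeholders.
resources:
  cloud: runpod
  accelerators: A100:1   # pick whatever GPU type fits the model

num_nodes: 4             # your "n"; each node scores one shard

setup: |
  pip install torch transformers

run: |
  # SKYPILOT_NODE_RANK / SKYPILOT_NUM_NODES are set by SkyPilot,
  # so each node can pick its own shard of the input records.
  python predict.py \
    --shard $SKYPILOT_NODE_RANK \
    --num-shards $SKYPILOT_NUM_NODES \
    --input s3://my-bucket/records/ \
    --output s3://my-bucket/predictions/
```

Then something like `sky launch -c batch task.yaml` brings the nodes up, and `sky down batch` tears them back down to zero once the job finishes.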
kinsvater (OP) · 6mo ago
OK, thanks.