Created by Anders on 3/15/2025 in #⚡|serverless
Anyone get vLLM working with reasonable response times?
No matter how I configure a serverless endpoint with vLLM, the workers are slow to pick up tasks: even with warm containers, jobs sit in the queue for minutes for no obvious reason. Has anyone actually been able to use serverless vLLM for a production use case? For context, the sketch below shows roughly how I'm calling the endpoint and timing responses.
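(Endpoint ID and API key below are placeholders, and the `prompt`/`sampling_params` input shape is my assumption based on the vLLM worker template, so adjust it for your handler.)

```python
import os
import time

import requests

# Placeholders -- substitute your own serverless endpoint ID and API key.
ENDPOINT_ID = "YOUR_ENDPOINT_ID"
API_KEY = os.environ["RUNPOD_API_KEY"]

# /runsync blocks until the worker returns, so wall time here includes
# any time the job spends waiting in the queue plus cold-start overhead.
url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"
payload = {
    "input": {
        "prompt": "Explain serverless cold starts in one sentence.",
        # Assumed input shape for the vLLM worker; adjust to your handler.
        "sampling_params": {"max_tokens": 128, "temperature": 0.7},
    }
}

start = time.perf_counter()
resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=300,
)
elapsed = time.perf_counter() - start
resp.raise_for_status()

print(f"status={resp.json().get('status')}  wall_time={elapsed:.1f}s")
```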
6 replies