RunPod
Created by ab on 8/8/2024 in #⚡|serverless
Error getting response from a serverless deployment
15 replies

ab: Thanks. Actually, I didn't even write any code. I was trying their serverless vLLM quick deploy template and making a call to the OpenAI-compatible endpoint that shows up after deployment. It was very basic.
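The call was roughly this shape (a minimal sketch, assuming the standard OpenAI Python client; the endpoint ID, API key, and model name here are placeholders for whatever the deployment shows):
```python
# Minimal sketch of a call against a RunPod serverless vLLM endpoint
# via its OpenAI-compatible API. ENDPOINT_ID, the API key, and the
# model name are placeholders, not real values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNPOD_API_KEY",  # RunPod API key, not an OpenAI key
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # whichever model was deployed
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```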
ab: Actually, even the subsequent requests were taking a long time. I tried multiple models. Perhaps the documentation could be updated to spell out exactly what someone needs to do to get a faster inference service when hosting on RunPod.
ab: I made sure I purchased enough credits and tried to run my POC, but if this is the typical response time, I'm worried it might not work for us. I wonder how others manage to get fast inference.
ab: It was 1 request, and yes, I'm using vLLM.