RunPod
Created by ab on 8/8/2024 in #⚡|serverless
Error getting response from a serverless deployment
15 replies

ab: Thanks. Actually, I didn't even write any code. I was trying their serverless vLLM quick deploy template and making a call to the OpenAI-compatible endpoint that shows up after deployment. It was very basic.
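The call was roughly this shape (a minimal sketch, assuming the standard OpenAI Python client; the endpoint ID, API key, and model name here are placeholders for whatever the deployment shows):
```python
# Minimal sketch of a call against a RunPod serverless vLLM endpoint
# via its OpenAI-compatible API. ENDPOINT_ID, the API key, and the
# model name are placeholders, not real values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNPOD_API_KEY",  # RunPod API key, not an OpenAI key
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # whichever model was deployed
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```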
ab: Actually, even the subsequent requests were taking a long time. I tried multiple models. Perhaps the documentation could be updated to spell out exactly what someone needs to do to get a faster inference service when hosting on RunPod.
ab: I made sure I purchased enough credits and tried to run my POC, but if this is the typical response time, I'm worried it might not work for us. I wonder how others manage to get fast inference.
ab: It was 1 request, and yes, I'm using vLLM.