OpenAI client retries the request
I am running Llama 3.1 8B on serverless vLLM using the 48 GB Pro config. Whenever my local API sends a request to the server, the RunPod page shows the request as in progress, but during this time the OpenAI client automatically retries the same request even though the existing one is still running, and this loop continues. When the first request finally completes, its response is visible on the RunPod dashboard, but my local API is still stuck in the retry loop.
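For reference, here is a minimal sketch of the kind of client setup being described, assuming the local API uses the OpenAI Python SDK to call the RunPod OpenAI-compatible route. The endpoint ID, API key, model name, and timeout value are placeholders, not my actual config; the point is that the SDK's built-in retries and default timeout can be overridden so a slow request is not re-sent while the first run is still in progress.

```python
# Sketch only: OpenAI SDK pointed at a RunPod serverless vLLM endpoint,
# with automatic retries disabled and a generous request timeout.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",  # placeholder endpoint ID
    api_key="<RUNPOD_API_KEY>",                                   # placeholder key
    max_retries=0,   # SDK default is 2; 0 stops it from re-sending an in-flight request
    timeout=600.0,   # seconds; placeholder value meant to cover cold start + generation
)

completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
```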
Also, even if I cancel the extra retry request manually, the first request still completes and its output is visible on the RunPod dashboard, but the user receives a 500 error instead of that output.