naaviii
RunPod
Created by naaviii on 8/29/2024 in #⚡|serverless
Urgent: Issue with Runpod vllm Serverless Endpoint
No worries @NERDDISCO, could you give me an ETA if possible, so that our team can plan accordingly?
22 replies
Hello @Tim aka NERDDISCO, is there any update from the team regarding the issue?
Hello Tim, is there any update from the team regarding the issue?
Hi Tim, just checking for any update.
Thanks a lot, Tim, for the quick response 🙂
Please help, need your support.
This was working fine a few days back. Were there major changes to the library versions?
Yes, in fact I am now getting the error even without using a network volume. This is the basic code that I used:

```python
from openai import OpenAI

api_key = "xxxxxxxxx"
endpoint_id = "vllm-xxxxx"

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
    api_key=api_key,
)

# Create a completion
response = client.completions.create(
    model="microsoft/Phi-3.5-mini-instruct",
    prompt="Runpod is the best platform because",
    temperature=0,
    max_tokens=100,
)

print(response)

# Print the response
print(response.choices[0].text)
```

Output:

```json
{
  "delayTime": 104,
  "error": "handler: 'NoneType' object has no attribute 'headers' \ntraceback: Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py\", line 192, in run_job_generator\n async for output_partial in job_output:\n File \"/src/handler.py\", line 13, in handler\n async for batch in results_generator:\n File \"/src/engine.py\", line 151, in generate\n async for response in self._handle_chat_or_completion_request(openai_request):\n File \"/src/engine.py\", line 179, in _handle_chat_or_completion_request\n response_generator = await generator_function(request, raw_request=None)\n File \"/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_completion.py\", line 129, in create_completion\n raw_request.headers):\nAttributeError: 'NoneType' object has no attribute 'headers'\n",
  "executionTime": 1191,
  "id": "sync-9c9ccd0f-7e42-4f6a-8c5d-d430004b399f-e1",
  "status": "FAILED"
}
```
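For anyone reading the traceback: the last two frames suggest the worker calls the completion function with `raw_request=None`, and that function then reads `raw_request.headers`. Below is a minimal sketch of that failure mode only. `create_completion` here is a hypothetical stand-in written for illustration, not vLLM's actual function, and the stub object is an assumption about what a non-`None` argument would need to look like, not a confirmed fix.

```python
from types import SimpleNamespace


def create_completion(request, raw_request=None):
    # Hypothetical stand-in: like the frame in the traceback, it reads
    # raw_request.headers without first checking for None.
    return dict(raw_request.headers)


def reproduce():
    # Mirror the call shown in the traceback: raw_request=None.
    try:
        create_completion({"prompt": "hi"}, raw_request=None)
    except AttributeError as exc:
        return str(exc)
    return None


error_message = reproduce()
print(error_message)

# Assumption: an object exposing an empty `headers` mapping is enough to get
# past the attribute access in this sketch.
stub_request = SimpleNamespace(headers={})
result = create_completion({"prompt": "hi"}, raw_request=stub_request)
print(result)
```

If this reading is right, the client code above is not at fault; the mismatch would be between the worker's engine code and the vLLM version it is running against.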