naaviii
RunPod
Created by naaviii on 8/29/2024 in #⚡|serverless
Urgent: Issue with Runpod vllm Serverless Endpoint (22 replies)
No worries @NERDDISCO, could you give me an ETA if possible, so that our team can plan accordingly?
Hello @Tim aka NERDDISCO, is there any update from the team regarding the issue?
Hello Tim, is there any update from the team regarding the issue?
Hi Tim, just checking for any update?
Thanks a lot, Tim, for the quick response 🙂
Please help, we need your support.
This was working fine a few days back; have there been major changes in the library versions?
Yes, in fact I am now getting the error even without using a network volume.

from openai import OpenAI

api_key = "xxxxxxxxx"
endpoint_id = "vllm-xxxxx"

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
    api_key=api_key,
)

# Create a completion
response = client.completions.create(
    model="microsoft/Phi-3.5-mini-instruct",
    prompt="Runpod is the best platform because",
    temperature=0,
    max_tokens=100,
)

# Print the response
print(response)
print(response.choices[0].text)
################Output###############################
{
"delayTime": 104,
"error": "handler: 'NoneType' object has no attribute 'headers' \ntraceback: Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py\", line 192, in run_job_generator\n async for output_partial in job_output:\n File \"/src/handler.py\", line 13, in handler\n async for batch in results_generator:\n File \"/src/engine.py\", line 151, in generate\n async for response in self._handle_chat_or_completion_request(openai_request):\n File \"/src/engine.py\", line 179, in _handle_chat_or_completion_request\n response_generator = await generator_function(request, raw_request=None)\n File \"/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_completion.py\", line 129, in create_completion\n raw_request.headers):\nAttributeError: 'NoneType' object has no attribute 'headers'\n",
"executionTime": 1191,
"id": "sync-9c9ccd0f-7e42-4f6a-8c5d-d430004b399f-e1",
"status": "FAILED"
}
This is the basic code that I have used.
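For reference, since the traceback comes from vLLM's create_completion handler, one way to narrow the problem down is to send the same prompt through the chat completions route on the same endpoint and see whether only the legacy completions route fails. This is a diagnostic sketch, not a fix confirmed in this thread, and it assumes the vLLM worker also exposes the chat route.

from openai import OpenAI

api_key = "xxxxxxxxx"       # same placeholder key as above
endpoint_id = "vllm-xxxxx"  # same placeholder endpoint ID as above

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
    api_key=api_key,
)

# Same model and sampling settings, but via the chat completions route
# instead of the legacy completions route that raises the AttributeError.
response = client.chat.completions.create(
    model="microsoft/Phi-3.5-mini-instruct",
    messages=[{"role": "user", "content": "Runpod is the best platform because"}],
    temperature=0,
    max_tokens=100,
)

print(response.choices[0].message.content)

If the chat route works while the completions route still fails, that would point at the completions code path in the worker/vLLM version rather than at the endpoint configuration.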