Serverless Deployment RunPod Request Issue
I'm working on deploying the qwen_2.5_instruct model through RunPod using the vLLM direct deployment method. The qwen_2.5_instruct model is designed to accept more than one image at a time along with the prompt. However, with the vLLM method, RunPod only allows one image per request.
I need to pass multiple images in the following format:
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image1.jpg"},
            {"type": "image", "image": "file:///path/to/image2.jpg"},
            {"type": "text", "text": "Identify the similarities between these images."},
        ],
    }
]
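For comparison, here is a rough sketch of how the same two-image request might look in the OpenAI-style chat format that vLLM's server understands, where each image is an `image_url` entry carrying a base64 data URL instead of a `file://` path. This is only an illustration under assumptions: `build_messages` is a hypothetical helper, the byte strings are placeholders for real JPEG data, and whether the endpoint accepts more than one image still depends on how the vLLM worker is configured (vLLM has a per-prompt limit on multimodal items, which as far as I know is set with `--limit-mm-per-prompt`).

```python
import base64


def build_messages(images, prompt, mime="image/jpeg"):
    """Build one user message carrying several images plus a text prompt.

    `images` is a list of raw image bytes (hypothetical input; read them
    from disk or download them however suits your setup).
    """
    content = []
    for raw in images:
        # Encode each image as a data URL so no local file path is needed.
        b64 = base64.b64encode(raw).decode("ascii")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{b64}"},
        })
    # The text prompt goes last, mirroring the order in the original format.
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


# Placeholder bytes stand in for real JPEG data in this sketch.
messages = build_messages(
    [b"image-one-bytes", b"image-two-bytes"],
    "Identify the similarities between these images.",
)
```

The resulting `messages` list has a single user turn with two image entries followed by the text entry, so it can be dropped into the body of a chat completion request if the endpoint allows multiple images.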
I tried creating a handler.py for a custom container deployment, but it failed. The process consumes too much network bandwidth while building the Docker image and uploading it to RunPod.
How can I resolve this issue?
