Serverless Deployment RunPod Request Issue

I'm working on deploying the qwen_2.5_instruct model through RunPod using the vLLM direct deployment method. The qwen_2.5_instruct model is designed to accept more than one image at a time along with the prompt. However, with the vLLM method, RunPod only allows one image per request. I need to pass multiple images in the following format:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image1.jpg"},
            {"type": "image", "image": "file:///path/to/image2.jpg"},
            {"type": "text", "text": "Identify the similarities between these images."},
        ],
    }
]

I tried creating a handler.py for a custom container deployment, but it failed: building the Docker image and uploading it to RunPod consumes too much network bandwidth. How can I resolve this issue?
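For reference, here is a minimal sketch of how the messages above might be reshaped for an OpenAI-style chat endpoint, which is the request format vLLM's server accepts for image input. It is not a confirmed fix: the helper names `to_openai_content` and `build_payload` are hypothetical, and it assumes the endpoint is configured to allow more than one image per prompt (in vLLM itself that cap is the `limit_mm_per_prompt` engine option; how the RunPod worker exposes it is not verified here). Local `file://` paths are inlined as base64 data URLs because a remote worker cannot read files from the client machine.

```python
import base64
import mimetypes
from pathlib import Path


def to_openai_content(qwen_content):
    """Convert Qwen-style content parts into OpenAI-style content parts."""
    parts = []
    for item in qwen_content:
        if item["type"] == "image":
            src = item["image"]
            if src.startswith("file://"):
                # A remote endpoint cannot open local files, so embed the bytes.
                path = Path(src[len("file://"):])
                mime = mimetypes.guess_type(path.name)[0] or "image/jpeg"
                data = base64.b64encode(path.read_bytes()).decode("ascii")
                src = f"data:{mime};base64,{data}"
            parts.append({"type": "image_url", "image_url": {"url": src}})
        elif item["type"] == "text":
            parts.append({"type": "text", "text": item["text"]})
        else:
            raise ValueError(f"unsupported content type: {item['type']!r}")
    return parts


def build_payload(model, qwen_messages, max_tokens=256):
    """Build a chat-completions style request body (hypothetical helper)."""
    return {
        "model": model,  # whatever model name the endpoint was deployed with
        "messages": [
            {"role": m["role"], "content": to_openai_content(m["content"])}
            for m in qwen_messages
        ],
        "max_tokens": max_tokens,
    }
```

The resulting dict could then be sent as JSON to the endpoint's OpenAI-compatible chat completions route. The exact URL, authentication, and the setting that raises the per-prompt image limit depend on the RunPod worker version, so those need to be checked against its documentation.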