Security issue: Attackers Scanning RunPod pods?
Hello, over the past month or so, I have been noticing that whenever I spin up a new pod, I instantly start seeing these pings:
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 100.64.0.33:33194 - "GET /v1/models HTTP/1.1" 200 OK
ERROR 01-10 08:39:01 serving_chat.py:114] Error with model object='error' message='The model vllm-vl does not exist.' type='NotFoundError' param=None code=404
INFO: 100.64.0.32:51002 - "POST /v1/chat/completions HTTP/1.1" 404 Not Found
INFO: 100.64.0.35:50500 - "GET /v1/models HTTP/1.1" 200 OK
ERROR 01-10 08:39:11 serving_chat.py:114] Error with model object='error' message='The model vllm-vl does not exist.' type='NotFoundError' param=None code=404
INFO: 100.64.0.35:50500 - "POST /v1/chat/completions HTTP/1.1" 404 Not Found
INFO: 100.64.0.33:49030 - "GET /v1/models HTTP/1.1" 200 OK
ERROR 01-10 08:39:26 serving_chat.py:114] Error with model object='error' message='The model vllm-vl does not exist.' type='NotFoundError' param=None code=404
Where "vllm-vl" is the name of my template and therefore the name of my pod.
I am not pinging this server myself; these requests start nearly immediately after I spin the pod up.
My guess about what is happening is that attackers are identifying new RunPod pod IDs on the public registry. They can then assume that a fair number of these servers are running vLLM, SGLang, or TGI, and they "guess" how to make an API call to the endpoint by using the pod name (not exactly sure how they get this) as the model name. Many templates simply use the model name as the template name, so this is a fair assumption. They can then use this process to get free LLM calls on the community's pods.
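To make the pattern concrete, here is a minimal sketch (Python with requests) of what such a probe could look like. The base URL is a placeholder, and the whole scenario is only my guess; the sketch just mirrors the two requests visible in the logs above.

# Minimal sketch of the probe pattern described above. The base URL is a
# placeholder and the overall scenario is only my guess.
import requests

BASE_URL = "https://<pod-id>-8000.proxy.runpod.net"  # placeholder pod endpoint

# Step 1: GET /v1/models returns 200 in the logs above, which confirms an
# OpenAI-compatible server (vLLM here) is listening on the pod.
models = requests.get(f"{BASE_URL}/v1/models", timeout=5).json()
print(models)

# Step 2: POST a chat completion, guessing the model name from the pod or
# template name ("vllm-vl"). A wrong guess produces the 404s in the logs;
# a correct guess would give the caller free completions on my GPU.
resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "model": "vllm-vl",  # guessed model name
        "messages": [{"role": "user", "content": "hi"}],
    },
    timeout=30,
)
print(resp.status_code, resp.text)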
Slow model download speeds/bandwidth
Can anyone explain why the download speed from Hugging Face is so bad on RunPod? I consistently get 10-30 MB/s, compared to 100+ MB/s on Vast.ai. I have often had to keep instances running for 1-5 hours just to download Llama 3 70B or LLaVA 34B. Quite frankly, this issue is so bad that it has pushed me to Vast.ai for most model training. The only issue I have seen with Vast is that I can't select 5 GPUs, only 4 or 8, and 5 is what our use case requires. Running batch production deployments is also a pain because of the download speeds. Your platform is really suffering because of this problem.
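For reference, a quick back-of-the-envelope calculation (assuming roughly 140 GB of fp16 weights for a 70B-parameter model, which is my assumption) shows how those speeds translate into download time:

# Rough download-time estimate, assuming ~140 GB of fp16 weights
# (2 bytes per parameter) for a 70B-parameter model.
model_size_mb = 70e9 * 2 / 1e6  # ~140,000 MB

for speed_mb_s in (10, 30, 100):
    hours = model_size_mb / speed_mb_s / 3600
    print(f"{speed_mb_s:>3} MB/s -> {hours:.1f} hours")

# 10 MB/s -> ~3.9 hours, 30 MB/s -> ~1.3 hours, 100 MB/s -> ~0.4 hours

So at 10-30 MB/s it really is hours per model, versus well under an hour at 100+ MB/s.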