Hi,
I am trying to send a file from my local system to my pod volume using this command:
rsync -e "ssh -p 10234 -i /home/dell/ssh_keys/ssh_key_dell_Latitude_A4213.txt" -avP /home/dell/exp10/conda_env.zip [email protected]:/workspace/testing/
but when I run it, I get this error:
ash: line 1: rsync: command not found
rsync: connection unexpectedly closed (0 bytes received so far) [sender]...
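The "rsync: command not found" message comes from the pod side: rsync has to be installed inside the container as well as on the local machine. A minimal sketch of a fix, assuming a Debian/Ubuntu-based pod image with root access (typical for RunPod templates):
```
# Inside the pod (web terminal or SSH session): install rsync.
apt-get update && apt-get install -y rsync
```
After that, the same rsync command run from the local machine should work unchanged.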
Jupyter notebook - does it keep on running?
I am using a Jupyter notebook on my pod. If I close the browser tab, will the notebook keep running?
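In general the kernel runs on the pod, not in the browser, so a running cell keeps executing when the tab is closed, though output streamed to the closed tab can be lost. A hedged workaround sketch that makes a long run independent of the browser session entirely; train.ipynb is a hypothetical filename:
```
# Execute the notebook non-interactively on the pod so it survives the tab closing.
nohup jupyter nbconvert --to notebook --execute --inplace train.ipynb > train.log 2>&1 &
tail -f train.log   # follow progress; safe to close the terminal/tab afterwards
```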
Open-WebUI 404 Error
When using the Better Ollama CUDA 12 template and following the instructions found here: blog.runpod.io/run-llama-3-1-405b-with-ollama-a-step-by-step-guide, I get an error when posting a query through Open WebUI: Ollama: 404, message='Not Found', url='https://<snip>-11434.proxy.runpod.net/api/chat'
Interestingly enough, replacing the open-webui localhost URL with the above URL works well with cURL using network diagnostics.
Wanted to replicate the issue on a less expensive server, but can no longer find the template....
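For reference, a direct check of the Ollama chat endpoint through the proxy can be done with a request like the following; the model name is a placeholder and should match whatever `ollama list` reports inside the pod:
```
# Direct check of the Ollama chat endpoint through the RunPod proxy.
curl https://<snip>-11434.proxy.runpod.net/api/chat \
  -d '{"model": "llama3.1:405b", "messages": [{"role": "user", "content": "hello"}]}'
```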
Why is upload speed so slow?
A week back, when I downloaded a 6GB checkpoint, it took 1-2 hours. Now it's telling me it'll take 12 hours. Is there a reason for this?
GPU errored, machine dead
```
2024-09-04T11:12:09Z stop container
2024-09-04T11:12:44Z remove container...
```
Slow Container Image download
Two EU datacenters, EU-SE-1 and EU-RO-1, are experiencing an extreme slowdown during Docker container image downloads, to the point where our scaler can't keep up with load spikes because it takes > 30 minutes to start up a pod.
This needs to be resolved, as it's directly costing us money: we can't scale properly, so our queue keeps spiking and building, and we're forced to use on-demand instead of spot because of the slow download speed....
Can I specify CUDA version for a pod?
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.4, please update your driver to a newer version, or use an earlier cuda container: unknown
vLLM-based container image fails to start...
Solution:
On the deploy page, click Filters and you can specify the CUDA version there.
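The "unsatisfied condition: cuda>=12.4" error means the host's NVIDIA driver is older than what the container image requires, so besides filtering at deploy time it can help to check what a given host actually supports. A quick sketch, run inside a pod:
```
# The "CUDA Version" reported by nvidia-smi is the newest CUDA runtime the host
# driver supports; the container image must not require more than this.
nvidia-smi | grep "CUDA Version"

# What the image itself ships (only if the CUDA toolkit is installed in the image):
nvcc --version
```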
Pods won't start
Looks like auth to Hugging Face failed; I cannot launch any pods. I tried multiple configs, same result. Clicking on Start Web Terminal does nothing; sometimes the Connect to Jupyter button appears but does not do anything.
Pod ID: 5d15c6q1grfm6p
```
.254316737Z ...done....
```
Create a POD with a full Intel Sapphire Rapids CPU chip for a parallel algorithm scalability test
Hi,
I usually create PODs for GPU tasks and access them through SSH, so I am very familiar with that workflow. But now we need to rent a POD with just a modern Intel CPU fully available to us. In particular, we need one with the Intel Sapphire Rapids architecture, so that it supports the AMX matrix instructions. This is for a parallel CPU algorithm for which we need to obtain performance and energy-consumption results (plots).
I went through the RunPod menus but could not find options on the CPU side, nor exact info on the CPU model of a pod. Am I missing something obvious?
Thanks in advance...
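For what it's worth, once inside any pod the exact CPU model and AMX support can be confirmed directly; a small sketch:
```
# Exact CPU model of the host exposed to the pod.
lscpu | grep "Model name"

# AMX support: Sapphire Rapids exposes the amx_bf16, amx_int8 and amx_tile flags.
grep -o 'amx[a-z_0-9]*' /proc/cpuinfo | sort -u
```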
My pod has been stuck during initialization
ogw47gdxzk3a26 - stuck while pulling the image. Could you check what happened and handle the issue? Our infra is not ready to handle this kind of error.
Creating instances with a bunch of open ports
I'm using several GPU pods and I'm running into a lack of open ports. AFAIK, the number of ports is restricted when creating instances - at most 10 ports are supported.
How can I get 20 or 30 ports while creating an instance?...
creating instance from an image file
I want to create an instance from an image file (faster than using a registry) - any idea how to do it? I'd prefer to use the RunPod storage, because it is faster that way.
Creating pods with different GPU types.
Hello, can I create pods with different GPU types? Say I want to create a pod with 2 A40s and 1 RTX A5000. I ask because there is a gpuTypeIdList property in the RunPod GraphQL specs. Also, it would be amazing to have that feature. Thanks!
Slowish Downloads
I'm trying to set up a pod running ComfyUI for Flux at the moment, and it's going to take 30-40 mins just to download the models at the speed it's running at.
```
Downloading 1 model(s) to /workspace//storage/stable_diffusion/models/unet...
Downloading: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors
0K .......... .......... .......... .......... .......... 0% 10.9M 34m23s...
```
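One way to check whether the bottleneck is the single wget connection rather than the pod's link is to pull the same file with a multi-connection downloader. A hedged sketch, assuming a Debian/Ubuntu-based image; if the repo is gated, an authorization header would also be needed:
```
# aria2c may need installing first.
apt-get install -y aria2

# Fetch with up to 16 parallel connections into the same models directory.
# Add --header="Authorization: Bearer $HF_TOKEN" if the Hugging Face repo is gated.
aria2c -x 16 -s 16 -d /workspace/storage/stable_diffusion/models/unet \
  "https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors"
```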
can't cloud sync with Backblaze B2
I need help - I can't do a cloud sync with Backblaze B2.
I put in the key ID, the application key, and the bucket root path,
but it says "Something went wrong!"...
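As a stopgap while the Cloud Sync error gets sorted out, the same copy can be done by hand with rclone, whose B2 backend takes the key ID as `account` and the application key as `key`. A hedged sketch; the remote name, bucket and path are placeholders:
```
# One-shot rclone remote for Backblaze B2 (account = key ID, key = application key).
rclone config create b2remote b2 account=<keyID> key=<applicationKey>

# Push the workspace to the bucket path intended for Cloud Sync.
rclone copy /workspace b2remote:<bucket>/<path> --progress
```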
How do I deploy a Worker with a Pod?
I have deployed a worker as a Serverless deployment. I expected to be able to deploy the exact same image to a Pod and get an endpoint URL to make a similar worker request, but I'm not having success.
I am currently using the following as the initial entrypoint for handler.py...
Is there any doc that discusses how to get a Serverless Worker deployed to a Pod?
thx....
```
runpod.serverless.start({"handler": handler})
```
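Running that serverless entrypoint as-is inside a Pod doesn't expose an HTTP endpoint by itself. As far as I understand the runpod Python SDK, it ships a local test server that wraps the same handler, which could then be exposed through the Pod's HTTP ports; a hedged sketch (flag names, port, and endpoint path are my assumptions from the SDK docs):
```
# Start the SDK's local FastAPI test server around handler.py (port 8000 by default,
# per my reading of the SDK docs; bind to 0.0.0.0 so the pod proxy can reach it).
python handler.py --rp_serve_api --rp_api_host 0.0.0.0

# Then add 8000 as an HTTP port on the pod and call it, e.g.:
curl -X POST http://localhost:8000/runsync \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "hello"}}'
```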
Funds not appearing in account balance
Hi - I deposited 300 dollars in my account. I got emailed the receipt. But the funds haven't been deposited as credit - could you look into this please?
Very inconsistent performance
I recently started using RunPod and am a fan of the setup simplicity and pricing. But I have recently noticed a huge amount of inconsistency in performance, with identical training runs taking up to 3x longer to finish. I am on the Secure Cloud. Do you know why this may be?
Can someone help me fix my TensorFlow installation so it sees the GPU?
I've been trying to fix this for over a week.
Running the official template with PyTorch 2.1.0, CUDA 11.8...
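A hedged sketch of the usual fix, assuming a reasonably recent TensorFlow: versions from 2.14 on can bundle their own CUDA libraries via the `and-cuda` extra, which avoids depending on the CUDA 11.8 toolkit baked into the PyTorch template:
```
# Install TensorFlow together with its own CUDA libraries (TF >= 2.14).
pip install -U "tensorflow[and-cuda]"

# Verify the GPU is visible to TensorFlow.
python -c "import tensorflow as tf; print(tf.__version__, tf.config.list_physical_devices('GPU'))"
```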