A100 OneTrainer stuck in downloading loop
Hi there, I am trying out the OneTrainer template and it seems to be broken for A100 GPUs. After everything has been downloaded, it goes back to the "Extracting" step and starts downloading again, in an infinite loop.
This does not happen on an RTX 4090, for example...
OneTrainer
Hi there, I am trying out the OneTrainer template, but how do I see the actual GUI, since there is no web app? Is there some VNC connection?
Locking down web accessible items
So I have a stupid question; I am still trying to understand how all of this works. Whenever I deploy a pod for Text Generation Web UI and APIs, I get a list of services that are available via HTTP from the get-go.
I already added a key so the web interface is locked, but I still need to lock down the file upload, Visual Studio Code, and the Jupyter notebook.
Does anyone know where I can read up on how to do this?...
How to fetch more than 8 GPUs on RunPod (2 nodes)
Hi, RunPod usually provides a maximum of 8 GPUs. How can I fetch more than 8 GPUs?
Environment Variables are not set in "SSH over exposed TCP" for the root user
The template env variables are not available in the "SSH over exposed TCP" connection, which logs in as the root user. I am wondering whether this is because they are only set for the "Basic SSH" user?
Environment variables missing
Hello,
I am creating a pod with environment variables, but it doesn't seem to work. When I connect via SSH, and echo $ENV_VAR_NAME it prints nothing.
Am I missing something?
Also, printenv doesn't show the default environment variables from RunPod, nor the environment variables I added.
I am using my own template, but the Docker image is built on top of an official RunPod image....
Solution:
It's Docker, yeah. Environment in Linux is per user, I guess, so when you log in using SSH, your env vars won't be there, because the Docker container starts your application as a different user.
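One possible workaround, as a minimal sketch: on Linux, `/proc/<pid>/environ` holds the environment a process was started with, as NUL-separated `KEY=VALUE` entries. Assuming PID 1 inside the container is the process that received the template variables (an assumption, not something confirmed in this thread), an SSH session running as root could read them from there. The helper names below are hypothetical.

```python
from pathlib import Path


def parse_environ(raw: bytes) -> dict[str, str]:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        # Skip empty trailing entries and anything without a key/value separator.
        if not entry or b"=" not in entry:
            continue
        # Split on the first "=" only, since values may contain "=" themselves.
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def container_environ(pid: int = 1) -> dict[str, str]:
    """Read the environment a process was started with (Linux only).

    Assumption: PID 1 is the container's entrypoint and was given the
    template variables. Reading another user's process needs root.
    """
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())
```

From the SSH session, `container_environ().get("ENV_VAR_NAME")` would then return the value if the assumption holds, or `None` if the variable is not set for that process.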
GPU Memory Used Issue
Can anyone please explain to me why I am using 93% of my memory while running nothing? I imported my ComfyUI workflow but am not actively running it.
Is there any way I can reduce it?...
Avast Antivirus detects Runpod as a Trojan Virus
Hello
For some reason my antivirus detects api.runpod.io as a Trojan horse, which renders the service unusable.
The Alert pops up after selecting a GPU to use, then clicking on the "Change Template" button to select a Pod Template....
Vulkan
For the GPU-accelerated remote desktop, where can I find details on the GPU? I see OpenGL is installed; can I install Vulkan as well?
Quick Question for SSH connection
Quick question for you guys. I ran train.py in my remote container over a VS Code SSH connection. I don't know why the run was killed or stopped right after my Wi-Fi disconnected. Can anyone help me out? I didn't stop my pod.
Solution:
I prefer to use tmux.
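A rough sketch of what that could look like: `tmux new-session -d -s <name> <command>` starts the command in a detached session, so it is not tied to the SSH connection. The session name `train` and the wrapper functions are made up for illustration, and this assumes tmux is installed in the container.

```python
import subprocess


def tmux_detached_command(session: str, command: str) -> list[str]:
    """Build the argv that starts `command` in a detached tmux session."""
    # -d: do not attach to the new session; -s: name the session.
    return ["tmux", "new-session", "-d", "-s", session, command]


def start_detached(session: str, command: str) -> None:
    """Launch the command inside tmux so it is not a child of the SSH session."""
    subprocess.run(tmux_detached_command(session, command), check=True)
```

For example, `start_detached("train", "python train.py")` starts the training run, and `tmux attach -t train` from a later SSH session reattaches to see its output.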
Failed to initialize NVML: Unknown Error
(compress) root@1908bfec7b85:/workspace# nvidia-smi
...
Failed to initialize NVML: Unknown Error
Failed to initialize NVML: Unknown Error
RunPod Fast Stable Diffusion is not working
So I started a pod today, but it failed. I think it is related to the LastBen repo. Here is part of the logs:
2024-07-19T10:54:54.462970632+02:00 Starting Nginx service...
2024-07-19T10:54:54.474794604+02:00 * Starting nginx nginx
2024-07-19T10:54:54.482832995+02:00 ...done.
2024-07-19T10:54:54.483416353+02:00 Running pre-start script......
Help with Stable-diffusion-Automatic(1111) webui
I could not find the IP-Adapter control type in ControlNet in the web UI; the web UI is not showing an IP-Adapter option to select. Kindly help me with that.
Help with Connections from my website
I want to create a web application using a pod but I don't know how I can make connections from my website if every time I restart the pod the IP and TCP port mapping change. Any experience with this? Any tutorial? Thank you so much!
Solution:
The IP and TCP port can be read from environment variables,
so you can figure something out to pass them into your other code.
this...
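As a hedged sketch of the idea: the pod can look up its current public IP and mapped TCP port at startup and hand them to whatever registers the endpoint with the website. The variable names `RUNPOD_PUBLIC_IP` and `RUNPOD_TCP_PORT_<internal port>` are assumptions here (the thread does not name them), so they are parameters you can change, and `public_endpoint` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def public_endpoint(
    internal_port: int,
    env: Mapping[str, str] = os.environ,
    ip_var: str = "RUNPOD_PUBLIC_IP",          # assumed variable name
    port_var_prefix: str = "RUNPOD_TCP_PORT_",  # assumed variable prefix
) -> Optional[tuple[str, int]]:
    """Return (public_ip, external_port) for an internal port, or None if unset."""
    ip = env.get(ip_var)
    port = env.get(f"{port_var_prefix}{internal_port}")
    if not ip or not port:
        return None
    return ip, int(port)
```

On each pod start, the result of something like `public_endpoint(8000)` could be posted to your own backend so the website always knows the current address.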
Cannot start fast stable diffusion notebook
It was fine this morning, but now I'm getting error messages when starting the pod, such as invalid username and password. When following the link to the LastBen repo on Hugging Face, it seems to no longer exist.
Solution:
Template has been discontinued/abandoned. RunPod should delist it.
How to stop scripts from replacing my configuration
I have a version of ComfyUI, together with a version of Python and its dependencies, in a directory named /workspace on a storage volume. Every time this pre_start script runs, it wipes out my lib and starts over. I tried modifying the script, but it gets regenerated every time I restart the pod. How do I stop this?
Trouble with Pod HTTP Service Port
First-time user of runpod.io.
Went through the steps of creating a Pod and used TheBloke Local LLMs One-Click UI.
When I click on "Connect" I am not able to connect to the HTTP Service Port. It reads "Not Ready"....
Please help - Connect web to Pod
I am trying to connect my website to the Pod using WebSocket. My website is served over HTTPS and I get errors when trying to connect to a Pod port. Does anyone know how to do something like that?
Thanks!...