RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⛅|pods

Unable to start pod using GraphQL

I am trying to create a pod using the GraphQL endpoint, but I am getting a 400 status response. Here are the request and the response. Please let me know how to get this working. ``` Sending GraphQL query: mutation {...

Differentiating between the pod state, "starting" vs "stopping"

When I start a pod and fetch its details through the GraphQL API, the "runtime" is None, but when I stop it, the "runtime" is None as well. Is there a way to differentiate between these two states?
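One possible way to tell the two cases apart, sketched under assumptions: the pod object is assumed to also carry a `desiredStatus` field with values such as "RUNNING" and "EXITED". That field name and its values are assumptions here and should be checked against the GraphQL schema; `classify_pod_state` is a hypothetical helper, not part of any RunPod SDK.

```python
from typing import Any, Mapping


def classify_pod_state(pod: Mapping[str, Any]) -> str:
    """Guess a pod's state from a GraphQL pod object.

    Assumption: the pod dict has a "desiredStatus" key ("RUNNING", "EXITED", ...)
    next to the "runtime" key mentioned in the thread.
    """
    runtime = pod.get("runtime")
    desired = pod.get("desiredStatus")

    if runtime is not None:
        # A populated runtime means the container is up.
        return "running"
    if desired == "RUNNING":
        # Asked to run, but no runtime yet: still starting.
        return "starting"
    if desired in ("EXITED", "TERMINATED"):
        # Asked to stop and no runtime: stopping or already stopped.
        return "stopped_or_stopping"
    return "unknown"
```

If the field turns out not to exist, the fallback branch returns "unknown" rather than guessing.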

Building and deploying dockerfile from Pod

Has anyone figured out how to properly build and push a Dockerfile in a RunPod pod? https://docs.runpod.io/tutorials/pods/build-docker-images I'm trying to do this for my custom serverless worker because my personal PC has some Docker issues that I haven't figured out over multiple days of debugging (I think WSL got corrupted somehow, but that's a different issue). But every time I try to run bazel run //:push_custom_image, it shows the following error: ``` WARNING: Target pattern parsing failed....

Still waiting for logs but I can Console in?

My container is still in the Waiting For Logs state, but I can access it through the web console, run services, and access them through HTTP. Even after doing this, it still shows Waiting For Logs. The Docker entrypoint script does not seem to have completed, as the services it should run are not started. Which logs does the container log view in the RunPod UI actually read? Is it what the container prints to the screen, or a log file?...

price

From which point in the provisioning of the pod am I getting charged?

Assistance with Deploying AI App on RunPod

I recently purchased RunPod to deploy my AI app, but I could use some guidance on implementing my end-to-end project. I have a project folder, "X," that includes my custom models (in both ONNX and PyTorch formats) and Flask APIs. It’s working well locally, but I'm a bit confused about transitioning to RunPod. Specifically, I’m unsure about how to best leverage Pods, serverless options, and templates to set it up on your platform. I've explored the documentation but still have questions on structuring and deploying it effectively. Could you or someone from your team provide guidance or resources to help me set up and run my project on RunPod?...

Can't SSH to RunPod

-- RUNPOD.IO -- Enjoy your Pod #1r3czjoca6n3zh ^_^ Error response from daemon: Container b7b51b7b9a1b7e03f346032b3339de15d9465317e632972e2b5be6b3584d8759 is not running...

How fast are network volumes?

Hey, this kind of belongs here, and also in serverless. We're currently architecting some stuff for a client, and the question we have is exactly how fast network volumes are. My limited benchmarks suggest they add about a minute to the execution time for an SDXL model, because the model needs to be loaded into RAM. This seems to be much faster with local storage. What are your experiences? Any pieces of advice? Any gotchas? Thank you so much 🙂 ...
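For comparing a network volume against local storage, a rough read-throughput check can be sketched with the standard library alone. `read_throughput_mb_s` is a hypothetical helper, and any paths passed to it (for example a model file on each kind of storage) are placeholders, not paths taken from this thread.

```python
import time
from pathlib import Path


def read_throughput_mb_s(path: str, chunk_size: int = 8 * 1024 * 1024) -> float:
    """Read a file sequentially and return the observed throughput in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with Path(path).open("rb") as f:
        # Read in fixed-size chunks so large model files do not need to fit in RAM.
        while chunk := f.read(chunk_size):
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)
```

Calling it once on the same file stored on each volume gives a first comparison. A second read of the same file may be served from the OS page cache, so only the first (cold) read is likely to reflect the storage itself.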

Please help me.

I have to deploy a backend built with Flask on a VPS with a GPU. The backend performs object detection using YOLO. How do I do this on RunPod? And what is this error?...

Very slow network storage

I deployed a pod with network storage on US-KS-2 and the storage disk is extremely slow
python -m venv venv
...

Pod not starting up properly anymore

When I deploy a pod with the "RunPod Stable Diffusion" template on demand, it's not starting up properly, even if I wait for an hour. I cannot launch JupyterLab or the SD web UI. Did something change with the platform? This used to be a very easy and straightforward process without issues....

PuTTY for SSH? Any clues?

I'm trying to connect with PuTTY. I think I converted the ed25519 key to .ppk correctly, and I know how to use it to authenticate. I can connect and get prompted with "login as:", so I just use "root" because I can't find a username in the documentation. Which username do we use? Am I doing this right?...

Unable to Connect AWS to RunPod

I am encountering an issue while attempting to connect AWS to RunPod. Despite multiple attempts, the connection fails, and we have been unable to establish a successful link between the two services. Any guidance or troubleshooting steps to fix this issue would be greatly appreciated. Thank you in advance for your support!

External IP Ranges (for an AWS VPC Security Group)

I've been using an RDS database to collate the results of the work my pods are doing, and that's been fine so far with one or two running, but we're about to scale to a lot more. This means that I can no longer log into a pod and ping something to get its IP address to add to my RDS VPC inbound whitelist. I was looking at maybe AWS PrivateLink or mTLS, but neither seems to be supported. ...

How can I access a network volume from a JupyterLab notebook?

I am currently running a test on some LLM models and trying to set up a network volume so that I can download and use some of the larger models while also working on some embedding models (I am not able to download both the LLM and the embedding model into the default volume at the same time). How can I move the model to the network volume so that I won't get an out-of-space error on the volume? Thanks!...

I need to reinstall the pip requirements for ComfyUI every time I start a pod.

I need to reinstall the pip requirements for ComfyUI every time I start a pod. I understand there is a difference between disk and pod volume. I assume I have to update the correct one. So which one do I need to update to permanently update ComfyUI, and how do I do that? I already tried solving this with the ask-ai bot without success. Thank you.

RTX 6000 Ada pods breaking

I am using the ULTIMATE Stable Diffusion Kohya ComfyUI InvokeAI template on multiple RTX 6000 Ada pods. Lately I have run into many issues with the pods breaking while using ComfyUI. The latest incident occurred shortly after loading multiple IPAdapter models through the Model Manager. After loading the models, the ComfyUI page froze, and then gave me an error saying "Error loading workflows: Unexpected token '<", "<!DOCTYPE "... is not valid JSON". The previous incident occurred directly after loading a workflow JSON into ComfyUI. Same symptoms and issues. I am unsure how the two are connected. ...

How can I shrink my volume size?

I increased my volume size for some work, and now there is an error saying that the volume size cannot be decreased. I deleted all the files to free up space, and now I do not want to pay for the whole 100 GB volume....

Can I use the filtering syntax when calling the myself query?

query Pods { myself { pods(filter: { name: { startsWith: "basic::" } }){ id name...
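Whether the `pods` field accepts a server-side `filter` argument is not confirmed in this thread. As a fallback, the same selection can be done client-side after fetching all pods. The sketch below assumes the usual GraphQL response shape (`data` → `myself` → `pods`, each pod with `id` and `name`); `extract_pods` and `pods_with_prefix` are hypothetical helpers.

```python
from typing import Any, Dict, List, Mapping


def extract_pods(response: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Pull the pod list out of a decoded GraphQL response (assumed shape)."""
    data = response.get("data") or {}
    myself = data.get("myself") or {}
    return list(myself.get("pods") or [])


def pods_with_prefix(pods: List[Dict[str, Any]], prefix: str) -> List[Dict[str, Any]]:
    """Keep only pods whose name starts with the given prefix."""
    return [p for p in pods if (p.get("name") or "").startswith(prefix)]


# Example with made-up data in the assumed response shape.
sample_response = {
    "data": {
        "myself": {
            "pods": [
                {"id": "a", "name": "basic::worker-1"},
                {"id": "b", "name": "other::worker-2"},
            ]
        }
    }
}
matching = pods_with_prefix(extract_pods(sample_response), "basic::")
```

This trades extra data transfer for not depending on any filter syntax the API may or may not support.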

How to connect to SFTP via rclone/FUSE

I've got a question that prevents me from using RunPod for my use case at the moment: I'd like to connect RunPod to an SFTP server and mount the corresponding remote volume on the local filesystem of the pod container, so it can be used like any other directory by apps like Blender. The usual way to do this is to use rclone or a FUSE mount. This would let me connect multiple pods to a folder for them to write to (I know I can do this with a RunPod network volume), but also to read from that SFTP drive, which is a local NAS in our office on which we regularly update a lot of massive files, in order to avoid doing the sync manually. Crawling through the docs, for now I can see how to connect TO the pod via SSH, export data snapshots to S3 & co, and use a network volume. I've come across the issue below, which says: "Fuse is not supported by Runpod because it requires granting privileges to the container. Since Fuse is a kernel module, it needs to be supported by the host". Another option would be to use a cloud sync utility like Dropbox/GDrive, but this would involve an additional data transfer from our office volumes to the cloud. Would love to know if there is a workaround from your team!...