Kushagra
RunPod
Created by Kushagra on 8/8/2024 in #⚡|serverless
Copy network volume contents to another.
What is the way to copy the contents of one network volume to another network volume?
3 replies
RunPod
Created by Kushagra on 8/5/2024 in #⚡|serverless
A100 80GB GPUs unavailable
Hello Team, We have multiple production endpoints that use A100 80GB serverless GPUs. Suddenly, A100 and H100 GPUs are unavailable on all the endpoints. Is there any maintenance work going on?
1 reply
RunPod
Created by Kushagra on 7/30/2024 in #⚡|serverless
New release is taking too long.
No description
7 replies
RunPod
Created by Kushagra on 7/30/2024 in #⛅|pods
Error after restarting the containers.
Command: docker compose up Error: WARN[2024-07-30T12:12:22.042930970Z] Controller.NewNetwork mia-runpod-backend_default: error="failed to create DOCKER-USER IPV6 chain: iptables [+] Running 3/4es --wait -t filter -N DOCKER-USER: ip6tables v1.8.4 (legacy): can't initialize ip6tables table `filter': Table does not exist (do
3 replies
RunPod
Created by Kushagra on 7/30/2024 in #⚡|serverless
Error response from daemon: Container is not paused.
Hello Team, After deploying a new Docker image on a serverless endpoint, I am getting the following error in my system log: 024-07-30T11:56:27Z error starting: Error response from daemon: Container 2a638b70551885c464f48892d2d0fc9eed7eb590fbda42b33841d7e84b23b307 is not paused Can someone please help me with this?
1 reply
RunPod
Created by Kushagra on 7/23/2024 in #⛅|pods
Error response from daemon: driver failed external connectivity on endpoint.
Suddenly, I am getting the error below when I run docker compose up. Docker was working fine on the pod. I just made some code changes and rebuilt it, and now I am getting the following errors: Gracefully stopping... (press Ctrl+C again to force) Error response from daemon: driver failed programming external connectivity on endpoint mia-runpod-backend-engine-1 (f4a69cb1cbf0100d22af23c3d5dc5a09aeeac3425476d4bc8bfbf886e42a77f1): Unable to enable MASQUERADE rule: (iptables failed: iptables --wait -t nat -A POSTROUTING -p tcp -s 172.19.0.4 -d 172.19.0.4 --dport 8000 -j MASQUERADE: /usr/sbin/iptables: error while loading shared libraries: libip4tc.so.2: cannot close file descriptor: Error 24 (exit status 127))
8 replies
RunPod
Created by Kushagra on 7/18/2024 in #⚡|serverless
Lightweight Docker image for inference generation.
Hello All, I am currently using the pytorch/pytorch:2.2.1-cuda12.1-cudnn8-runtime image for my serverless endpoint. The issue is that my GitHub Action to build and push the Docker image fails with the error: ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device Is there any recommended lightweight Docker image that I can use?
7 replies
RunPod
Created by Kushagra on 7/17/2024 in #⚡|serverless
How to update a serverless endpoint with a new version of the Docker image?
When we push a new version of a Docker image to Docker Hub, does the serverless endpoint automatically update the workers, or do we need to publish the new image manually on the endpoint?
8 replies
RunPod
Created by Kushagra on 7/15/2024 in #⚡|serverless
How to use a volume with serverless endpoints?
Hello All, We have multiple serverless endpoints that download the model and generate inferences. Is there a way to mount a common volume on all the serverless endpoints? We don't want to download the model every time an endpoint boots up. It would be nice if you could please share a concrete example.
8 replies