jojje
RunPod
•Created by blistick on 1/5/2024 in #⚡|serverless
What does "throttled" mean?
I think it's the word choice of "throttled" that causes the confusion: RunPod is hijacking an established term, whose de facto meaning is "mitigation of a user-induced policy violation", to instead mean "pending" or "queued", i.e. waiting for the requisite (resource) conditions before executing a task. Had they used either of the latter terms, I expect there wouldn't be nearly as many questions about serverless throttling.
16 replies
What does "throttled" mean?
@justin [Not Staff] Are the "workers" bound to a specific data center (region)?
If not, then I don't see why adding more workers would help, since the supply of the requested GPUs wouldn't change one iota: they'd all just be throttled as well, for the same reason the initial ones were. But if a worker is pinned to a specific colo, then it would make sense, as the resource horizon would be limited to that single colo.
Do you know which of these holds true for workers? (Is the data center pinned at worker creation, or does pinning only happen once a resource match has been found?)
16 replies
Deploy custom private docker image
3. Finally, at RunPod, go to Account > Settings > Container Registry Auth and add the credential.
4. When launching pods, select/attach that named credential (dockerhub-read in the example above); otherwise the pod won't actually use it when pulling the image.
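For reference, the Container Registry Auth entry might look something like this (field labels are my assumption from the dashboard; for Docker Hub, use a read-only access token rather than your account password):

```
Name:     dockerhub-read
Username: <your Docker Hub username>
Password: <read-only access token>
```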
17 replies
My pod is taking forever to download the image
Thanks. Did they say why?
I'm asking because it's faster than dockerhub for me in the US.
Edit: I followed your suggestion @rahul and pushed a copy to dockerhub. The download from there was almost instant. Thanks for this tidbit!
So it seems there is some serious issue between GitHub and RunPod, and it would be good to learn why.
But my problem is at least resolved for now. Thx again.
9 replies
pods just keep stopping without any reason why when downloading?
Seems most plausible. Given the 20 Gbps speed, that file was probably already cached in RunPod's in-datacenter reverse proxy.
Had it not been cached there and instead been fetched directly from Hugging Face, it could have been an unlucky draw on the network connection to the HF CDN, in which case one just has to redo the download to get a "healthy" link. But in this case the "luck of the draw" explanation seems far-fetched, since the in-DC cache shouldn't have that problem; at least I've never encountered one there.
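That "redo the download" step is easy to automate with a small retry loop. A minimal sketch (the helper name and curl flags are mine, nothing RunPod- or HF-specific):

```shell
#!/bin/sh
# fetch_with_retry URL OUTFILE [TRIES]: re-attempt a download on failure, so an
# unlucky connection draw just costs one more try instead of a failed job.
fetch_with_retry() {
  url=$1; out=$2; tries=${3:-5}
  i=1
  while [ "$i" -le "$tries" ]; do
    curl -fsSL -o "$out" "$url" && return 0    # -f: treat HTTP errors as failures
    echo "attempt $i of $tries failed; retrying with a fresh connection" >&2
    i=$((i + 1))
  done
  return 1
}
```

Each iteration opens a fresh connection, which is exactly the "new draw" you want when the previous link was unhealthy.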
6 replies
My pod is taking forever to download the image
Same issue here. My images are hosted on ghcr.io (GitHub's registry). This has been a problem in the RO region for the past few days. In US regions with H100s etc., the transfer is blazingly quick (~300-400 MB/s).
But in RO it's slow as molasses: I was getting around 2-3 MB/s on a 2000-Ada. I switched to a 4000-Ada hoping that machine had better connectivity, but zero difference.
At this rate it'll take forever to download the 14 GB image (with all the ML libs needed for the workload); the NVIDIA CUDA layers alone are >10 GB.
Could someone from RunPod please shed some light on the docker layer-pull issue in general?
In particular for my case: is the network link/path between RO and GitHub limited in some way? If so, which EU regions would you recommend instead?
PS. I'm using "secure cloud" if that wasn't self-evident.
9 replies
Runpod API documentation
Great. Your AI helpbot revealed the GraphQL API is documented here: https://graphql-spec.runpod.io/ (for anyone else looking)
So that means the question can be narrowed to just the REST API.
4 replies
any way to control the restart policy of pods?
Haven't found one. So, to avoid crash-loops, I wrap all my containers in an init script that execs into a "wait" process and launches the actual work in sub-processes. That way I can see any errors in the logs, and debug and fix whatever is broken, without the frigging container vanishing in a puff of smoke a second after an error happens.
The always-restart policy is only really useful for stable production workloads, not for R&D or experimental setups, which is all I'm using RunPod for.
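A minimal sketch of that wrapper idea (paths and the workload command are placeholders, not anything RunPod-specific):

```shell
#!/bin/sh
# Entrypoint sketch: run the real work in a sub-process so a crash doesn't kill
# PID 1 (and with it the container) before the logs can be inspected.
LOG=/tmp/workload.log

run_supervised() {
  "$@" >"$LOG" 2>&1 &                            # launch the actual work in the background
  if wait $!; then status=0; else status=$?; fi  # block until it exits, success or crash
  echo "workload exited with status $status; logs kept at $LOG"
}

# The real entrypoint would then keep PID 1 alive for debugging, e.g.:
#   run_supervised python /app/main.py
#   exec sleep infinity
```

Since PID 1 never exits on a workload crash, the restart policy never kicks in and you can shell into the pod and read the logs at leisure.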
7 replies