June Thai
Cannot SSH over exposed TCP (multiple pods, tested from different local machines)
Hi @here, I cannot SSH over exposed TCP, but I am able to connect with basic SSH. I suspected my Docker image at first, but I have the same issue with multiple Docker images. I tested it from multiple local machines.
This is the verbose error message:
debug1: Reading configuration data ~/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 21: include /etc/ssh/ssh_config.d/* matched no files
debug1: /etc/ssh/ssh_config line 54: Applying options for *
debug2: resolve_canonicalize: hostname xxx.xxx... is address
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts' -> '~/.ssh/known_hosts'
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts2' -> '~/.ssh/known_hosts2'
debug1: Authenticator provider $SSH_SK_PROVIDER did not resolve; disabling
debug3: channel_clear_timeouts: clearing
debug3: ssh_connect_direct: entering
debug1: Connecting to xxx.xxx... [xxx.xxx...] port 13454.
debug3: set_sock_tos: set socket 3 IP_TOS 0x48
debug1: connect to address xxx.xxx... port 13454: Connection refused
ssh: connect to host xxx.xxx... port 13454: Connection refused
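A note on reading the log above: "Connection refused" generally means the TCP connection attempt was actively rejected, which usually points to nothing listening on that port (or the port mapping not reaching a running SSH daemon) rather than a key or authentication problem. A minimal sketch for checking the port outside of ssh, assuming Python 3 on the local machine; `probe_tcp` is a hypothetical helper name, and the host and port in the usage comment are placeholders, not values from this thread:

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        # create_connection performs the TCP handshake only; no SSH traffic is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The remote side rejected the connection: typically no listener on that port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a firewall dropping packets or a wrong address.
        return "timeout"
    except OSError as exc:
        # Anything else (DNS failure, network unreachable, ...).
        return f"error: {exc}"


# Usage (placeholder values, replace with the pod's public IP and exposed port):
# print(probe_tcp("203.0.113.10", 13454))
```

If this reports "refused" from several machines, the problem is more likely on the pod side (SSH daemon not running, or the exposed port not mapped to port 22 in the container) than on the client.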
12 replies
Unable to start pod with llm-foundry image
I'm trying to launch a pod with the llm-foundry image (https://github.com/mosaicml/llm-foundry/tree/main?tab=readme-ov-file#mosaicml-docker-images), but the pod is stuck in initialization without any error messages.
8 replies