Coder.com•3y ago

SSH in k8s install not working

Hi, I've gotten coder working in my k8s cluster, which is an experimental cluster on a number of VPSes. I can connect to it and log in via the access URL I've created (i.e. coder.atsomedomain.com) and via the CLI I have installed locally. I've uploaded a template and can create a workspace. However, the port forward URL and SSH aren't working. The SSH connection seems to time out. I've read through a ton of issues and threads here and I believe this might be an issue with how I have ingress set up. My initial question is, can someone explain how the connectivity is supposed to happen between the client/user and the workspace via SSH? Is port 22 necessary by chance? From what I was reading, it isn't. Yet, SSH isn't working. Any tips on how to troubleshoot this problem would be greatly appreciated.

38 Replies

ScottOP•3y ago

Ok. I reread the Architecture section of the docs and the agents in the workspaces are tunnelled via Tailscale/Wireguard. I'm guessing that is how the SSH is happening. What I'm not understanding is how that connection can happen. No service(s) is/are set up for the workspace pod(s). 🤔

Phorcys•3y ago

so, coder ssh is not "real SSH", it's SSH over WebSocket on the client side. on the workspace side, your template should install the coder agent, and that agent communicates with the server using DERP (iirc). could you show me your template? usually the issue either comes from there or from some k8s conf, but I don't have much experience using k8s. could you check if your coder workspace can communicate with the server using some kubernetes cli? (i'm thinking of a feature like docker exec where you could run a ping/curl to see if traffic goes through)
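
The check suggested above (exec into the workspace pod and see whether traffic reaches the server) could be sketched roughly as below. This is only an illustration: it assumes Python is available in the workspace image, that the access URL is the one from the question, and that the server answers on /api/v2/buildinfo. The function name check_coder_reachable is made up for this example.

```python
import json
import sys
import urllib.error
import urllib.request


def check_coder_reachable(access_url: str, timeout: float = 5.0) -> dict:
    """Fetch <access_url>/api/v2/buildinfo and report what happened.

    The endpoint path is an assumption; any URL the server is known to
    answer on would work for a basic reachability check.
    """
    url = access_url.rstrip("/") + "/api/v2/buildinfo"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return {"ok": resp.status == 200, "status": resp.status, "body": body}
    except urllib.error.HTTPError as exc:
        # Something answered, but with an error status (for example the ingress).
        return {"ok": False, "status": exc.code, "body": ""}
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, TLS error, or timeout.
        return {"ok": False, "status": None, "body": str(exc)}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check.py https://coder.atsomedomain.com
    print(json.dumps(check_coder_reachable(sys.argv[1]), indent=2))
```

Run from inside the workspace pod, a result with "status": null would point at DNS or network policy problems between the workspace and the server, while an HTTP status means the request at least reached something that speaks HTTP.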

ScottOP•3y ago

Hey. Thanks for replying. The template I used is the example one here:

https://github.com/coder/coder/tree/main/examples/templates/kubernetes

I can exec into the workspace and server pods. I can also share the logs each pod puts out.

Phorcys•3y ago

that could come in handy yeah

ScottOP•3y ago

The only change I made was upping the version to the newest in the coder_agent resource.

Phorcys•3y ago

that shouldn't cause issues at all

ScottOP•3y ago

That is what I thought too.

Phorcys•3y ago

alright, I'll have to go in a bit for some hours, I'll check when I come back

ScottOP•3y ago

What I'm uncertain about is how do the workspace agents get a line of communication to the outside world? From my understanding, for the code-server, it goes through the actual code server. But, the agents and SSH, how should they connect? I appreciate any help I can get. Thanks! and CU! 🙂

ScottOP•3y ago

This is the output from the workspace pod.

message.txt

ScottOP•3y ago

I've been reading about Tailscale. Can I assume correctly that a Tailscale server is built into the coder server v2? I noticed in the Tailscale docs that using coder as a client would need a Tailscale registration (which would be ridiculous). Maybe their docs are very outdated? Oh. Ok. I'm getting things mixed up. The Tailscale docs speak about code-server, not coder.

ScottOP•3y ago

What I'm missing, conceptually, is how this part of the communication is supposed to happen between the client and the coder agent.

ScottOP•3y ago

Cause, according to Tailscale, there are a couple of methods to get the Tailnet going in k8s. However, it seems none of them are happening with the k8s Terraform example template i.e. no service being exposed. No sidecar being added. No Proxy pod being added. etc.

Phorcys•3y ago

AFAIK coder client <-> coder server <-> coder agent on workspace, although I am not sure when tailscale comes into play. btw agent and SSH are the same thing (SSH is built in to the agent)

ScottOP•3y ago

@Phorcys

coder client <-> coder server <-> coder agent on workspace

Currently that is the only way it can happen, if that's what happens. LOL! 😄 That would also mean the diagram above is conceptually incorrect.

btw agent and SSH are the same thing (SSH is built in to the agent)

That is what I understand too, but is it peer to peer to the user? Or is the SSH tunnel proxied by coderd?

Phorcys•3y ago

proxied by coder. client <-> workspace is always proxied

ScottOP•3y ago

Ok. That means my ingress isn't allowing the TCP connectivity. As it needs to be. I think. Currently, I'm only allowing HTTPS to go through. It upgrades to websocket, but that doesn't seem to be enough.

Phorcys•3y ago

it should use http I think, but I'm unsure

ScottOP•3y ago

I'm still not sure what to do now. I'm totally stuck and the SSH capability is a requirement we have. ☹️

kyle•3y ago

I'll help! Wanna hop in Discord?

ScottOP•3y ago

Hi Kyle. What do you mean by "hop in Discord"?

kyle•3y ago

I meant that we could hop into a voice channel if you'd like, but I'm happy to debug here too if you're around!

ScottOP•3y ago

I am. 🙂

kyle•3y ago

Just to confirm from the discussion, are you allowing TCP traffic to Coder pods?

ScottOP•3y ago

Might have to go soon though for dinner.

kyle•3y ago

All good! If you have to leave just ping me. And are you proxying HTTPS at all?

ScottOP•3y ago

So, currently I have Ambassador Emissary (edge-stack) as my ingress. I have a wild-card SSL setup for my URL and am also using a sub-domain to connect to the coder server pod. Everything works, just SSH doesn't.

kyle•3y ago

That's almost certainly because of how SSH works with DERP. It uses a custom Connection-Upgrade: DERP header that allows DERP to communicate directly over the socket. It can use a WebSocket as well, but we don't have an option that exposes that at the moment.
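
To see whether an ingress lets this upgrade through, one rough option is to send the upgrade request by hand and look at the status code. The sketch below assumes the request carries Connection: Upgrade and Upgrade: DERP (which is how Tailscale's DERP HTTP client does it, as far as I know) and that the server exposes DERP at /derp. Both are assumptions, and the function names are made up for this example.

```python
import socket
import ssl
import sys


def build_upgrade_request(host: str, path: str = "/derp", protocol: str = "DERP") -> bytes:
    """Build a bare HTTP/1.1 request asking to upgrade to `protocol`."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade",
        f"Upgrade: {protocol}",
        "",  # blank line terminating the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(response: bytes) -> int:
    """Extract the numeric status code from the first line of an HTTP response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP response: {status_line!r}")
    return int(parts[1])


def probe_upgrade(host: str, port: int = 443, path: str = "/derp", timeout: float = 5.0) -> int:
    """Send the upgrade request over TLS and return the HTTP status code."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path))
            return parse_status_code(tls.recv(4096))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python probe.py coder.atsomedomain.com
    print(probe_upgrade(sys.argv[1]))
```

A 101 (Switching Protocols) would suggest the proxy passed the upgrade through. Any other status suggests the upgrade was not honored somewhere along the path, though it does not say where.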

kyle•3y ago

https://www.getambassador.io/docs/emissary/latest/howtos/websockets Try adding derp to the allow_upgrade config.
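
As a rough sketch of what that change might look like, the script below prints an Emissary Mapping with derp added to allow_upgrade, as JSON that kubectl can apply. The apiVersion, the Mapping name, the namespace, and the service address coder.coder:80 are assumptions for illustration and would need to match the actual install.

```python
import json


def coder_mapping(hostname: str, service: str, namespace: str = "coder") -> dict:
    """Return an Emissary Mapping manifest that allows the derp upgrade.

    The name, namespace, and apiVersion here are illustrative assumptions.
    """
    return {
        "apiVersion": "getambassador.io/v3alpha1",
        "kind": "Mapping",
        "metadata": {"name": "coder", "namespace": namespace},
        "spec": {
            "hostname": hostname,
            "prefix": "/",
            "service": service,
            # Upgrade protocols the proxy should pass through to the service.
            "allow_upgrade": ["websocket", "derp"],
        },
    }


if __name__ == "__main__":
    # Hostname from the question; the service address is a placeholder.
    print(json.dumps(coder_mapping("coder.atsomedomain.com", "coder.coder:80"), indent=2))
```

The output could be piped to kubectl apply -f - after checking it against the existing Mapping. Keeping websocket in the list alongside derp is deliberate, since the web UI already relies on WebSocket upgrades.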

ScottOP•3y ago

I'm going to say... it can't be that easy. Just a sec... LOL! 😄

kyle•3y ago

;p We should return a good error message when the connection upgrade fails anyways, so I'll get a fix in for that.

ScottOP•3y ago