DB Latency
Hi everyone. On our Coder instance, we've been seeing latency between coderd and its database that causes workspaces to disconnect. I suspect the volume of database operations is overloading Postgres. The log we usually see is:
2025-01-23 18:47:21.790 [info] coderd: disconnected possibly outdated agent workspace_id=xxxx agent_id=xxxx request_id=xxxx ...
error= fetch object:
github.com/coder/coder/v2/coderd/database/dbauthz.(*querier).GetWorkspaceByID.fetch[...].func1
/home/runner/work/coder/coder/coderd/database/dbauthz/dbauthz.go:522
- context canceled
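To line these disconnects up against the Prometheus graphs, the log lines can be bucketed per minute. This is only a minimal sketch, assuming every event starts with a timestamp in the format shown above followed by the "disconnected possibly outdated agent" message; the function name and regex are my own and not part of Coder:

```python
import re
from collections import Counter

# Assumed line format (taken from the log excerpt above):
#   "YYYY-MM-DD HH:MM:SS.mmm [level] coderd: disconnected possibly outdated agent ..."
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ "
    r"\[\w+\]\s+coderd: disconnected possibly outdated agent"
)


def disconnects_per_minute(lines):
    """Count agent-disconnect log events, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Continuation lines (stack trace, "context canceled") do not match
            # and are skipped, so each event is counted once.
            counts[match.group("minute")] += 1
    return counts
```

Comparing the busiest minutes against the active-connection and transaction-duration graphs might show whether the disconnects actually coincide with DB pressure.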
I have a couple of questions here:
1. How do you usually measure DB workload? With Prometheus I can see active connections and max transaction duration, but they don't seem that high.
2. Could this be related to the autobuild DB load described in "Reduce DB load of autobuild" (https://github.com/coder/coder/issues/15082)? That issue notes that the autobuild/ package periodically queries for workspaces eligible for a state transition via GetWorkspacesEligibleForTransition. @DanielleGitHub