Our Cloudflare Worker (backed by Hyperdrive) had a big spike in Errors and Wall Time today, starting at around 10:30am PT. On Hyperdrive, I don't see any spikes in latency, but I did see a couple of errors on each of our Hyperdrive instances at ~10:30am PT. I'm struggling a little with how to debug or fix this: most traffic is fine, but our P999 wall time jumped to 70k ms. All of our backing databases look completely normal, and I'm able to query Hyperdrive normally locally
Hey there, do you still see this big spike in Hyperdrive errors, or has it resolved? Also, what errors do you see?
I just saw 1-3 errors in each of our Hyperdrive instances (and I'm not sure how to see what the errors were). On our Worker overall, we see these ongoing spikes in Wall Time that we can't really debug
Client disconnected errors seem to be trending down but not 0


In case it's useful:
- Account ID: cf4bd8e45a557fecf50a1b2af74b8453
- Worker name: spindl-adserver
- Hyperdrives: 5dbce450a32a4279a8cdf3a8596ee308, 4d4fe23ab437464f86f59bb8ed897e88, dc0758eacd8945a4b2bea40cb1654223
Thank you! I'll look into this and let you know what I find
One other thing I just enabled, to try to get out of these infinite timeouts, is a Postgres statement timeout:
ALTER ROLE <role> SET statement_timeout='10s';
Though I think it will only kick in on new connections, and I'm not sure how often the connections are refreshed
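Since the ALTER ROLE setting only applies to sessions opened after the change, a complementary option (assuming the node-postgres 'pg' driver, which this stack uses) is to put the timeouts in the pool config so every new connection gets them at startup. A sketch with illustrative values:

```javascript
// Sketch, assuming node-postgres ('pg'). These options apply to each new
// connection the pool opens, so they don't depend on the ALTER ROLE setting
// reaching recycled connections. The values and env var are placeholders.
const poolConfig = {
  connectionString: process.env.DATABASE_URL, // e.g. the Hyperdrive connection string
  statement_timeout: 10_000,      // server-side: Postgres cancels statements after 10s
  query_timeout: 15_000,          // client-side: 'pg' stops waiting after 15s
  connectionTimeoutMillis: 5_000, // fail fast if a connection can't be established
};
// const pool = new (require("pg").Pool)(poolConfig);
```

Keeping the client-side `query_timeout` slightly above the server-side `statement_timeout` lets the server cancel first, so the client usually sees a clean Postgres error rather than a dangling socket.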
So far all I see is that some connection disconnects occurred, which is nothing too out of the ordinary. I'll keep looking to see if I notice anything else from the Hyperdrive side
Two observations from our side:
1. The high wall time spikes seem to correlate with Hyperdrive issues. We saw the same spike on Saturday when us-east-1 databases were broken through Hyperdrive. During that incident, we didn't see any changes in the Hyperdrive metrics (latency or error rate), but of course saw a lot of errors and wall time spikes
2. The client disconnect errors that we see when Hyperdrive fails don't trigger our Sentry alerts. I'm not sure why - it could be that, given how that error presents itself, our in-Worker Sentry integration doesn't fire or doesn't get a chance to drain
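On the drain issue, one pattern that can keep reports from being dropped when the invocation is torn down is to catch at the top of the fetch handler and hand the reporting promise to `ctx.waitUntil`, which keeps the Worker alive until it settles. A sketch, not our actual integration - `report` stands in for whatever Sentry flush call is in use:

```javascript
// Sketch: wrap a Workers fetch handler so thrown errors are reported before
// teardown. `handler` is the app logic; `report` is a hypothetical async
// reporter (e.g. a Sentry capture-and-flush). ctx.waitUntil keeps the
// invocation alive until the reporting promise resolves.
function withErrorReporting(handler, report) {
  return async (request, env, ctx) => {
    try {
      return await handler(request, env, ctx);
    } catch (err) {
      ctx.waitUntil(report(err)); // don't await: respond now, deliver in background
      return new Response("Internal error", { status: 500 });
    }
  };
}
```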
do you have a connection timeout configured for your database driver?
Got it, thanks. Excited for the visibility work! On the connection timeout - is this at the level of Postgres (like statement_timeout), or of the client-side library (we use the Node library 'pg')?
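Both levels exist, and independent of either, a driver-agnostic guard in the Worker itself is to race each query promise against a timer so a hung query can't pin wall time. A sketch - `withTimeout` is a hypothetical helper, not part of 'pg':

```javascript
// Sketch: reject if `promise` (e.g. pool.query(...)) doesn't settle within
// `ms` milliseconds. Note this only stops the Worker from waiting; it does
// not cancel the query on the server - statement_timeout is still needed
// for that.
function withTimeout(promise, ms, label = "query") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```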