Hey all - I'm having quite a lot of trouble with WebSockets and Durable Objects in production. I believe I've narrowed the cause down to a KV put (Workers KV, not the Durable Object's transactional storage). With the put disabled, things seem to work well enough, but with it enabled, WebSocket message processing hangs. It also seems to be getting worse over time. None of this happens when running locally.
13 Replies
bun (12mo ago)
Does a Durable Object ever lose its state?
zegevlier (12mo ago)
In-memory state, yes. What you stored, however, should not be lost.
bun (12mo ago)
Even when being redeployed?
zegevlier (12mo ago)
Yup.
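A minimal sketch of the distinction in the answer above, assuming a storage object with `get` and `put` methods like the ones on a Durable Object's `state.storage`. `KeyValueStorage`, `Counter`, and the `"hits"` key are hypothetical names used only for illustration; the interface is a local stand-in so the snippet runs outside the Workers runtime.

```typescript
// Hypothetical stand-in for the get/put subset of a Durable Object's storage.
interface KeyValueStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

class Counter {
  // In-memory state: reset whenever the object is evicted or redeployed.
  inMemoryHits = 0;
  storage: KeyValueStorage;

  constructor(storage: KeyValueStorage) {
    this.storage = storage;
  }

  async hit(): Promise<{ inMemory: number; stored: number }> {
    this.inMemoryHits += 1;
    // Stored state: read back from storage, so it survives a new instance.
    const stored = ((await this.storage.get<number>("hits")) ?? 0) + 1;
    await this.storage.put("hits", stored);
    return { inMemory: this.inMemoryHits, stored };
  }
}
```

Constructing a second `Counter` over the same storage models a restart: `inMemoryHits` starts again from zero, while the stored count continues from where it left off.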
bun (12mo ago)
wtf, I just don't get why it's cheaper than KV if I can use it as a KV store
Hello, I’m Allie!
Because a single KV namespace can handle millions of requests per second. A single DO can't.
Jacob Wright (12mo ago)
I decided to add a deleteOldConnections method that runs every 5 minutes with the alarm feature so if a Pool crashes the old connection records will be cleaned up.
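A rough sketch of the periodic cleanup described above, assuming the Durable Object alarm API (`storage.getAlarm()` and `storage.setAlarm()`). `AlarmStorage` is a local stand-in interface, and `ensureCleanupAlarm`, `onAlarm`, and the `deleteOldConnections` callback are hypothetical names; the actual implementation is not shown in the thread.

```typescript
// Local stand-in for the alarm subset of a Durable Object's storage.
interface AlarmStorage {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

// "Every 5 minutes", as stated in the message above.
const CLEANUP_INTERVAL_MS = 5 * 60 * 1000;

// Schedule the first cleanup only if no alarm is already pending.
async function ensureCleanupAlarm(
  storage: AlarmStorage,
  now: number = Date.now(),
): Promise<void> {
  if ((await storage.getAlarm()) === null) {
    await storage.setAlarm(now + CLEANUP_INTERVAL_MS);
  }
}

// Intended to be called from the Durable Object's alarm() handler.
async function onAlarm(
  storage: AlarmStorage,
  deleteOldConnections: () => Promise<void>, // hypothetical cleanup callback
  now: number = Date.now(),
): Promise<void> {
  await deleteOldConnections();
  // An alarm fires once, so re-arm it to get a repeating schedule.
  await storage.setAlarm(now + CLEANUP_INTERVAL_MS);
}
```

In a Durable Object class, `alarm()` would call `onAlarm(this.state.storage, ...)`, and `ensureCleanupAlarm` could run when the first connection is accepted.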
Milan (12mo ago)
Alternatively, you could check for stale connections whenever you receive a new connection? Though I would probably do what you did too.
Hello, I’m Allie!
When you push a WebSocket into the Hibernation API, is anything actually pushed to storage? If so, would it be possible to fire a webSocketClose() handler once the runtime/isolate recovers?
Jacob Wright (12mo ago)
Yes, good point. I am now actually running it from the constructor (async, with waitUntil) as well as from the alarm. The alarm ensures that if the pool doesn't start back up, the records will still be cleaned up eventually. Good thoughts. Yes, the Hibernation API allows storing metadata for each WebSocket, and I use that to attach an ID to the ws, which is what is stored in my connections table in D1. The PoolId is the DO Pool's stringified ID and the wsId is a generated UUID, so I can find all connection records for a pool whose WebSockets no longer exist and delete them. The webSocketClose handler should delete them from connections under regular circumstances. I just needed to make sure the database didn't fill up with orphaned connections when pools die unceremoniously (which shouldn't happen too often). Thank you for your ideas and help! ❤️
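The orphan detection described above might look roughly like this. It assumes each hibernatable WebSocket exposes `deserializeAttachment()` (the counterpart of `serializeAttachment()`), and that the attachment is an object with a `wsId` field; that shape, the `SocketWithAttachment` interface, and both function names are assumptions for illustration, not the author's code.

```typescript
// Local stand-in for the part of a hibernatable WebSocket used here.
interface SocketWithAttachment {
  deserializeAttachment(): unknown;
}

// Assumed attachment shape: the UUID stored alongside each socket.
interface ConnectionAttachment {
  wsId: string;
}

// Collect the IDs of sockets that the runtime still reports as connected.
function liveConnectionIds(sockets: SocketWithAttachment[]): Set<string> {
  const ids = new Set<string>();
  for (const ws of sockets) {
    const attachment = ws.deserializeAttachment() as ConnectionAttachment | null;
    if (attachment && typeof attachment.wsId === "string") {
      ids.add(attachment.wsId);
    }
  }
  return ids;
}

// Given the wsIds recorded for this pool (e.g. rows from the connections
// table), return the ones with no matching live socket.
function findOrphanedIds(
  recordedIds: string[],
  sockets: SocketWithAttachment[],
): string[] {
  const live = liveConnectionIds(sockets);
  return recordedIds.filter((id) => !live.has(id));
}
```

The returned IDs would then be deleted from the connections table for this pool; in the Workers runtime the `sockets` argument would come from `state.getWebSockets()`.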
Hello, I’m Allie!
Yeah, I'm just not sure that acceptWebSocket actually stores anything into storage. Because it isn't async, I have a feeling that it just stores it in memory, which means that if the entire isolate/runtime is evicted, you lose that stored data too. And if it doesn't store anything to storage, then it wouldn't be able to fire webSocketClose once it came back up.
Jacob Wright (12mo ago)
Agreed. That is likely the issue we are seeing with stale connection records, and the reason we need to delete them in a process outside of webSocketClose when we detect that they no longer correspond to live WebSocket connections. I think it is reasonable to assume that these stale connections don't happen in the regular course of the lifecycle, only on crashes/eviction, so we don't need to be too aggressive in cleanup. And we could probably just check if getWebSockets().length === 0 in the constructor to decide whether to clean up connections.
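A small sketch of the constructor check suggested above, assuming a state object that offers `getWebSockets()` and `waitUntil()`. `PoolState` is a local stand-in interface, and `cleanupOnStart` and `deleteAllForPool` are hypothetical names. The idea is that if no sockets survived, every connection record for this pool must be stale.

```typescript
// Local stand-in for the parts of the Durable Object state used here.
interface PoolState {
  getWebSockets(): unknown[];
  waitUntil(promise: Promise<unknown>): void;
}

// Intended to be called from the Durable Object constructor.
// Returns true if a cleanup was started.
function cleanupOnStart(
  state: PoolState,
  deleteAllForPool: () => Promise<void>, // hypothetical: removes this pool's records
): boolean {
  // Waking from hibernation leaves sockets attached, so a non-empty list
  // means the existing records may still be valid and are left alone.
  if (state.getWebSockets().length > 0) {
    return false;
  }
  // No live sockets: any recorded connection for this pool is orphaned.
  // The constructor cannot await, so the work is handed to waitUntil.
  state.waitUntil(deleteAllForPool());
  return true;
}
```

A brand-new pool also has zero sockets, so the delete runs there too, but in that case there are no records to remove.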
Milan (12mo ago)
We store the WebSockets and their serialized attachments outside the isolate, but they're still in memory.
