Hey all - I'm having quite a lot of trouble with WebSockets and Durable Objects in production. I believe I've narrowed the cause down to a KV put (not transactional storage). With the put disabled, things seem to work well enough, but with it enabled, WebSocket message processing hangs. It also seems to be getting worse over time. None of this happens when running locally.
13 Replies
Does a Durable Object ever lose its state?
In-memory state, yes. Anything you wrote to storage, however, should not be lost.
even when being redeployed?
yup
wtf
I just don't get why it's cheaper than KV if I can use it as a KV
Because a single KV namespace can handle millions of requests a second. A single DO can’t
I decided to add a deleteOldConnections method that runs every 5 minutes with the alarm feature, so if a Pool crashes the old connection records will be cleaned up.
Alternatively, you could check for stale connections whenever you receive a new connection? Though I would probably do what you did too.
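For anyone following along, here is a rough sketch of what the alarm-driven cleanup described above might look like. It is not the poster's actual code: PoolCleanupScheduler, ensureScheduled, onAlarm and the injected deleteOldConnections callback are made-up names, and the storage interface is a locally declared subset of the Durable Object storage API so the snippet stands on its own. One thing worth keeping in mind is that an alarm fires once, so the handler has to re-arm it to get a repeating 5 minute cycle.

```typescript
// Minimal subset of the Durable Object storage API used here, declared locally
// so the sketch type-checks by itself. In a real Durable Object this role
// would be played by the object's storage (which has getAlarm/setAlarm).
interface AlarmStorage {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

const CLEANUP_INTERVAL_MS = 5 * 60 * 1000; // "every 5 minutes"

// Hypothetical helper, not part of any Cloudflare API.
class PoolCleanupScheduler {
  constructor(
    private storage: AlarmStorage,
    // Stand-in for the deleteOldConnections method mentioned in the thread.
    private deleteOldConnections: () => Promise<void>,
    private now: () => number = Date.now,
  ) {}

  // Call on startup: schedule an alarm only if none is already pending.
  async ensureScheduled(): Promise<void> {
    if ((await this.storage.getAlarm()) === null) {
      await this.storage.setAlarm(this.now() + CLEANUP_INTERVAL_MS);
    }
  }

  // Call from the alarm() handler: run the cleanup, then re-arm the alarm
  // even if the cleanup threw, so the cycle keeps going.
  async onAlarm(): Promise<void> {
    try {
      await this.deleteOldConnections();
    } finally {
      await this.storage.setAlarm(this.now() + CLEANUP_INTERVAL_MS);
    }
  }
}
```

In a real Durable Object, ensureScheduled would presumably be called from the constructor or the first fetch, and onAlarm from the alarm() handler.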
When you push a WebSocket into the Hibernation API, is there anything actually pushed to storage? If so, would it be possible, once the runtime/isolate recovers, to fire a webSocketClose() handler?
Yes, good point. I am now actually running it from the constructor (async with waitUntil) and with the alarm. The alarm is to ensure that if the pool doesn't start back up, it will be cleaned up eventually.
Good thoughts. Yes, the hibernation API allows storing metadata for each websocket, and I use that to attach an ID to the ws, which is what is stored in my connections table in D1. The PoolId is the DO Pool's stringified ID and the wsId is a generated UUID, so I can find all connection records for a pool whose websockets no longer exist and delete them.
The webSocketClose handler should delete them from connections in regular circumstances. I just needed to make sure the database didn't fill up with orphaned connections when pools die unceremoniously (which shouldn't happen too often).
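To make the "find all connections for a pool that don't still exist" step concrete, here is a small sketch of the comparison. It assumes each row carries a poolId and a wsId as described above; ConnectionRecord and findStaleConnections are illustrative names, not anything from the poster's code or from Cloudflare's API.

```typescript
// Assumed shape of a row in the connections table (poolId + wsId).
interface ConnectionRecord {
  poolId: string;
  wsId: string;
}

// Returns the records belonging to this pool whose wsId does not match any
// live websocket. Records for other pools are never reported as stale.
function findStaleConnections(
  records: ConnectionRecord[],
  poolId: string,
  liveWsIds: string[],
): ConnectionRecord[] {
  const live = new Set(liveWsIds);
  return records.filter((r) => r.poolId === poolId && !live.has(r.wsId));
}
```

In the Durable Object itself, the liveWsIds list would presumably be built by reading the ID back from each socket's attachment (getWebSockets() plus deserializeAttachment()), and the returned rows would then be deleted from D1.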
Thank you for your ideas and help! ❤️
Yeah, I'm just not sure that acceptWebSocket actually stores anything into storage. Because it isn't async, I have a feeling that it just stores it in memory, which means if the entire isolate/runtime is evicted, then you lose that stored data too.
And if it doesn't store anything to storage, then it wouldn't be able to fire webSocketClose once it came back up.
Agreed. That is likely the issue we are seeing with stale connection records and the reason why we need to delete them in a process outside of webSocketClose when we detect they no longer correlate to live websocket connections.
I think it is reasonable to assume that these stale connections don't happen in the regular course of the lifecycle, but only on crashes/eviction, so we don't need to be too aggressive in cleanup. And we could probably just check if getWebSockets().length === 0 in the constructor to decide whether to clean up connections.
We store the websockets/serialized attachment outside the isolate, but it's still in memory.
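A minimal sketch of the getWebSockets().length === 0 check suggested above, for illustration only. PoolContext is a locally declared stand-in for the two methods of the Durable Object state that the thread mentions (getWebSockets and waitUntil), and deleteConnectionsForPool is a hypothetical callback that would remove every connections row for this pool. The reasoning is that if no sockets survived, every record for the pool must be stale.

```typescript
// Stand-in for the parts of the Durable Object state used in this sketch.
interface PoolContext {
  getWebSockets(): unknown[];
  waitUntil(promise: Promise<unknown>): void;
}

class Pool {
  constructor(
    ctx: PoolContext,
    // Hypothetical callback: deletes all connection records for this pool.
    deleteConnectionsForPool: () => Promise<void>,
  ) {
    // No live sockets on startup means any recorded connection is orphaned,
    // so kick off cleanup without blocking the constructor.
    if (ctx.getWebSockets().length === 0) {
      ctx.waitUntil(deleteConnectionsForPool());
    }
  }
}
```

If some sockets did survive (for example after hibernation rather than a crash), this check skips cleanup entirely, so the periodic alarm would still be needed to catch partially stale sets.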