Would it be possible to have the runtime store only the latest N WebSockets, and then GC the rest?
Milan, 13mo ago
This might be possible, though I don't know whether we can force JS to GC an object that still has references (I suspect not, and we shouldn't do that anyway).
For example: Req1 broadcasts to all currently connected WebSockets (which it got via getWebSockets()), then does an await and yields the event loop. Req2...ReqN create new WebSockets, and those get used a bit, so we evict every WebSocket created before Req2. When Req1's IO is done and it is scheduled to run again, it tries to use the array of WebSockets it got from getWebSockets(), and something weird would happen.
We would also have to consider different eviction policies (and whether that should be configurable by developers).
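The hazard described above can be reproduced with a small simulation. This is only a sketch: `FakeSocket` and `BoundedSocketRegistry` are hypothetical stand-ins rather than runtime types, and "eviction" is modeled as marking a socket unusable, not as real garbage collection.

```javascript
// Hypothetical stand-in for a WebSocket; not a real runtime type.
class FakeSocket {
  constructor(id) {
    this.id = id;
    this.evicted = false;
  }
  send(msg) {
    if (this.evicted) throw new Error(`socket ${this.id} was evicted`);
    return `${this.id}:${msg}`;
  }
}

// Hypothetical registry that keeps only the latest `limit` sockets.
class BoundedSocketRegistry {
  constructor(limit) {
    this.limit = limit;
    this.sockets = [];
  }
  accept(socket) {
    this.sockets.push(socket);
    while (this.sockets.length > this.limit) {
      // Evict the oldest socket, even if user code still holds it.
      this.sockets.shift().evicted = true;
    }
  }
  getWebSockets() {
    return [...this.sockets]; // snapshot, like the array Req1 holds
  }
}

const registry = new BoundedSocketRegistry(2);
registry.accept(new FakeSocket("a"));
registry.accept(new FakeSocket("b"));

// Req1 takes a snapshot, then yields the event loop (the await).
const req1Snapshot = registry.getWebSockets();

// Req2..ReqN accept new sockets while Req1 is waiting.
registry.accept(new FakeSocket("c"));
registry.accept(new FakeSocket("d"));

// Req1 resumes: every socket in its snapshot has been evicted.
const staleIds = req1Snapshot.filter((s) => s.evicted).map((s) => s.id);
```

In this model Req1 ends up holding only evicted sockets, so any `send` it attempts throws. A real runtime would have to decide what that "something weird" should be.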
Hello, I’m Allie! (OP), 13mo ago
Wait, so if I getWebSockets(), use the WebSocket once, then move out of scope, is the WebSocket Object then garbage collected, or is it stored somewhere else too?
Milan, 13mo ago
We have another reference to the JS WebSocket elsewhere
Hello, I’m Allie! (OP), 13mo ago
Can you track the active references within user code, then evict the WebSockets that are only referenced in the backend? I assume the backend can recover more easily from no longer having an active object reference.
Milan, 13mo ago
It's a good question, I was thinking the same. A lot of the C++ we wrote for hibernation does ref counting, specifically knowing whether we are holding the only remaining strong ref to some object. I'm not sure we have a way to know that for our JS types today; from a brief skim it looks like we don't.
That might be something we could add, but between modifying refcounting and having some type of eviction promise/loop, it would be a pretty significant project and would make the websocket/hibernation code (which is already complicated) hard to maintain. We probably need to refactor the code and finish outgoing hibernation before considering something like this. Tinkering with object lifetimes is one of the riskier things we do on the runtime.
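As a rough illustration of "evict only when the backend holds the last reference", here is a minimal sketch with explicit reference counts. `TrackedSocketStore`, `acquire`, `release`, and `evictUnreferenced` are all hypothetical names. Real user code has no explicit `release` step, so an actual implementation would need to hook into the engine's GC instead, which is the hard part mentioned above.

```javascript
// Hypothetical backend store that tracks how many user-held handles exist.
class TrackedSocketStore {
  constructor() {
    this.entries = new Map(); // id -> { socket, userRefs }
  }
  add(id, socket) {
    this.entries.set(id, { socket, userRefs: 0 });
  }
  // User code takes a handle; the store records the extra reference.
  acquire(id) {
    const entry = this.entries.get(id);
    if (!entry) throw new Error(`unknown socket ${id}`);
    entry.userRefs += 1;
    let released = false;
    return {
      socket: entry.socket,
      release: () => {
        if (released) return; // releasing twice must not double-count
        released = true;
        entry.userRefs -= 1;
      },
    };
  }
  // Evict only entries that the backend alone still references.
  evictUnreferenced() {
    const evicted = [];
    for (const [id, entry] of this.entries) {
      if (entry.userRefs === 0) {
        this.entries.delete(id);
        evicted.push(id);
      }
    }
    return evicted;
  }
}

const store = new TrackedSocketStore();
store.add("a", { name: "socket-a" });
store.add("b", { name: "socket-b" });

const handle = store.acquire("a"); // user code is still using "a"
const firstPass = store.evictUnreferenced(); // only "b" is evictable
handle.release();
const secondPass = store.evictUnreferenced(); // now "a" can go too
```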
Hello, I’m Allie! (OP), 13mo ago
Inversely, what about a function on state that signals to the runtime that it is ok for the DO to be restarted? It would at least solve the growing number of objects problem, and might be useful for user code to be able to "clean up" if it generates a lot of in-memory state
Milan, 13mo ago
"Or even delete them when they are added, and only recreate them for the time they need to be processed?"
Typically folks will want to do some stuff with the object immediately after accepting it, so it's not clear at what point it would make sense to do this. I suspect that, if anything, what you suggested before (evicting on the backend if there are no user references) makes the most sense.
Hello, I’m Allie! (OP), 13mo ago
Yeah sorry, I was kind of thinking about it the wrong way. There, I meant deleting them after they are added (once they no longer have references in user space).
The restart signal would also allow update-related restarts to occur at the time most useful for the program. It wouldn't entirely prevent forced evictions, but it could help some, no?
Milan, 13mo ago
"signals to the runtime that it is ok for the DO to be restarted"
As in, signals that after this current request is finished you can immediately evict the instance for hibernation?
Hello, I’m Allie! (OP), 13mo ago
That, or an "I've reached a state that I don't think I can recover from, so evict the DO now" signal. Maybe a toggle inside, as a boolean, or something.
Actually, never mind. Eviction at the end of current requests would probably make more sense, also so that another request in progress doesn't get cut off. Something like:
// state.evict stops further events from entering the queue, but
// does not prevent current events from being processed.
this.state.evict(() => {
  // This function is run after all events have completed, but before the DO is evicted.
  // Could be used for final cleanup tasks
});
return new Response(null);
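To make the proposed semantics concrete, here is a small model of them. Note that `state.evict` is only a suggestion in this thread and not an existing API; `EvictableState`, `beginEvent`, and `endEvent` are hypothetical names invented for this sketch.

```javascript
// Hypothetical model of the proposed state.evict(); no such API exists today.
class EvictableState {
  constructor() {
    this.inFlight = 0;
    this.draining = false;
    this.evicted = false;
    this.cleanup = null;
  }
  // Returns false if the event was refused because eviction was requested.
  beginEvent() {
    if (this.draining) return false;
    this.inFlight += 1;
    return true;
  }
  endEvent() {
    this.inFlight -= 1;
    this.maybeFinish();
  }
  // Stop admitting new events; run `cleanup` once in-flight events finish.
  evict(cleanup) {
    this.draining = true;
    this.cleanup = cleanup;
    this.maybeFinish();
  }
  maybeFinish() {
    if (this.draining && this.inFlight === 0 && !this.evicted) {
      this.evicted = true;
      if (this.cleanup) this.cleanup();
    }
  }
}

const log = [];
const state = new EvictableState();
state.beginEvent(); // request A starts
state.beginEvent(); // request B starts
state.evict(() => log.push("cleanup")); // A asks for eviction
const accepted = state.beginEvent(); // request C is refused
state.endEvent(); // A finishes; B is still running
log.push("A done");
state.endEvent(); // B finishes; cleanup runs now
```

In this model the cleanup callback runs only after both in-flight requests have finished, and the request that arrives after `evict` is turned away.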
Milan, 13mo ago
That's come up before, but my understanding is that there are enough pitfalls that we have decided to punt on it indefinitely.
Hello, I’m Allie! (OP), 13mo ago
Just curious, are you able to speak about what issues might occur?
Milan, 13mo ago
Mostly just issues around deterministic behavior. It would be misleading to suggest that this runs when your DO is evicted, because eviction could mean anything from "you breached a runtime limit" to "the machine no longer exists". It would really only be useful in the context of a shutdown due to inactivity.
Not saying this isn't useful (I like destructors!), but it's fairly limited relative to what people would expect, and even in the clean shutdown case we have to consider things like:
- how long it can run for
- whether it can do IO
- whether we should cancel shutdowns if new requests come in (does that mean your shutdown procedure has to be a transaction?)
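Two of these questions (a run-time budget, and cancelling when new requests arrive) can be illustrated with a toy driver. This is a sketch under assumptions, not a description of the runtime: `runShutdownHook`, `maxSteps`, and `hasPendingRequest` are hypothetical, and the hook is written as a generator purely so the driver can stop it between steps.

```javascript
// Hypothetical driver for a shutdown hook; nothing like this exists in the runtime.
// The hook is a generator so the driver can stop it between steps.
function runShutdownHook(hook, { maxSteps, hasPendingRequest }) {
  const steps = hook();
  let taken = 0;
  while (true) {
    if (hasPendingRequest()) return "cancelled"; // new work arrived
    if (taken >= maxSteps) return "timed-out"; // budget exhausted
    const { done } = steps.next();
    if (done) return "completed";
    taken += 1;
  }
}

const written = [];
function* flushTwoRecords() {
  written.push("record-1");
  yield;
  written.push("record-2");
  yield;
}

// Case 1: nothing interrupts the hook, so both records are written.
const clean = runShutdownHook(flushTwoRecords, {
  maxSteps: 10,
  hasPendingRequest: () => false,
});
const writtenWhenClean = [...written];

// Case 2: a request arrives after the first step and cancels the shutdown.
written.length = 0;
let polls = 0;
const interrupted = runShutdownHook(flushTwoRecords, {
  maxSteps: 10,
  hasPendingRequest: () => ++polls > 1,
});
// `written` now holds only "record-1": the hook stopped halfway.
```

The second case leaves the cleanup half done, which is the reason a cancellable shutdown procedure would need transaction-like semantics.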
Hello, I’m Allie! (OP), 13mo ago
That all makes sense
Milan, 13mo ago
To summarize, I'll open a ticket internally for evicting inactive hibernatable WebSockets and write down some thoughts so it doesn't get lost here. If you're interested, feel free to open a discussion on the workerd repo too.
Hello, I’m Allie! (OP), 13mo ago
GitHub
Hibernation API Memory Management · cloudflare workerd · Discussion...
It was pointed out to me today by @MellowYarker that using the Hibernatable WebSocket API does not preclude running out of memory due to WebSocket objects not being automatically deleted when no lo...
