WebGL + Canvas randomly going black [Helldive Difficulty]
TLDR: Can WebGL resources automatically be disposed of by the browser/system without warning?
I'm in a bit of a pickle. I have a Svelte project that acts as an image viewer, and it uses an external library to render the images to a `<canvas />` with WebGL. The user can pan, zoom, draw annotations, change brightness and contrast, etc. The problem is that a small number of users are reporting an issue where the "viewer" randomly goes black and nothing they do will make the images appear again besides a hard refresh of the page. This isn't ideal because they lose the state of the images they had open, plus any effects they've applied to them like changing the brightness/contrast.
The big problem is that this is happening silently. There are no errors (that I know of) being raised when it happens. The users are in clinical offices, so my hunch is that it's related to them leaving the website open for extended periods of time without touching the machine. I suspect that their computers are going to sleep, or that the browser/system is cleaning up resources automatically because they aren't being used. I need help from anyone who is experienced with WebGL because I can't find any information on whether this is a thing that can happen. The reason I'm leaning towards this is that the WebGL stuff happens in "a different world" than the main thread, so it's the only thing that could disappear on its own. Either way, what the users are reporting is that it just goes black, silently, with no errors. On my end, I have no code that does anything on its own. Nothing touches the images or the data array for the canvas except in response to user interactions. From what the users are reporting, there is no user interaction involved: they just come back to the viewer being black.
Some of the images that are rendered aren't simple JPEGs or PNGs; they are DICOM files. These DICOM files can be very large, from a few megabytes to hundreds. They can contain "slices" or "frames", so a single file can have 100+ images inside it. The way it works in the viewer is that the user can ctrl + mouse wheel to scroll through the slices/frames. Each one gets cached so that they can scroll back to previously viewed frames and have them load instantly. Most of this functionality is handled within the external library. From digging in the source code and reading the (very incomplete) docs for the external library, it automatically removes items from the cache when the memory usage hits a certain limit.
I cannot reproduce this behavior locally. No kind of client interaction, like opening and closing images over and over or using different image manipulation tools, causes it to happen. Add to this the fact that there are no errors, and the only thing I have to work from is assumptions.
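One way to try reproducing a black viewer locally, as a rough sketch: WebGL does define a "context lost" state, and the standard `WEBGL_lose_context` extension lets you trigger it on purpose. The helper name `simulateContextLoss` and the `GLLike` interface below are hypothetical, and this assumes you can get hold of the WebGL context the library actually renders with, which may not be possible here.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
// A real WebGLRenderingContext satisfies GLLike.
interface LoseContextExtension {
  loseContext(): void;
  restoreContext(): void;
}

interface GLLike {
  getExtension(name: "WEBGL_lose_context"): LoseContextExtension | null;
}

/**
 * Forces the given WebGL context into the "lost" state.
 * Returns a function that asks the browser to restore the context,
 * or null if the extension is not available.
 */
function simulateContextLoss(gl: GLLike): (() => void) | null {
  const extension = gl.getExtension("WEBGL_lose_context");
  if (!extension) {
    return null;
  }
  extension.loseContext();
  return () => extension.restoreContext();
}
```

If the viewer goes black after calling this and stays black after the returned restore function runs, that would at least be consistent with the context-loss theory, though it would not prove that this is what the users are hitting.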
My bandaid solution for now is an inactivity tracker: if the window receives no events for ~30 seconds, you are considered "inactive". It uses Huntabyte's runed library https://runed.dev/docs/utilities/is-idle. When the user is considered idle/inactive, it triggers a Svelte `$effect()` that iterates over all open "viewports", gets the canvas context, then iterates over the imageData array and checks if every pixel is `0,0,0,255` (all black). If every pixel in the canvas is black, it force re-mounts the Svelte component for the viewer. This ensures that the whole loading process starts fresh. I have all this logic in a `setInterval()` that repeats every 5 seconds, so while the user is idle it monitors the open viewports to check if any of them randomly go black. I also have a condition so that if the user becomes active again, it does the check one last time. This is because the user can go inactive, then become active again before the first `setInterval()` call runs.
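The pixel check described above could look roughly like this. It is a minimal sketch, not my exact code: `isAllBlack`, `canvasIsBlack`, and the `CanvasLike` interface are names made up for illustration, and the interfaces are only there so the example does not depend on DOM typings (a real `HTMLCanvasElement` with a 2D context fits them).

```typescript
// Minimal structural types; a DOM <canvas> with a 2D context satisfies these.
interface ImageDataLike {
  data: Uint8ClampedArray;
}

interface Context2DLike {
  getImageData(sx: number, sy: number, sw: number, sh: number): ImageDataLike;
}

interface CanvasLike {
  width: number;
  height: number;
  getContext(contextId: "2d"): Context2DLike | null;
}

/** True when every pixel in an RGBA buffer is opaque black (0, 0, 0, 255). */
function isAllBlack(rgba: Uint8ClampedArray): boolean {
  // An empty or malformed buffer is treated as "not black" so it never triggers a remount.
  if (rgba.length === 0 || rgba.length % 4 !== 0) {
    return false;
  }
  for (let i = 0; i < rgba.length; i += 4) {
    const isBlackPixel =
      rgba[i] === 0 && rgba[i + 1] === 0 && rgba[i + 2] === 0 && rgba[i + 3] === 255;
    if (!isBlackPixel) {
      return false; // Bail out at the first non-black pixel.
    }
  }
  return true;
}

/** Reads the whole canvas through its 2D context and checks whether it is all black. */
function canvasIsBlack(canvas: CanvasLike): boolean {
  const context = canvas.getContext("2d");
  if (!context || canvas.width === 0 || canvas.height === 0) {
    return false;
  }
  const imageData = context.getImageData(0, 0, canvas.width, canvas.height);
  return isAllBlack(imageData.data);
}
```

Returning early on the first non-black pixel keeps the check cheap for normal images; only a canvas that really is black pays for a full scan.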
I'm not sure if my bandaid solution will end up working. We will have to see next week whether that small group of users is still experiencing the problem. Even if it does fix it, it doesn't sit easy with me that I don't know what is causing this and that I'm not actually fixing it. There is an edge case where, if the image they are viewing happens to be an all-black image, they could get themselves into an infinite loop 😅. Although it's a very slow infinite loop, since it will only remount the component every 30 seconds, and it would only happen as long as they have that image open lol. It's highly unlikely that any of our users would have uploaded images that are just black and nothing else, but it's still an edge case that I want to handle. I'm not exactly sure how to handle it though, because I'm not sure if the data array for the image (that I can access through the external library) is randomly disappearing or if the problem is somewhere deeper. The external library uses VTK JS in the background, and any of those things would be very hard to get access to because the library abstracts it away so heavily. So my only choice seemed to be looking at the canvas element and going through every pixel in the imageData, because that's all I really have access to.
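One possible way to defuse the all-black-image loop, sketched under assumptions: cap how many automatic remounts a given image can trigger, so a genuinely black image stops being remounted after a couple of tries. `RemountGuard`, `maxAttempts`, and the string image id are all hypothetical names, not part of the external library.

```typescript
/** Caps the number of automatic remounts allowed per image. */
class RemountGuard {
  private readonly attempts = new Map<string, number>();
  private readonly maxAttempts: number;

  constructor(maxAttempts: number = 2) {
    this.maxAttempts = maxAttempts;
  }

  /** Returns true (and records the attempt) if another remount is still allowed. */
  shouldRemount(imageId: string): boolean {
    const used = this.attempts.get(imageId) ?? 0;
    if (used >= this.maxAttempts) {
      // Still black after several fresh loads: probably a genuinely black image.
      return false;
    }
    this.attempts.set(imageId, used + 1);
    return true;
  }

  /** Clears the counter, e.g. once the viewport has rendered non-black pixels again. */
  reset(imageId: string): void {
    this.attempts.delete(imageId);
  }
}
```

The idle check would call `shouldRemount(id)` before forcing a remount and `reset(id)` whenever a viewport is seen with real pixels. This does not tell a lost image apart from a truly black one; it only bounds the damage when the two cannot be distinguished.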
This is a pretty complicated scenario, so any WebGL expert's opinion on this would be greatly appreciated. I really just want to know if my hunch is correct: could going idle to the point of your PC going to sleep, or having the tab open but not active for extended periods of time, result in WebGL resources automatically being cleaned up without any warning?
This is getting really long-winded so I'm not sure if anyone will read all of this, but I have 1 more important point. The way the external library seems to handle rendering is that it has the concept of a `RenderingEngine`. You can have many images per rendering engine, and it seems to create a large atlas map of each "viewport" with separate cameras in WebGL land. Somehow this gets projected to the `<canvas />` on the DOM. However, the `<canvas />` on the DOM does not itself have a WebGL context. I've tried console logging every type of canvas context you can get, and the one that's on the DOM only has the regular `CanvasRenderingContext2D`. So somehow the library makes that connection on its own. This means I can't listen for the `webglcontextlost` event, because the `<canvas />` I have access to never gets one. I could be completely wrong about this but it's just what I'm experiencing so idk. Hopefully that adds some context to what could be going on here for anyone with a lot of experience with this stuff.
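For reference, this is roughly what listening for context loss would look like if the library's hidden WebGL canvas could be reached. That is a big assumption, since I haven't found a way to get at it. `watchContextLoss` and the interfaces are hypothetical names; `webglcontextlost` and `webglcontextrestored` are the standard events fired on the canvas that owns the WebGL context.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface PreventableEvent {
  preventDefault(): void;
}

type ContextEventListener = (event: PreventableEvent) => void;

interface ContextEventTarget {
  addEventListener(type: string, listener: ContextEventListener): void;
  removeEventListener(type: string, listener: ContextEventListener): void;
}

/**
 * Subscribes to WebGL context loss and restoration on the canvas that owns the context.
 * Returns a cleanup function that removes both listeners.
 */
function watchContextLoss(
  glCanvas: ContextEventTarget,
  onLost: () => void,
  onRestored: () => void,
): () => void {
  const handleLost: ContextEventListener = (event) => {
    // Without preventDefault() the browser will not try to restore the context.
    event.preventDefault();
    onLost();
  };
  const handleRestored: ContextEventListener = () => {
    // GPU resources (textures, buffers, shaders) are gone and must be recreated here.
    onRestored();
  };

  glCanvas.addEventListener("webglcontextlost", handleLost);
  glCanvas.addEventListener("webglcontextrestored", handleRestored);

  return () => {
    glCanvas.removeEventListener("webglcontextlost", handleLost);
    glCanvas.removeEventListener("webglcontextrestored", handleRestored);
  };
}
```

In my case `onLost` could simply log the event and `onRestored` could force the same remount the pixel check does, which would at least confirm or rule out context loss as the cause instead of guessing from black pixels.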