since 2 days ago I'm getting a ton of

For the past 2 days I've been getting a ton of "internal error" and "Network connection lost." errors when calling my Durable Objects. They don't have particularly high load, and it happens to many different instances, not just one. Unfortunately, there's no useful error to help me fix it.
14 Replies
Vincent, 8mo ago
✅ Done!
Vincent, 8mo ago
@Frederik Just sent you all the details. For us, it's mostly the internal error, not the Network connection lost thing.
chronark (OP), 8mo ago
sure, 1s
Vincent, 8mo ago
@Frederik Small update here: we investigated a bit more today and still have no good sense of where those "internal error" errors are coming from, but we have noticed that none of them carry the remote: true field, meaning it's unlikely that they're actually coming from the transactional storage APIs, contrary to our initial hunch. If they were, they would have been thrown by the Durable Object, and we would see a remote: true property on them when caught in the Worker. So it's more likely that they have a different origin, and maybe the focus on transactional storage was a red herring.
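The check described above can be sketched roughly as follows. This is only an illustration: it assumes the error caught in the Worker is an object that may or may not carry a boolean remote property, as stated in the message. The names DurableObjectErrorLike, classifyOrigin, and describeError are hypothetical helpers, not part of any Cloudflare API.

```typescript
// Minimal shape of what we look at on a caught error (hypothetical type name).
interface DurableObjectErrorLike {
  message?: string;
  // Per the thread: present and true when the exception was thrown
  // by code running inside the Durable Object.
  remote?: boolean;
}

type ErrorOrigin = "thrown-in-object" | "outside-object";

// Classify a caught error by whether it carries remote: true.
function classifyOrigin(err: unknown): ErrorOrigin {
  const e = (err ?? {}) as DurableObjectErrorLike;
  return e.remote === true ? "thrown-in-object" : "outside-object";
}

// Build a compact log line so the two kinds of failure can be counted separately.
function describeError(err: unknown): string {
  const e = (err ?? {}) as DurableObjectErrorLike;
  return JSON.stringify({
    message: e.message ?? String(err),
    origin: classifyOrigin(err),
  });
}
```

In a Worker this would typically be called from the catch block around the stub call, e.g. console.log(describeError(err)), so that errors without remote: true stand out in the logs.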
Vincent, 7mo ago
New update from today. Yesterday we tried wrapping the .fetch() calls we're making from our Worker to the Durable Object in a retry loop, in an attempt to mitigate these intermittent failures. See the code snippet we're using. However, this isn't having any effect. We ran into a few more of these "internal error" failures today, and retrying them always fails in the identical way. Questions for @Frederik:
1. Should we also be obtaining a new stub instance if we want to do retries like this?
2. Should we add a delay before attempting these retries? Any guidance in this area would help.
3. Do you have an idea already about where these are coming from, by any chance?
[image attachment: code snippet, not included in this capture]
Vincent, 7mo ago
Thanks, @Frederik, that's good information. We'll try that next.
chronark (OP), 7mo ago
I changed our code to create a new stub for each retry, but it didn't seem to help. We're still getting a lot of "internal error".
Vincent, 7mo ago
@Frederik We just rolled out the change that grabs a new stub on every retry, and that seems to have done the trick indeed. Thanks! 🙏 This morning we're still getting these "internal errors", but now whenever we get them, the retry works (typically on the first retry), which makes this issue a lot less urgent for us. We'd still like to not get those "internal errors" of course, but at least they won't be causing issues for us anymore.

I spoke a little too soon. Although in most cases the retry works, we're still encountering some internal errors where retrying has no effect. So our issues aren't completely resolved yet.
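For readers following along, the "new stub on every retry" approach might look roughly like the sketch below. It is not the snippet used in this thread (that one was not captured); callWithFreshStub, RetryOptions, and the backoff values are hypothetical, and the delay between attempts is an assumption prompted by question 2 above rather than confirmed guidance. As reported in this thread, it reduces failures but does not eliminate them.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the first retry; doubled on each later retry
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `call` with a stub obtained from `getStub`, obtaining a brand-new stub
// for every attempt instead of reusing the one that just failed.
async function callWithFreshStub<S, T>(
  getStub: () => S,
  call: (stub: S) => Promise<T>,
  { maxAttempts, baseDelayMs }: RetryOptions,
): Promise<T> {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const stub = getStub(); // fresh stub on every attempt
    try {
      return await call(stub);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Exponential backoff with random jitter (an assumption, not advice from the thread).
        const delay = baseDelayMs * 2 ** attempt;
        await sleep(delay + Math.random() * delay);
      }
    }
  }
  throw lastError; // all attempts failed; surface the last error
}

// Hypothetical usage inside a Worker, where env.MY_DO is an assumed binding name:
//   const res = await callWithFreshStub(
//     () => env.MY_DO.get(id),
//     (stub) => stub.fetch(request),
//     { maxAttempts: 3, baseDelayMs: 100 },
//   );
```

Taking getStub as a function, rather than a stub value, is what guarantees that a failed stub is never reused across attempts.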
Vincent, 7mo ago
@Frederik Super appreciated! 🙏
chronark (OP), 7mo ago
We're also still getting errors about once every 5 min, even with retries with new stubs.

Hey guys, could you share an update with us? It's still happening, unfortunately.

@Frederik have you found a solution yet?