since 2 days ago I'm getting a ton of "internal error" and "Network connection lost." errors when calling my durable objects. they don't have particularly high load and it happens to many different instances, not just one
there's no useful error to help me fix it unfortunately
Unknown User•9mo ago
Message Not Public
✅ Done!
another user reporting a similar issue with D1: https://discord.com/channels/595317990191398933/992060581832032316/1230510801522131035
@Frederik Just sent you all the details. For us, it's mostly the "internal error", not the "Network connection lost" thing.
sure, 1s
@Frederik Small update here: we investigated a bit more today, and still have no good sense of where those "internal error" errors are coming from, but we have noticed that none of them carry the `remote: true` field, meaning it's unlikely that they're actually coming from transactional storage APIs, contrary to the initial hunch we had before. If they were, they would have been thrown by the Durable Object, in which case we would see these errors having a `remote: true` property when caught in the worker. So it's more likely that they have a different origin, and maybe the focus on transactional storage was a red herring.
New update from today. Yesterday we tried to wrap the `.fetch()` calls we're making from our Worker to the Durable Object in a retry loop in an attempt to mitigate these intermittent failures. See the code snippet we're using.
However, this isn't having any effect. We ran into another few of these "internal error" failures today, but retrying them always fails in the identical way.
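For anyone wanting to log the `remote: true` distinction described above, here is a minimal sketch. The helper name `describeDoError` and the returned shape are made up for illustration; the only thing taken from this thread is that errors thrown inside the Durable Object show up with a `remote: true` property when caught in the Worker.

```typescript
// Hypothetical helper (not part of any Cloudflare API): summarizes an error
// caught around a Durable Object call so it can be logged.
interface DoErrorInfo {
  message: string;
  // true only if the caught error carries `remote: true`, i.e. it was
  // thrown by the Durable Object itself rather than somewhere in between.
  remote: boolean;
}

function describeDoError(err: unknown): DoErrorInfo {
  // Errors can be anything (Error, string, null), so read fields defensively.
  const e = (err ?? {}) as { message?: unknown; remote?: unknown };
  return {
    message: typeof e.message === "string" ? e.message : String(err),
    remote: e.remote === true,
  };
}
```

Calling this in the `catch` block around the stub call and logging the result should make it easier to see whether a given "internal error" came from the object's own code or from elsewhere.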
Questions for @Frederik:
1. Should we also be obtaining a new stub instance if we want to do retries like this?
2. Should we add a delay before attempting these retries? Any guidance in this area would help.
3. Do you have an idea already about where these are coming from by any chance?
Unknown User•9mo ago
Message Not Public
Thanks, @Frederik — that's good information, will try that next.
I changed our code to create a new stub for the retries, but it didn't seem to help; we're still getting a lot of "internal error"
@Frederik We just rolled out the change that grabs a new stub on every retry and that seems to have done the trick indeed, thanks! 🙏 This morning, we're still getting these "internal errors" but now whenever we get them, the retry will work (typically on the first retry), which makes this issue a lot less urgent for us. We'd still like to not get those "internal errors" of course, but at least it won't be causing issues for us anymore.
I spoke a little too soon. Although in most cases retry works, we're still encountering some internal errors where retrying does not have an effect. So our issues aren't completely resolved yet.
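A rough sketch of the "new stub on every retry" approach discussed above, for reference. This is not the snippet used in this thread (that one isn't shown here). The names `fetchWithFreshStub`, `StubLike` and `NamespaceLike` are hypothetical stand-ins for the real Workers types, and the backoff delay is an assumption, since the thread only raises it as a question.

```typescript
// Minimal stand-ins for the Workers runtime types (assumption: the real
// namespace and stub objects have more members than shown here).
interface StubLike<Res> {
  fetch(input: string): Promise<Res>;
}
interface NamespaceLike<Id, Res> {
  get(id: Id): StubLike<Res>;
}

async function fetchWithFreshStub<Id, Res>(
  ns: NamespaceLike<Id, Res>,
  id: Id,
  url: string,
  maxAttempts = 3,
  baseDelayMs = 100, // assumed value, tune for your workload
): Promise<Res> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      // Obtain a new stub on each attempt instead of reusing the one
      // that just failed.
      const stub = ns.get(id);
      return await stub.fetch(url);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Exponential backoff between attempts: base, 2x base, 4x base, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

In a Worker, `ns` would be the Durable Object namespace binding and `id` something like the result of `idFromName(...)`. As reported above, this helps in most cases but not all, so the final `throw` still needs handling by the caller.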
Unknown User•9mo ago
Message Not Public
@Frederik Super appreciated! 🙏
we're also still getting errors about once every 5 min, even with retries with new stubs
hey guys, could you share an update with us?
it's still happening unfortunately
@Frederik have you found a solution yet?