Kord2mo ago
DarkAtra

Kord does not recover from UnknownHostException

It seems like Kord doesn't recover from UnknownHostException, at least not fully: the Discord bot appears offline and no longer reacts to application commands until it is restarted. I ran into this issue twice in the past few weeks. The bot uses Google DNS to resolve hostnames. The stack trace is attached as a screenshot because of the Discord message character limit. Also, here's the project in case you want to look something up: https://github.com/DarkAtra/v-rising-discord-bot
20 Replies
SchlaubiBus
SchlaubiBus2mo ago
Would you like it to retry after an UnknownHostException? I think it's intended that it doesn't recover from name resolution issues, as they indicate a persistent problem.
DarkAtra
DarkAtraOP4w ago
Yes, I would expect Kord to attempt to recover in such a case. In my experience, most UnknownHostExceptions are temporary and usually recoverable by retrying with backoff. Is there any way for me to force gateway reconnects when an UnknownHostException occurs for the time being? @SchlaubiBus any idea?
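For readers following along, here is a minimal sketch of what "retrying with backoff" could look like in plain Java. It is not Kord's API and says nothing about how Kord's gateway is implemented. `BackoffRetry`, `retryOnUnknownHost` and `delayMillis` are hypothetical names, and the connect step is represented by an arbitrary `Callable`.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Kord or OkHttp.
final class BackoffRetry {

    /** Delay before the retry that follows 0-based `attempt`: base * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long factor = 1L << Math.min(attempt, 30); // clamp the shift so the factor cannot overflow
        return Math.min(maxMillis, baseMillis * factor);
    }

    /**
     * Runs `action` up to maxAttempts times. Only UnknownHostException is treated
     * as transient; any other exception propagates immediately.
     */
    static <T> T retryOnUnknownHost(Callable<T> action, int maxAttempts,
                                    long baseMillis, long maxMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (UnknownHostException e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last; // every attempt failed with a DNS error
    }
}
```

A real reconnect loop would probably retry indefinitely and add jitter to the delay. The bounded version here just keeps the idea easy to test.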
SchlaubiBus
SchlaubiBus4w ago
I don't think there is a way
DarkAtra
DarkAtraOP3w ago
The same issue just happened again today at around 1am. My bot was unable to resolve the hostname gateway-us-east1-b.discord.gg for about 2 minutes (using Google DNS). I thought about switching the DNS provider that OkHttp uses from the system resolver (Google DNS) to something else. However, this could still fail since there is no guarantee that DNS lookups always succeed. I think the only way of dealing with this issue permanently is to make Kord more resilient and have it attempt reconnects with backoff. Right now it's getting stuck in a weird state where it can query and update messages in a Discord server but doesn't react to application commands anymore.
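The DNS-provider idea can be sketched as a chain of resolvers tried in order. This is plain Java with hypothetical names (`FallbackDns`, `Resolver`, `firstSuccessful`). The `lookup` method deliberately has the same shape as OkHttp's `Dns.lookup(hostname)`, but whether Kord lets you supply a custom OkHttp client or DNS hook is an assumption this thread does not confirm.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

// Hypothetical helper, not part of Kord or OkHttp.
final class FallbackDns {

    /** Minimal lookup abstraction: hostname in, addresses out. */
    interface Resolver {
        List<InetAddress> lookup(String hostname) throws UnknownHostException;
    }

    /** Resolver backed by the JVM's default (system) name resolution. */
    static Resolver system() {
        return hostname -> List.of(InetAddress.getAllByName(hostname));
    }

    /**
     * Tries each resolver in order and returns the first non-empty answer.
     * If all of them fail, throws one UnknownHostException that carries the
     * individual failures as suppressed exceptions.
     */
    static Resolver firstSuccessful(List<Resolver> resolvers) {
        return hostname -> {
            UnknownHostException failure = new UnknownHostException(hostname);
            for (Resolver resolver : resolvers) {
                try {
                    List<InetAddress> addresses = resolver.lookup(hostname);
                    if (!addresses.isEmpty()) {
                        return addresses;
                    }
                } catch (UnknownHostException e) {
                    failure.addSuppressed(e);
                }
            }
            throw failure;
        };
    }
}
```

As the message above notes, a fallback only narrows the window in which lookups fail. If every resolver in the chain is unreachable, the caller still gets an UnknownHostException and still needs a retry strategy.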
SchlaubiBus
SchlaubiBus3w ago
When does this happen? It wouldn't make sense for it to happen during a connection.
DarkAtra
DarkAtraOP3w ago
The cause likely is:
2025-04-09T01:13:50.030Z ERROR 1 --- [atcher-worker-3] dev.kord.gateway.DefaultGateway :
java.net.SocketException: Socket closed
    at java.base/sun.nio.ch.NioSocketImpl.endRead(NioSocketImpl.java:243) ~[na:na]
    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:323) ~[na:na]
    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346) ~[na:na]
    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796) ~[na:na]
    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:1099) ~[na:na]
    at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489) ~[na:na]
    at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:483) ~[na:na]
    at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70) ~[na:na]
    at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1461) ~[na:na]
    at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1066) ~[na:na]
    at okio.InputStreamSource.read(JvmOkio.kt:93) ~[na:na]
    at okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:153) ~[na:na]
SchlaubiBus
SchlaubiBus3w ago
I mean, that's just internal code of OkHttp
DarkAtra
DarkAtraOP3w ago
discord message length fucked up the stack trace - i've updated it.
SchlaubiBus
SchlaubiBus3w ago
it's still just okhttp
DarkAtra
DarkAtraOP3w ago
this is the disconnect from the gateway and then kord attempts to reconnect right after - this is where the hostname exception occurs
SchlaubiBus
SchlaubiBus3w ago
I don't see that in that error
DarkAtra
DarkAtraOP3w ago
correct me if i'm wrong but this is how i understand the gateway connection: There is a websocket connection that kord uses to receive events from discord, for example for commands issued by users on a server. When this websocket connection is closed by the peer for whatever reason, kord attempts to reconnect to the gateway. Now if the hostname of the gateway cannot be resolved, the attempt is aborted and kord never reconnects. i can't post the full stack trace as i don't have nitro. the logs read:
2025-04-09T01:13:50.030Z ERROR 1 --- [atcher-worker-3] dev.kord.gateway.DefaultGateway :
java.net.SocketException: Socket closed
...
...
2025-04-09T01:13:58.060Z ERROR 1 --- [atcher-worker-3] dev.kord.gateway.DefaultGateway :
java.net.UnknownHostException: gateway-us-east1-b.discord.gg: Temporary failure in name resolution
...
...
SchlaubiBus
SchlaubiBus3w ago
You can upload them as a file?
SchlaubiBus
SchlaubiBus3w ago
This is the reconnect code
DarkAtra
DarkAtraOP3w ago
DarkAtra
DarkAtraOP3w ago
the errors are 8 minutes apart tho.. so yeah not sure if they are related
SchlaubiBus
SchlaubiBus3w ago
There should be more logs about retrying
DarkAtra
DarkAtraOP3w ago
nope, nothing else besides the two errors i just sent you. The UnknownHostException reoccurs like 20 times or so tho
SchlaubiBus
SchlaubiBus3w ago
Then you need to increase the log level
DarkAtra
DarkAtraOP3w ago
ok, i'll set it to trace for kord and report back when it happens again
