Kord doesn't reconnect

I've pretty much had issues with my bot reconnecting since day 1. Sometimes it just spams RetryLimitReachedEvent in my log. I don't know whether this is just an issue with my logging, but the bot actually doesn't work until a full restart.
141 Replies
SchlaubiBus
SchlaubiBusOP3y ago
Also, while connecting 14 shards, it pretty much always happens that by the time the last shard has connected, other shards have been disconnected again
gdude
gdude3y ago
I haven't noticed this with an unsharded bot so it might be related to that
LustigerLurch
LustigerLurch3y ago
Hm, can you call the GatewayBotGet endpoint to see the max_concurrency for your bot? Maybe it's because we don't properly handle it.
SchlaubiBus
SchlaubiBusOP3y ago
It's hard to experience that issue with an unsharded bot as it only connects one shard
{"url": "wss://gateway.discord.gg", "shards": 15, "session_start_limit": {"total": 1000, "remaining": 950, "reset_after": 63572627, "max_concurrency": 1}}
I think we do connect shards in parallel
LustigerLurch
LustigerLurch3y ago
and you are only allowed to connect one concurrently
LustigerLurch
LustigerLurch3y ago
so it's this issue again, you already opened https://github.com/kordlib/kord/issues/625
GitHub
Properly implement login rate limiting · Issue #625 · kordlib/kord
Currently we connect all shards at once and rate limit the identify command, which causes session resets and timeouts (see #624) A better solution would be to implement rate limiting for logins
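To put rough numbers on the problem: Discord's gateway docs describe `max_concurrency` as the number of Identify requests allowed per 5 seconds. The sketch below estimates the minimum duration of a correctly rate-limited startup. `identifyBatches`, `minStartupSeconds` and `IDENTIFY_WINDOW_SECONDS` are illustrative names, not Kord API.

```kotlin
// Hypothetical helpers, not part of Kord.
const val IDENTIFY_WINDOW_SECONDS = 5

// Number of identify batches: ceil(totalShards / maxConcurrency) in integer arithmetic.
fun identifyBatches(totalShards: Int, maxConcurrency: Int): Int {
    require(totalShards > 0 && maxConcurrency > 0)
    return (totalShards + maxConcurrency - 1) / maxConcurrency
}

// The first batch may identify immediately; every later batch waits one window.
fun minStartupSeconds(totalShards: Int, maxConcurrency: Int): Int =
    (identifyBatches(totalShards, maxConcurrency) - 1) * IDENTIFY_WINDOW_SECONDS

fun main() {
    println(identifyBatches(15, 1))    // 15
    println(minStartupSeconds(15, 1))  // 70
    println(minStartupSeconds(32, 16)) // 5
}
```

For the bot in this thread (15 shards, `max_concurrency` 1) that means at least about 70 seconds of strictly sequential logins, which connecting all shards in parallel cannot respect.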
SchlaubiBus
SchlaubiBusOP3y ago
I guess that's what that means, yes
yes, Kord seems to eventually give up reconnecting
I have an "all shards ready" event. It consumes all other ready events and fires once all shards on the current instance are ready
It has logging, and that logging tells me "waiting for [<all shards>]" while Kord doesn't log anything
So it could be that which is broken, but that shouldn't affect commands. That event doesn't change any state
2022-09-19T20:28:01.550055599Z 2022-09-19 20:28:01.546 [DefaultDispatcher-worker-8] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 4 RetryLimitReachedEvent, Awaiting login from: [4, 12]
@LustigerLurch this also happens
LustigerLurch
LustigerLurch3y ago
does this log come from kord?
SchlaubiBus
SchlaubiBusOP3y ago
It doesn't, it's my logging, but that basically means a RetryLimitReachedEvent was fired, and it never fires a ready event for that shard again
LustigerLurch
LustigerLurch3y ago
what does the trace logging for gateway events show?
SchlaubiBus
SchlaubiBusOP3y ago
gonna download .4 GB of LOGS real quick
however, since enabling trace logging the issue hasn't occurred yet
SchlaubiBus
SchlaubiBusOP3y ago
Fleet doesn't want to open it http://rice.by.devs-from.asia/u/4869Mq.png
SchlaubiBus
SchlaubiBusOP3y ago
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.302 [DefaultDispatcher-worker-6] TRACE dev.kord.gateway.DefaultGateway - Gateway >>> {"op":2,"d":{"token":"token","properties":{"os":"Linux","browser":"Kord","device":"Kord"},"compress":false,"large_threshold":250,"shard":[12,15],"presence":{"status":"dnd","afk":false,"game":{"name":"Starting ...","type":0}},"intents":"3243773"}}
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.521 [DefaultDispatcher-worker-22] TRACE dev.kord.gateway.DefaultGateway - Gateway <<< {"t":null,"s":null,"op":9,"d":false}
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.540 [DefaultDispatcher-worker-10] TRACE dev.kord.gateway.DefaultGateway - gateway connection closing
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.543 [DefaultDispatcher-worker-10] TRACE dev.kord.gateway.DefaultGateway - Gateway closed: 4900 reconnecting
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.544 [DefaultDispatcher-worker-4] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 12 SessionReset, Awaiting login from: [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14]
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.544 [DefaultDispatcher-worker-10] TRACE dev.kord.gateway.DefaultGateway - handled gateway connection closed
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.546 [DefaultDispatcher-worker-13] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 12 RetryLimitReachedEvent, Awaiting login from: [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14]
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.701 [DefaultDispatcher-worker-10] TRACE dev.kord.gateway.DefaultGateway - Gateway <<< {"t":null,"s":null,"op":10,"d":{"heartbeat_interval":41250,"_trace":["["gateway-prd-main-gh00",{"micros":0.0}]"]}}
srv-captain--votebot.1.kyi5nfvreqag@v220210987031163663 | 2022-09-19 21:41:12.703 [DefaultDispatcher-worker-12] TRACE dev.kord.gateway.DefaultGateway - Gateway >>> {"op":1,"d":null}
Okay, so op 9 is "The session has been invalidated. You should reconnect and identify/resume accordingly."
That's what happens during startup
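For context on the `"d":false` in that trace: in Discord's gateway protocol the `d` field of an op 9 (Invalid Session) payload is a boolean that says whether the session may be resumed. A minimal sketch of that decision, with made-up names (`NextStep`, `nextStepAfterInvalidSession`) that are not taken from Kord:

```kotlin
// Hypothetical types for illustration; Kord's own gateway classes look different.
enum class NextStep { RESUME, IDENTIFY }

// For op 9 (Invalid Session), d == true means the session may be resumed,
// d == false means the client has to start over with a fresh Identify.
fun nextStepAfterInvalidSession(resumable: Boolean): NextStep =
    if (resumable) NextStep.RESUME else NextStep.IDENTIFY

fun main() {
    // The trace contains {"op":9,"d":false}
    println(nextStepAfterInvalidSession(resumable = false)) // IDENTIFY
}
```

Every fresh Identify counts against the identify rate limit again, which would fit the observation that session resets pile up while many shards are logging in.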
LustigerLurch
LustigerLurch3y ago
yeah, pretty sure it's because you are only allowed to log in one shard at a time
we should fix this!
SchlaubiBus
SchlaubiBusOP3y ago
yeah, maybe this also causes the other disconnects, because if one shard disconnects I get rate limited
LustigerLurch
LustigerLurch3y ago
@I love Gradle files this is what I have so far as a replacement for the existing MasterGateway.startWithConfig:
public suspend fun startWithConfig(configuration: GatewayConfiguration, maxConcurrency: Int) {
    require(maxConcurrency > 0) { "Invalid maxConcurrency: $maxConcurrency" }

    // see https://discord.com/developers/docs/topics/gateway#sharding-max-concurrency
    return coroutineScope {

        var lastRateLimitKey = -1
        val readyListeners = mutableListOf<Job>()

        // sort gateways.entries to start shards in order
        for ((shardId, gateway) in gateways.entries.sortedBy { it.key }) {
            require(shardId >= 0) { "Negative shardId: $shardId" }

            val rateLimitKey = shardId % maxConcurrency

            if (rateLimitKey <= lastRateLimitKey) {
                readyListeners.joinAll() // wait until all gateways from last bucket are started
                readyListeners.clear()
            }

            // make sure we don't miss the event by executing until first suspension point before starting gateway
            readyListeners += launch(start = UNDISPATCHED) { gateway.events.first { it is Ready } }

            val config = configuration.copy(shard = configuration.shard.copy(index = shardId))
            launch { gateway.start(config) }

            lastRateLimitKey = rateLimitKey
        }

        readyListeners.clear()
    }
}
SchlaubiBus
SchlaubiBusOP3y ago
I am on my phone rn, but it looks good at first glance
LustigerLurch
LustigerLurch3y ago
Could you try out with feature-login-rate-limiting-SNAPSHOT?
SchlaubiBus
SchlaubiBusOP3y ago
sure
I did all of this in a hurry, but I still have session resets
LustigerLurch
LustigerLurch3y ago
do you use kordex? it might have to be recompiled, cause there was an inline function somewhere in this PR
LustigerLurch
LustigerLurch3y ago
@MrPowerGamerBR are you familiar with sharding? If yes, could you take a look at this PR too? https://github.com/kordlib/kord/pull/693
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
this is the example from the docs, right? I've just pushed a test for the example and it passes (assuming I didn't write a wrong test :kek: )
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
ok but why would you say the code I wrote is incorrect for that example?
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
exactly but I'm still thinking if we should look at the actual bucket instead (see https://github.com/kordlib/kord/pull/693#discussion_r974822101)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
that's part of the Gateway (individual connections) code, this PR is about the MasterGateway
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
oh wait I think I know what you mean
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
yeah, rn we don't do this properly
as it is rn, this is just about the initial startup
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
they are rate_limit_key 0, but bucket 0 and 1
that's what this is about
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
ah wait, they wouldn't, we wait when the key is <= the previous one
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
so which one should we use: shardId / maxConcurrency or shardId % maxConcurrency?
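The two candidates can be compared side by side. In this sketch `rateLimitKey` is the `shard_id % max_concurrency` formula from Discord's sharding docs, while `bucket` is an assumed reading of "bucket" as the integer division `shardId / maxConcurrency`; both function names are illustrative only.

```kotlin
// Illustrative helpers only (names are not from Kord).
fun rateLimitKey(shardId: Int, maxConcurrency: Int): Int = shardId % maxConcurrency

// Assumption: "bucket" is taken to mean the integer division of the shard id.
fun bucket(shardId: Int, maxConcurrency: Int): Int = shardId / maxConcurrency

fun main() {
    val maxConcurrency = 16
    for (shardId in listOf(0, 15, 16, 31)) {
        println("shard $shardId -> key ${rateLimitKey(shardId, maxConcurrency)}, bucket ${bucket(shardId, maxConcurrency)}")
    }
    // shard 0 -> key 0, bucket 0
    // shard 15 -> key 15, bucket 0
    // shard 16 -> key 0, bucket 1
    // shard 31 -> key 15, bucket 1
}
```

Shards 0 and 16 share key 0 but sit in buckets 0 and 1, which is the case the messages above are about.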
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
still don't understand why we would need a mutex there
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
so for the case that a shardId is used twice? that's not possible, gateways is a Map<Int, Gateway>
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
but in the impl I have rn, shard 16 with rateLimitKey 0 would wait too, because the rateLimitKey is <= the previous one
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
shard 15 has key 15, 16 has key 0 -> 0 <= 15 -> same thing
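The effect of the `rateLimitKey <= lastRateLimitKey` check can be simulated without any gateways. The following is a pure sketch that only reproduces the grouping logic of the snippet posted earlier in the thread; `startBatches` is a hypothetical name.

```kotlin
// Pure simulation of the "wait when rateLimitKey <= lastRateLimitKey" rule; no gateways involved.
fun startBatches(shardIds: Iterable<Int>, maxConcurrency: Int): List<List<Int>> {
    require(maxConcurrency > 0) { "Invalid maxConcurrency: $maxConcurrency" }
    val batches = mutableListOf<MutableList<Int>>()
    var lastRateLimitKey = -1
    for (shardId in shardIds.sorted()) {
        val rateLimitKey = shardId % maxConcurrency
        // a key that did not grow means the previous batch has to finish first
        if (batches.isEmpty() || rateLimitKey <= lastRateLimitKey) batches.add(mutableListOf())
        batches.last().add(shardId)
        lastRateLimitKey = rateLimitKey
    }
    return batches
}

fun main() {
    println(startBatches(0..17, 16).map { it.first()..it.last() }) // [0..15, 16..17]
    println(startBatches(0..2, 1))                                 // [[0], [1], [2]]
}
```

With `maxConcurrency = 16`, shard 16 opens a new batch because its key 0 is not greater than shard 15's key 15. With `maxConcurrency = 1` every shard ends up in its own batch.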
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
wait, I just noticed that we already have identifyRateLimiter in DefaultGateway
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
@I love Gradle files do you use any custom gateway logic? or does kordex?
SchlaubiBus
SchlaubiBusOP3y ago
I don't
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
the thing is that individual reconnects also use the identifyRateLimiter
Unknown User
Unknown User3y ago
Message Not Public
SchlaubiBus
SchlaubiBusOP3y ago
The issues might be related as I stated earlier
LustigerLurch
LustigerLurch3y ago
I'm wondering if I could reproduce your issues with a bot in 4 servers that I force to use 15 shards :D
yeah, I can, I'm receiving session resets too
that's good
SchlaubiBus
SchlaubiBusOP3y ago
That's actually good yes
LustigerLurch
LustigerLurch3y ago
ok I think I know what's wrong: the limit on concurrent Identify requests per 5 seconds specified here https://discord.com/developers/docs/topics/gateway#rate-limiting actually means the limit on concurrent connections that are opened and then send an Identify request
oh, nevermind, still getting session resets
I was right, just delayed in the wrong place
hm, now I actually need to implement a new RateLimiter that gets consumed before starting a connection and released when the identify command was sent
@I love Gradle files could you try again? I've pushed a change that fixes this and can no longer reproduce session resets with it
SchlaubiBus
SchlaubiBusOP3y ago
Tmr
LustigerLurch
LustigerLurch3y ago
sure 👍 @I love Gradle files did you have time to try it yet?
SchlaubiBus
SchlaubiBusOP3y ago
Nope, will notify you asap
LustigerLurch
LustigerLurch3y ago
After thinking about this some more, it might actually be better to do something similar to this and replace the identifyRateLimiter with it. This would then not only work for initial login but also reconnects that require sending a new identify. So essentially like your Microservice but with a coroutine boundary using e.g. channels. Nonetheless, the impl in the PR rn should work for the initial login
SchlaubiBus
SchlaubiBusOP3y ago
I have no Idea whether this uses the latest build though because I cannot control the CI dependency cache
2022-09-21T14:28:40.912679797Z 2022-09-21 14:28:40.912 [DefaultDispatcher-worker-29] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 0 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.914200194Z 2022-09-21 14:28:40.914 [DefaultDispatcher-worker-22] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 1 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.915295882Z 2022-09-21 14:28:40.915 [DefaultDispatcher-worker-10] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 2 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.916031643Z 2022-09-21 14:28:40.915 [DefaultDispatcher-worker-26] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 3 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.916503271Z 2022-09-21 14:28:40.916 [DefaultDispatcher-worker-25] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 4 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.917879113Z 2022-09-21 14:28:40.917 [DefaultDispatcher-worker-17] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 5 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.917889241Z 2022-09-21 14:28:40.917 [DefaultDispatcher-worker-29] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 7 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.917892347Z 2022-09-21 14:28:40.917 [DefaultDispatcher-worker-29] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 8 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.917895144Z 2022-09-21 14:28:40.917 [DefaultDispatcher-worker-14] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 6 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.917897839Z 2022-09-21 14:28:40.917 [DefaultDispatcher-worker-29] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 9 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.921890224Z 2022-09-21 14:28:40.921 [DefaultDispatcher-worker-25] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 10 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-21T14:28:40.921929910Z 2022-09-21 14:28:40.921 [DefaultDispatcher-worker-8] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 11 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
Yeah this happens as well
Unknown User
Unknown User3y ago
Message Not Public
SchlaubiBus
SchlaubiBusOP3y ago
It just disconnects all shards
I could do that if nexus would not be a piece of shi
Unknown User
Unknown User3y ago
Message Not Public
Unknown User
Unknown User3y ago
Message Not Public
SchlaubiBus
SchlaubiBusOP3y ago
Nexus is so shitty it's unbelievable
Just give me the directory listing
but thanks
It's always such a hassle to access these
okay, it seems to reconnect, which is good
@LustigerLurch I now receive ShardGotDisconnected with a ReconnectingEvent, but shards seem to be connected
SchlaubiBus
SchlaubiBusOP3y ago
GitHub
mikbot/Bot.kt at main · DRSchlaubi/mikbot
A modular framework for building Discord bots in Kotlin using Kordex and Kord - mikbot/Bot.kt at main · DRSchlaubi/mikbot
LustigerLurch
LustigerLurch3y ago
did you also recompile kordex? cause the change depends on an inline function that kordex depends on :/ also what's ShardGotDisconnected? that's not from kord
SchlaubiBus
SchlaubiBusOP3y ago
GitHub
mikbot/Bot.kt at main · DRSchlaubi/mikbot
A modular framework for building Discord bots in Kotlin using Kordex and Kord - mikbot/Bot.kt at main · DRSchlaubi/mikbot
SchlaubiBus
SchlaubiBusOP3y ago
Actually not, but both of those events seem to get fired
LustigerLurch
LustigerLurch3y ago
yeah, but you need to have a recompiled kordex to see the effects of the changes
but if you have no way to do this right now, I can also hardcode your max concurrency temporarily into the PR (in the non-inlined function that gets called) @I love Gradle files
SchlaubiBus
SchlaubiBusOP3y ago
recompiling kordex could be kinda difficult rn
LustigerLurch
LustigerLurch3y ago
yeah, then I can do this
it's 1, right?
pushed it @I love Gradle files, could you try once again when the build is ready?
SchlaubiBus
SchlaubiBusOP3y ago
will do rn
SchlaubiBus
SchlaubiBusOP3y ago
Apart from not knowing where this comes from, it looks good
2022-09-22T15:18:10.034921428Z 2022-09-22 15:18:10.034 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 0 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.038597081Z 2022-09-22 15:18:10.038 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 3 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.038958810Z 2022-09-22 15:18:10.038 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 4 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.039505892Z 2022-09-22 15:18:10.039 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 5 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.043609431Z 2022-09-22 15:18:10.043 [DefaultDispatcher-worker-7] DEBUG org.pf4j.AbstractExtensionFinder - Finding extensions of extension point 'dev.schlaubi.mikbot.core.gdpr.api.GDPRExtensionPoint'
2022-09-22T15:18:10.045478922Z 2022-09-22 15:18:10.045 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 6 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.045873364Z 2022-09-22 15:18:10.037 [DefaultDispatcher-worker-1] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 2 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.046375681Z 2022-09-22 15:18:10.039 [DefaultDispatcher-worker-32] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 1 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.047121782Z 2022-09-22 15:18:10.047 [DefaultDispatcher-worker-12] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 7 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.047558204Z 2022-09-22 15:18:10.047 [DefaultDispatcher-worker-18] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 8 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
2022-09-22T15:18:10.047844849Z 2022-09-22 15:18:10.047 [DefaultDispatcher-worker-5] WARN dev.schlaubi.musicbot.core.Bot - Shard got disconnected 9 DetachEvent, Awaiting login from: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
LustigerLurch
LustigerLurch3y ago
DetachEvent is fine, it means that the gateway was detached aka finally stopped (Gateway.detach()), when do these events happen?
SchlaubiBus
SchlaubiBusOP3y ago
This should be since startup
SchlaubiBus
SchlaubiBusOP3y ago
They happen during startup, which is confusing me
LustigerLurch
LustigerLurch3y ago
Could it be that the previous version was still running? All shards are detached during a shutdown hook @I love Gradle files
if (enableShutdownHook) {
    Runtime.getRuntime().addShutdownHook(thread(false) {
        runBlocking {
            gateway.detachAll()
        }
    })
}
so that actually two instances of your bot are writing to the logs
SchlaubiBus
SchlaubiBusOP3y ago
That log is a docker logs output
LustigerLurch
LustigerLurch3y ago
which means?
SchlaubiBus
SchlaubiBusOP3y ago
I am quite sure a docker container can't run twice
LustigerLurch
LustigerLurch3y ago
can you stop, wait a little and then restart the container? if you then don't see these events again, they are probably from the previous running container anyway
SchlaubiBus
SchlaubiBusOP3y ago
Caprover summarizes the log, so that's probably the case
I will see whether disconnect issues appear again
Otherwise this looks great
LustigerLurch
LustigerLurch3y ago
if all shards disconnect and need to reidentify, the current code will still ratelimit, but I'm already working on a solution based on what I've got so far that can handle that situation too
so it's only for initial startup so far
SchlaubiBus
SchlaubiBusOP3y ago
Oh I see
LustigerLurch
LustigerLurch3y ago
I've pushed that now
SchlaubiBus
SchlaubiBusOP3y ago
ok will try to test it this weekend ig
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
yes, the new identify rate limiter in #693 also does this: wait for rate limiter to give permission -> open ws connection and send identify -> wait for ready and notify rate limiter it can allow other shards to identify -> rate limiter waits 5s before giving next permission
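The sequence in that message (wait for permission, connect and identify, wait for Ready, cool down) can be sketched with a coroutine `Mutex`. This is not the implementation from #693: `IdentifyGate` is a hypothetical class, it only models `max_concurrency = 1`, it needs kotlinx.coroutines on the classpath, and for simplicity it keeps the caller suspended during the cooldown.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical class, not Kord's implementation: allows one identify at a time (max_concurrency = 1).
class IdentifyGate(private val cooldownMillis: Long) {
    private val mutex = Mutex()

    suspend fun identify(connectAndAwaitReady: suspend () -> Unit) = mutex.withLock {
        connectAndAwaitReady() // open the connection, send Identify, wait for Ready
        delay(cooldownMillis)  // hold the permit during the cooldown before the next shard may start
    }
}

fun main() = runBlocking {
    // 5_000 ms would match the 5 s window; shortened here so the demo finishes quickly
    val gate = IdentifyGate(cooldownMillis = 50)
    val order = mutableListOf<Int>()
    (0 until 3).map { shardId ->
        launch {
            gate.identify {
                delay(10) // stand-in for connecting and receiving Ready
                order.add(shardId)
            }
        }
    }.joinAll()
    println("identify order: $order")
}
```

Because the same gate would also be passed through on reconnects that need a new Identify, a design like this covers both the initial login and later re-identifies.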
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
also faced this issue, that's why :D
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
@MrPowerGamerBR what would be a max concurrency typically for large bots?
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
the reason I'm asking is that I now actually have something similar to this and was wondering if it should be using a dynamically sized map or a fixed size array (thread safety is no concern) also is this really max_concurrency from here https://discord.com/developers/docs/topics/gateway#session-start-limit-object 🤔
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
alright
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
@I love Gradle files I would say my fix for this is ready and should work for both initial start and reconnects.
SchlaubiBus
SchlaubiBusOP3y ago
Ok will try next week I think
LustigerLurch
LustigerLurch3y ago
amazing 👍 @MrPowerGamerBR do you think you could try the identify rate limiting from feature-login-rate-limiting-SNAPSHOT on your bot? This should do the job:
suspend fun main() = coroutineScope {
    val kord = Kord("token")

    launch {
        val totalShards = kord.resources.shards.totalShards
        var shardsConnected = 0
        kord.events
            .onEach { event ->
                if (event is ReadyEvent) {
                    shardsConnected++
                    println("$shardsConnected / $totalShards shards connected")
                } else {
                    println("Received $event") // e.g. SessionReset when rate limiting is wrong
                }
            }
            .takeWhile { shardsConnected < totalShards }
            .onCompletion { kord.shutdown() }
            .collect()
    }

    kord.login { intents = Intents.none } // don't care about most events
}
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
thanks :)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
seems like, yeah the thing is I have no idea if this is needed anymore (the commit that added this used ktor 1.5.2 and now we are on 2.1.2)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
even for Dispatchers.Default: shouldn't coroutines just ignore thread count? you can actually:
val kord = Kord(token) {
    sharding { Shards(totalShards = 1344) }
}
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
the dispatcher has a pool of threads, not coroutines, coroutines can be created arbitrarily
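A small experiment illustrates the point about threads versus coroutines. The sketch assumes kotlinx.coroutines is on the classpath; `distinctThreadsUsed` is a made-up helper, and the thread count it reports depends on the machine.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlinx.coroutines.*

// Launches many coroutines on Dispatchers.Default and counts the distinct threads that ran them.
suspend fun distinctThreadsUsed(coroutineCount: Int): Int = coroutineScope {
    val threadNames = ConcurrentHashMap.newKeySet<String>()
    (1..coroutineCount).map {
        launch(Dispatchers.Default) {
            delay(10) // suspends the coroutine without blocking its thread
            threadNames.add(Thread.currentThread().name)
        }
    }.joinAll()
    threadNames.size
}

fun main() = runBlocking {
    val coroutineCount = 1344 // one per shard in the example above
    println("$coroutineCount coroutines ran on ${distinctThreadsUsed(coroutineCount)} threads")
}
```

The number of threads stays bounded by the dispatcher's pool regardless of how many coroutines are launched, as long as the coroutines suspend instead of blocking.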
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
that would be a problem but ideally it shouldn't happen because everything uses suspension not blocking
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
hm, we log this about client thread count:
if (client.engine.config.threadsCount < shards.size + 1) {
    logger.warn {
        """
        kord's http client is currently using ${client.engine.config.threadsCount} threads,
        which is less than the advised thread count of ${shards.size + 1} (number of shards + 1)
        """.trimIndent()
    }
}
try this:
val kord = Kord(token) {
    client = HttpClient(CIO) {
        engine {
            threadsCount = ... // maybe this helps
            maxConnectionsCount = ... // or this
        }
    }
    sharding { ... }
}
cause maxConnectionsCount is 1000 by default, which is not enough for your shards but still more than 100
oh, I know why 100:
HttpClient(CIO) {
    engine {
        maxConnectionsCount = ...
        endpoint {
            maxConnectionsPerRoute = ... // default 100 -> all gw connections are on the same route
        }
    }
}
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
should probably have a warning like this for maxConnectionsPerRoute
@I love Gradle files do you remember anything about this: https://github.com/kordlib/kord/pull/198/files#diff-5ca725c196c4f33561f5f5e1485f518fe93f8e64c0c0deba6b947a8b5f5ceeb1
do you know if the fix is still needed?
SchlaubiBus
SchlaubiBusOP3y ago
I don't haha
LustigerLurch
LustigerLurch3y ago
thought so 😂 but it seems it is no longer needed, so should we just yeet it out? :kek: and you weren't limited by the maxConnectionsCount of 1000?
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
:)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
hm, what uses that much memory? there shouldn't be a ton of events (and cached entities)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
yay
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
yeah, we can't really do anything about this :pained_smile: thanks again for testing this :)
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
Sure, would be amazing if you could use this to test feature-login-rate-limiting-SNAPSHOT too @ToxicMushroom also how many shards does your bot have (cause if it's more than 100, the http client needs config changes) Could one of you give #693 a review as a sanity check, then it would be good to go @MrPowerGamerBR @I love Gradle files? You don't have to but it would be really nice :)
SchlaubiBus
SchlaubiBusOP3y ago
I will try at my hotel if mrpowergamer wasn't faster
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch3y ago
alright merged and in 0.8.x-SNAPSHOT
Unknown User
Unknown User3y ago
Message Not Public
LustigerLurch
LustigerLurch2y ago
@MrPowerGamerBR do you remember whether you had to increase maxConnectionsCount and maxConnectionsPerRoute or was it just maxConnectionsPerRoute? i'm trying to implement this now so i want to know what should be taken into consideration for this warning
Unknown User
Unknown User2y ago
Message Not Public
LustigerLurch
LustigerLurch2y ago
then we should check both, thanks anyway :)
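A check covering both limits could look roughly like this. It is a sketch under the assumption that each shard needs one gateway connection and that connections for REST requests can be ignored; `connectionLimitWarnings` is a hypothetical function, not the warning that ended up in Kord.

```kotlin
// Hypothetical check, not Kord's actual code: one WebSocket connection per shard is assumed.
fun connectionLimitWarnings(
    shardCount: Int,
    maxConnectionsCount: Int,
    maxConnectionsPerRoute: Int,
): List<String> {
    val warnings = mutableListOf<String>()
    if (maxConnectionsCount < shardCount) {
        warnings.add("maxConnectionsCount ($maxConnectionsCount) is lower than the number of shards ($shardCount)")
    }
    // all gateway connections go to the same route, so the per-route limit applies to them as well
    if (maxConnectionsPerRoute < shardCount) {
        warnings.add("maxConnectionsPerRoute ($maxConnectionsPerRoute) is lower than the number of shards ($shardCount)")
    }
    return warnings
}

fun main() {
    // defaults quoted in the thread: 1000 connections in total, 100 per route
    connectionLimitWarnings(shardCount = 1344, maxConnectionsCount = 1000, maxConnectionsPerRoute = 100)
        .forEach(::println)
    println(connectionLimitWarnings(15, 1000, 100).isEmpty()) // true
}
```

With the defaults, a 1344-shard bot trips both warnings, while the 15-shard bot from the start of the thread trips neither.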