Questions about Discord.js cache
Hello.
I had asked a friend to post a question about Discord.js for me, as I'm very busy.
He has been banned, and I don't understand why.
So I'm going to ask the same question again, but in my own words.
Some developers I know have complained about the Discord.js cache. Personally, I haven't had any problems with it yet, but I'm a bit apprehensive. One of the things I've heard is that it takes up a lot of RAM as a bot scales.
One dev that I know also complained a bit about shelters, saying that they can be a bit tricky.
But, still, I'm not particularly shocked by all this.
The question I'm wondering about is whether the cache ends up with duplicate entries or not, or whether it would simply be possible to limit the amount of information cached in-memory and, once a certain threshold has been reached, send it to a Redis cache?
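To make the "limit what stays in memory, then push the rest to Redis" idea concrete, here is a minimal sketch of a two-tier cache. It is not how Discord.js or Bentocache work internally. `RemoteStore`, `InMemoryRemoteStore` and `TieredCache` are hypothetical names, and the remote tier is an in-memory stand-in so the sketch runs without a Redis server.

```typescript
// Hypothetical second-tier store. A real one might wrap a Redis client;
// this one is a plain in-memory stand-in so the sketch runs on its own.
interface RemoteStore<V> {
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

class InMemoryRemoteStore<V> implements RemoteStore<V> {
  readonly data = new Map<string, V>();

  async get(key: string): Promise<V | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: V): Promise<void> {
    this.data.set(key, value);
  }
}

// Keeps at most `maxLocal` entries in process memory. When the limit is
// exceeded, the least recently used entry is written to the remote store
// and dropped from the local tier.
class TieredCache<V> {
  private readonly local = new Map<string, V>();
  private readonly maxLocal: number;
  private readonly remote: RemoteStore<V>;

  constructor(maxLocal: number, remote: RemoteStore<V>) {
    this.maxLocal = maxLocal;
    this.remote = remote;
  }

  get localSize(): number {
    return this.local.size;
  }

  async set(key: string, value: V): Promise<void> {
    // Re-inserting moves the key to the end of the Map's insertion order,
    // which this sketch treats as "most recently used".
    this.local.delete(key);
    this.local.set(key, value);
    while (this.local.size > this.maxLocal) {
      const oldestKey = this.local.keys().next().value as string;
      const oldestValue = this.local.get(oldestKey) as V;
      this.local.delete(oldestKey);
      await this.remote.set(oldestKey, oldestValue);
    }
  }

  async get(key: string): Promise<V | undefined> {
    if (this.local.has(key)) {
      const value = this.local.get(key) as V;
      // Refresh recency on a local hit.
      this.local.delete(key);
      this.local.set(key, value);
      return value;
    }
    const remoteValue = await this.remote.get(key);
    if (remoteValue !== undefined) {
      // Promote the entry back into the local tier (this may spill another one).
      await this.set(key, remoteValue);
    }
    return remoteValue;
  }
}
```

Note that a promoted entry stays in the remote tier as well, so this design does trade some duplication for faster local reads.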
This kind of optimisation could be implemented almost for-free. I'm not asking for it to be done tomorrow, or even for it to be done at all.
I had simply sent my friend to ask whether you knew Bentocache (made by one of the AdonisJS core team members), and what you thought of it in relation to your considerations about the Discord.js cache.
Historically, it doesn't seem to me that there's really any other lib that exists to be able to drive different caching technologies like Bentocache does. In fact, that's what its author told me about his motivation for writing it and making it open source.
I completely understand why Discord.js has its own in-house caching solution, given that pretty much the whole JS ecosystem is playing the same game, which is precisely what Bentocache is trying to solve.
I would therefore like to know, in all transparency, whether you are having problems with duplicate data in your cache. For example: how do you go about sharing cache information between shards (which are separate processes and therefore cannot share in-memory context with each other except via a distributed cache such as Redis or DragonflyDB)?
---
Finally: is the RAM consumption of the Discord.js cache objectively "Excessive", with potential duplicates, or does it only correspond to what the bot actually uses? Is the potential overhead minimal in practice, or really annoying?
I've just refactored my codebase a bit to avoid abusing a global client getter, given that this doesn't seem to me to be the right approach at all for keeping a healthy codebase ready for sharding.
(Except when used to get "Universal" metadata, like the Discord bot user's ID, which I guess wouldn't change across shards?)
When I go back over my code, I see cases where I would clearly benefit from a distributed cache. That seems more pertinent to me than just complaining about RAM (a point I personally don't care about, since I have 1 TB of it to waste, even if I'm kind of "Green IT" and want to do the best job possible according to that philosophy).
For example: trying to find out whether my bot is present in a guild by hitting the distributed cache of all the shards first? Or checking whether any shard has the ModerateMembers permission in a guild by hitting that same distributed cache first?
Still hoping for an answer, btw.
Interesting, thank you!
I haven't delved any deeper into the subject at the moment. I'm gathering some feedback from people around me but I haven't looked at the specs in detail because I'm not at that stage in the development of my project yet.
For this concern, I'd just like to avoid multiplying code smells and anything else that might make it difficult for me to make the transition.
What about what I've heard about RAM consumption? Is that an exaggeration?
Hmm...
I was told that the Discord cache was taking up several gigabytes of RAM. But I get the impression that some people want to stay on $2/month servers for weird reasons...
So, I'm wondering if it's not a bit excessive to insist on running a Discord bot in less than 4 GB of RAM, even if it scales.
And on the other hand, I wonder where that "Limit" is?
Is it really unpleasant to run a big Discord bot on a $2 server, but as soon as you switch to a dedicated server, it's not even a debate anymore?
I'm very cautious in this respect. In everything I code, I only use what I need.
I only expose what I use elsewhere, and so on.
I believe much more in an approach where you 'Expose'/'Include' as and when you need to, rather than having everything already imported from the start and have a huge technical debt "Just in case"...
---
I've also heard that the cache can be so stale that it says a member is on a guild when they're not. And other examples where the cache is... "Prickly".
Does this really happen in practice, or is it bullshit in the end too?
Or did it actually happen in an older version of Discord.js but has since been fixed?
Doesn't sound alarming to me
That's beginning to reassure me, thank you
And what about duplicate data in the cache, does that exist? And if so, is the overhead minimal?
Also, do you think it's preferable to segregate the data in each cache of each shard so that they all have the most stateful and isolated context possible, and is this a conscious decision?
Or was it also made for lack of a lib offering an adequate abstraction, one that would let us get away from the implementation details and offer a more 'full-featured' architecture?
(Maybe I've also understood absolutely nothing about how the Discord.js cache is implemented on these points; correct me if I'm wrong.)
Thanks for helping me find out, it doesn't look so bad...
"You can have a user cached in several shards independently though if they share more than one guild"
Yes, that detail is perhaps a bit frustrating! Maybe we'll talk about that later.
---
Basically, my current problem is that I'm in the process of creating "Jobs". These jobs are background tasks which, for example, lift temporary bans (created via a command that sets a temp-ban with a duration of, say, 80 days).
The thing is, to optimise things a bit, when I go through a guild to trigger tons of unbans (throttled, of course), if I get even the slightest negative response from the Discord API, I inspect it. If it's because of missing permissions or because the bot is no longer present in the guild, the job moves ALL of that guild's temp-ban entries elsewhere in the DB so that the jobs don't pull them anymore.
I'm worried about coordinating the shards properly for these jobs. I'm afraid I'm setting off time bombs as I go along. A background job is global at the moment and calls getClient, which retrieves the global client.
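The "inspect the negative response" step could be sketched roughly as below. The numeric values are JSON error codes from the Discord API documentation. `classifyUnbanFailure` and the three outcome labels are hypothetical names, and which code the API actually returns once the bot has left a guild is an assumption worth verifying against real responses.

```typescript
// Discord JSON error codes (from the Discord API documentation).
const UNKNOWN_GUILD = 10004;
const UNKNOWN_BAN = 10026;
const MISSING_ACCESS = 50001;
const MISSING_PERMISSIONS = 50013;

// Hypothetical outcomes for a temp-ban job:
// - "park-guild": move all of the guild's entries out of the job's reach
// - "drop-entry": only this entry is obsolete (the ban no longer exists)
// - "retry-later": anything else is treated as transient (assumption)
type UnbanFailure = "park-guild" | "drop-entry" | "retry-later";

// `code` is assumed to be read from the `code` property of the error
// thrown by the failed REST call.
function classifyUnbanFailure(code: number | string | undefined): UnbanFailure {
  switch (code) {
    case UNKNOWN_GUILD:
    case MISSING_ACCESS:
    case MISSING_PERMISSIONS:
      return "park-guild";
    case UNKNOWN_BAN:
      return "drop-entry";
    default:
      return "retry-later";
  }
}
```

Treating every unrecognised code as "retry-later" keeps the destructive path (moving every entry of a guild) limited to errors that were explicitly listed.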
I'm a bit worried about what's coming next with this way to implement it in my codebase.
Yep, I try to use interaction.client and things like this as much as I can
But in the case of a task that needs to be repeated every minute, every hour, or every second to check that something in my database is still valid... I'm wondering how this will work with sharding. Because it's kinda "Decorrelated" from mechanisms like interaction.client, which I guess scope the expected shard by design.
Because if one shard tells me "no" while another could have said "yes", and I rely on that to mess up everything in my database, I will be very sad. 😦
For example:
In the case of checking if the bot is still present in a guild where I have a temporary ban entry to manage in my database, I would like to ensure that all the shards say "No" before moving it to another table.
And, on the other hand, ensure that at least one shard says "yes" (and identify which one) to properly process the unban if possible.
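A minimal sketch of that "all shards say no / at least one says yes" rule is below. The first function uses the formula from Discord's sharding documentation, `shard_id = (guild_id >> 22) % num_shards`, which means a given guild is only ever handled by one shard. `decidePresence` and its result shape are hypothetical. The `answers` array is assumed to hold one entry per shard, in shard order, for example collected through discord.js's `broadcastEval`, with `null` standing for a shard that did not answer.

```typescript
// Discord's documented formula: shard_id = (guild_id >> 22) % num_shards.
function shardIdForGuild(guildId: string, shardCount: number): number {
  return Number((BigInt(guildId) >> BigInt(22)) % BigInt(shardCount));
}

type PresenceDecision =
  | { kind: "present"; shardId: number }
  | { kind: "absent" }
  | { kind: "unknown" };

// answers[i] is shard i's answer: true if it has the guild, false if it
// does not, null if the shard could not be reached.
function decidePresence(answers: ReadonlyArray<boolean | null>): PresenceDecision {
  if (answers.length === 0) return { kind: "unknown" };
  const owner = answers.findIndex((answer) => answer === true);
  if (owner !== -1) return { kind: "present", shardId: owner };
  // Only conclude "absent" when every shard explicitly said no.
  if (answers.every((answer) => answer === false)) return { kind: "absent" };
  return { kind: "unknown" };
}
```

With this shape, the database entry would only be moved on `absent`, and an `unknown` result simply leaves everything untouched until the next run.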
... 😅
Oh sorry, you're right.
I’m not there yet, but because of what I’ve heard about "Excessive RAM usage", "Prickly stale data in the cache", and the idea that "Sharding is too magic"… I started to accumulate stress and felt the need to discuss them.
I think I can understand the Discord.js features properly, if I focus.
My goal isn’t necessarily to make the fastest bot in the world; I mainly want to ensure that it is stable.
Some people suggested that I should implement pub/sub myself and similar things...
The problem with these kinds of approaches is that you might end up potentially working "against" Discord.js (or any other lib), and creating a pure mess.
I see. Thank you.
I think I’m a bit less worried about the moment when sharding will need to be implemented.
The three calls to a global client that I see in my current codebase are:
- 2 related to guildId/fetching a guild (that seems okay, it shouldn't be too bad)
- 1 related to client.channels.cache.get(channelId) ?? client.channels.fetch(channelId) (maybe a bit less convenient?)
(I’ve done quite a bit of refactoring, and now there are only these three cases left that I see in the code. I’m glad it’s so few, actually...)
Oh, .fetch is lazy by default?
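As a mental model of a "lazy" (cache-first) fetch, here is a simplified sketch. It is not the discord.js source. `ChannelManagerModel` and `FakeApi` are hypothetical, and the `force` option is modelled on the option of the same name that discord.js manager `fetch` methods accept.

```typescript
interface Channel {
  id: string;
  name: string;
}

// Stand-in for the REST API; it only counts how often it is called.
class FakeApi {
  calls = 0;

  async getChannel(id: string): Promise<Channel> {
    this.calls += 1;
    return { id, name: `channel-${id}` };
  }
}

class ChannelManagerModel {
  readonly cache = new Map<string, Channel>();
  private readonly api: FakeApi;

  constructor(api: FakeApi) {
    this.api = api;
  }

  // Cache-first: the API is only hit on a cache miss or when `force` is set.
  async fetch(id: string, options: { force?: boolean } = {}): Promise<Channel> {
    if (!options.force) {
      const cached = this.cache.get(id);
      if (cached) return cached;
    }
    const fresh = await this.api.getChannel(id);
    this.cache.set(id, fresh);
    return fresh;
  }
}
```

If a manager behaves like this model, `cache.get(id) ?? fetch(id)` and a plain `fetch(id)` end up doing the same work, so the explicit cache lookup is mostly a matter of taste.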
What is a "Manager"? Is that a kind of singleton?
"because you obviously already have the structure so know you have it already"
Is it always that obvious to know what is properly cached in a structure when you start applying sweepers, by the way? I haven’t explored this point yet.
EDIT: Oops, I don't think "shelting" is quite the right word... I recall someone mentioning something that allows clearing caches at certain intervals... I'm trying to find that. Sweepers*, sorry!
(Basically, someone quickly mentioned this to me, saying I should care about sweeping or else the cache would start eating up my RAM. Since then, it's become a bit of a rabbit hole for me... I'm questioning a lot and would like to understand what's really relevant and in which context.)
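For intuition only, sweeping can be pictured as a periodic pass that drops entries older than a given lifetime. The sketch below works on a plain `Map` and is not the discord.js sweeper implementation. `CachedEntry` and `sweepExpired` are hypothetical names.

```typescript
interface CachedEntry<V> {
  value: V;
  storedAt: number; // timestamp in milliseconds
}

// Removes every entry older than `lifetimeMs` and returns how many were removed.
function sweepExpired<V>(
  cache: Map<string, CachedEntry<V>>,
  lifetimeMs: number,
  now: number,
): number {
  let removed = 0;
  // Deleting the current entry while iterating a Map is safe in JavaScript.
  cache.forEach((entry, key) => {
    if (now - entry.storedAt > lifetimeMs) {
      cache.delete(key);
      removed += 1;
    }
  });
  return removed;
}
```

Such a pass would typically be scheduled with `setInterval`, passing `Date.now()` as `now`. The trade-off is the one discussed in this thread: whatever gets swept has to be fetched from the API again the next time it is needed.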
I also want to avoid hitting the API, to stay as far away from rate limits as possible. :/
And also because everything is way faster from an in-memory cache.
Like the difference between the types Client<false> and Client<true>, somehow?
Yes, but then if you have Client<false>, client?.user is typed as nullable.
But if you have Client<true>, you can just write client.user without any type error.
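The Client<false> / Client<true> distinction can be reproduced in isolation with a conditional type and a type guard. This is a sketch loosely modelled on that idea, not the actual discord.js typings. `ClientModel`, `UserModel`, `isReady` and `userIdOrNull` are hypothetical names.

```typescript
// Resolves to Then when Condition is true, Else when it is false,
// and the union of both when it is the wider `boolean`.
type If<Condition extends boolean, Then, Else = null> =
  Condition extends true ? Then : Condition extends false ? Else : Then | Else;

interface UserModel {
  id: string;
}

interface ClientModel<Ready extends boolean = boolean> {
  // UserModel when Ready is true, null when false, UserModel | null otherwise.
  user: If<Ready, UserModel>;
}

// Type guard: narrows a client of unknown readiness to ClientModel<true>.
function isReady(client: ClientModel): client is ClientModel<true> {
  return client.user !== null;
}

function userIdOrNull(client: ClientModel): string | null {
  if (isReady(client)) {
    // Narrowed: client.user is UserModel here, no optional chaining needed.
    return client.user.id;
  }
  return null;
}
```

Outside the narrowed branch, `client.user` is `UserModel | null`, which is why `client?.user` style access is needed there.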
I'm trying to understand
I'm not sure I fully understand the difference between "Structure" and "Manager" yet. I don't even really know what a "Structure" is in Discord.js terms. :x
Yes and this is the first thing I think about when you're talking to me about some "Obvious" things, it's the first thing I know in Discord.js and makes sense to me, so that I try to make an analogy
Hmmm... So I should be careful when I get a structure like newMember in a GuildMemberUpdate event, for example? That would be a structure?
Well... nevermind, I'll investigate a bit myself on those concerns. Thank you a lot for all those answers!
Hehe, I don't want to take all your time ^^'
Buuut...
Yes that's it, okay
"if called on a manager yes. if called on a structure it will always call the API"
Concerning this point, what would two possible scenarios be? One scenario of "called on a manager", and one scenario of "called on a structure". Just to start to understand. :/
Oh okay
It seems it's about the abstraction level
A "Structure" is something I manipulate which would be more "Granular"? Like one child of a relatively "big" parent? :<
client.guilds.cache.get(id)
And that would be lazy by default, then?
But if, on a structure, you do newMember.fetch(...)... well, this doesn't make sense to me? I misunderstood again.
Maybe I should take a rest.
Oh, yup
I use newMember as my main source of truth in this event because it is NOT a partial, actually, whereas oldMember is typed as _: PartialGuildMember | GuildMember, and seemed suspicious to me
Hmmmm, but I guess there is still some benefit to using partials?
Even if I haven't found one in my use cases yet... I thought they were also required by Discord.js for some internal magic...
hmm
I put it without thinking about what it actually does.
[Partials.GuildMember, Partials.Message, Partials.Channel, Partials.User];
Oh, Partials are there to "Top up" the cache with new information?
So, if I want more "Reactivity", I have to use Partials and deal with it, otherwise I could miss some events?
This second option seems to lead to a lot of undecidable challenges
Some simple examples, but for instance:
- I absolutely want to know if my bot has its permissions changed on the server, so that if it loses or regains ban permissions, it updates my database entries correctly to avoid cluttering jobs and mitigate that as best as I can.
- If the bot gets kicked, I definitely want the guild to be marked as "Abandoned" until its return, although this will require double-checks before eventually deleting all their data after a given time.
I use these examples because here we're talking about trying to improve the bot's safety as much as possible, so I think it's appropriate to want to react swiftly in these cases.
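Detecting that the bot "loses or regains ban permissions" comes down to comparing two permission bitfields. A minimal sketch follows, using bit positions from the Discord API documentation. `hasPermission` and `diffPermission` are hypothetical helpers, and the bitfields are assumed to be the bot's effective guild permissions before and after a change.

```typescript
// Permission bits from the Discord API documentation.
const ADMINISTRATOR = BigInt(1) << BigInt(3);
const BAN_MEMBERS = BigInt(1) << BigInt(2);
const MODERATE_MEMBERS = BigInt(1) << BigInt(40);

// Administrator is treated as implying every other permission.
function hasPermission(bitfield: bigint, flag: bigint): boolean {
  return (bitfield & ADMINISTRATOR) === ADMINISTRATOR || (bitfield & flag) === flag;
}

type PermissionChange = "gained" | "lost" | "unchanged";

// Compares the bot's permissions before and after an update for one flag.
function diffPermission(before: bigint, after: bigint, flag: bigint): PermissionChange {
  const had = hasPermission(before, flag);
  const has = hasPermission(after, flag);
  if (had === has) return "unchanged";
  return has ? "gained" : "lost";
}
```

A "lost" result for `BAN_MEMBERS` would be the signal to park a guild's temp-ban entries, and "gained" the signal to restore them. Role hierarchy can still block an individual action even when the permission bit is set, so API errors remain worth handling.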
If I want the bot to be able to ping a new member who has just arrived in a guild, I guess partials would be appropriate as well?
hm
If I remove all the partials
and test my bot a lot (in real-world scenarios, not mocking the entire Discord API, obviously)
would I notice from experience that one is missing?
Or could I be fooled?
Is that also what can cause an event to be triggered very late?
Okay, never happened to me but heard some complaints about it too
I'm a very unlucky person. Often, I'm not affected by what people complain about, but I end up complaining about things that people never complain about...
I like to have my own opinion about almost everything, and I need to experiment by myself before thinking like the others... But when I hear strange rumors, it still makes me uneasy, even though many times, from experience, I've realized it's just because complainers (often) make a pure mess.
I don't know. I hear about things that I'm told to be very careful with and that need to be handled in a certain way, when in reality it ends in a "double-edged sword" solution... Whereas, if other things had been done correctly beforehand, there would just be no "sword" at all... That sort of thing.
I'm still very stressed by all this noise. Thanks again for these answers. I took notes and will reread them periodically.
I just feel like I'm not really in the same world sometimes. Anyway, that's another topic.
🤣
Thank you!
I'm not sure what time it is where you are... So I wish you a good day/evening. 😄
Thank you very much, I'm going to rest a bit.