MongoDB & mongoose | How to automatically remove element from array after certain amount of time?
I'm creating an Instagram clone and right now I'm working on stories. I need them to be visible to followers for only 24h, so I was thinking of giving each user an activeStories field, which would be an array of the IDs of stories that user posted within the last 24h. But I'm not sure how I would remove them after 24h. Is there a built-in way to handle this or am I gonna have to handle that myself?
As far as I can see, it's only for deleting a whole document after a given time. So it would delete the whole story, and I just want to remove it from the user's activeStories field, because I am still going to use that story after 24h (like letting the user see all his stories in history, adding stories to highlights etc.).
However maybe I could make stories disappear after 24h, but save them in a separate collection for longer use? Maybe something like "storyHistory" or "allStories" or something like that... And just give each document an authorId, which would be the author of those stories? So I'd have something like:
And for highlights I was already planning on making separate collection. Not sure how effective this would be and if there's maybe a better way?
OR I could keep stories in the stories collection, but make a separate collection for stories that were posted within the last 24h? So I'd have a collection named something like activeStories, and there I'd save a story as soon as it's created, and delete it automatically after 24h with TTL?
hmm.. it does sound like a viable option to me
u can try
maybe if u stumble upon a better solution u can update later on?
one small detail here, that deletes stories in the mongo db which is, as you said, ids of stories, but what happens to the actual stories? are they kept in storage or what
there's 2 approaches I can think of that I would use in a real app...if the stories are to be deleted from storage I would set up a CRON job that automatically deletes all stories older than 24h AND the references to them in the database
the 2nd approach is based on what I see instagram does, which is:
- The story is no longer available to be viewed after 24h
- The story can be retrieved and saved
meaning that you can't see the stories but they AREN'T deleted, the author could still see them and upload them again after some time, in that case since we're not deleting any storage all you really need to do is set the date a story was published, then, when you're retrieving someone's stories you filter out all stories older than 24h
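A minimal sketch of that "filter out anything older than 24h" idea, assuming each story document stores an authorId and a createdAt date (both field names are assumptions, they aren't from the thread). It only builds the query filter, so it runs without a database:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build a MongoDB query filter matching one author's stories from the last 24h.
// `authorId` and `createdAt` are assumed field names, not taken from the thread.
function activeStoriesFilter(authorId, now = new Date()) {
  return {
    authorId,
    createdAt: { $gte: new Date(now.getTime() - DAY_MS) },
  };
}

// With a hypothetical mongoose model named Story, the call would look like:
//   const stories = await Story.find(activeStoriesFilter(user._id));

const filter = activeStoriesFilter("user123", new Date("2024-01-02T12:00:00Z"));
console.log(filter.createdAt.$gte.toISOString()); // 2024-01-01T12:00:00.000Z
```

Nothing gets deleted or updated here; a story just stops matching the filter once it's older than 24h, so history and highlights can keep using the same documents.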
Plan was to have two collections:
1. activeStories - fields would be something like authorId and storyId, but it would be automatically deleted after 24h.
2. stories, which would never be automatically deleted.
So when a user created a story it would also create an activeStory. And when I need stories posted within the last 24h I'd get them from the activeStories collection, otherwise I'd use stories.
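For the two-collection plan, a rough sketch of what the TTL part might look like. It assumes activeStories documents carry a createdAt Date field (an assumed name; TTL indexes need a date field to count from). The snippet only builds the index definition so it runs on its own; the commented lines show where it would be passed in:

```javascript
const DAY_SECONDS = 24 * 60 * 60;

// Keys and options for a TTL index on an assumed Date field.
function ttlIndexSpec(dateField, seconds) {
  return {
    keys: { [dateField]: 1 },
    options: { expireAfterSeconds: seconds },
  };
}

const spec = ttlIndexSpec("createdAt", DAY_SECONDS);
console.log(spec.options.expireAfterSeconds); // 86400

// With the Node.js driver, given a connected `db` handle:
//   await db.collection("activeStories").createIndex(spec.keys, spec.options);
// With mongoose, on a hypothetical activeStorySchema:
//   activeStorySchema.index(spec.keys, spec.options);
```

Worth knowing that MongoDB removes expired documents with a background task that runs roughly once a minute, so a story can linger a little past the 24h mark. If the cutoff has to be exact, the read query would still need a date check.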
However that was the alternative option. My main plan was to have an activeStories field in the stories collection. But I'm not sure how I would automatically remove expired stories from that field.
I was thinking of this as well, but I believe it would be more efficient to have all active stories in one place, so I don't have to go through ALL stories and filter the ones posted by one author within last 24h. Instead I'd just go through User.activeStories
generally, I would not recommend doing things like that. when you have fields that get updated once certain conditions are met, it's called a "cascade". the problem with cascades is that they get ugly pretty fast: imagine 10 stories are uploaded at 10:30 and 24h later 10 stories are deleted, no big deal right? but now imagine a million stories were uploaded at the same time; 24h later a million updates will hit your database, not so good now.
Of course at your scale you needn't worry about this, but something to keep in mind
now, remember that databases, even ones like mongo have INDEXES, which make lookups much faster than what you could do with javascript array.filter() or anything like that, YOU aren't going to filter anything, the database will
Yeah... haven't thought of that haha. And I don't go by the actual scale of my apps, meaning I'm trying to make them as efficient as possible and I'm always thinking about how this would work if I had millions of users/stories... Since I'm not doing it JUST for fun, someone else is going to look at my code as well (employers etc...).
Which is why I'm wondering if your way would still be a good approach? Like if it had a lot of users and stories?
I don't know much about databases, so I'm not sure how indexes work. Isn't it going to loop through all documents?
cascades have their uses, it's a bit of a complex subject tbh, but the main issue here is the added complexity when really you can just filter old stories out
databases index columns (or in mongodb "fields") with a data structure called a b-tree, b-trees are SUPER efficient at look ups, depending on your settings and what not you could find a specific piece of data among 10 million in like 5 lookups
it is not a loop, a loop would make lookups O(n) which is linear time but b-trees take O(log(n)) time
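To get a feel for O(n) vs O(log(n)), here is a small sketch that counts how many elements a plain loop inspects versus a binary search over sorted data. Binary search on an array is only a stand-in here, not what MongoDB actually runs, but a b-tree lookup follows the same "throw away most of the remaining data at every step" idea. The fanout of 100 in the last function is an assumed, illustrative number:

```javascript
// Count how many elements a plain loop inspects before finding `target`.
function linearSearchSteps(sorted, target) {
  let steps = 0;
  for (const value of sorted) {
    steps += 1;
    if (value === target) break;
  }
  return steps;
}

// Count how many elements a binary search inspects on the same sorted data.
function binarySearchSteps(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let steps = 0;
  while (lo <= hi) {
    steps += 1;
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) break;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return steps;
}

// Rough number of levels in a tree holding `n` keys with `fanout` children per node.
function estimatedTreeLevels(n, fanout) {
  return Math.ceil(Math.log(n) / Math.log(fanout));
}

const sorted = Array.from({ length: 1_000_000 }, (_, i) => i);
const target = 999_999;

console.log(linearSearchSteps(sorted, target)); // 1000000
console.log(binarySearchSteps(sorted, target)); // at most 20 for a million items
console.log(estimatedTreeLevels(10_000_000, 100)); // 4
```

So the "10 million in like 5 lookups" figure is in the right ballpark once each node holds many keys, though the real number depends on key size and storage settings.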
I see, I'm gonna look more into that. Thank you for helping :)
B-tree
In this tutorial, you will learn what a B-tree is. Also, you will find working examples of search operation on a B-tree in C, C++, Java and Python.
anyway, TL;DR indexes are fast you are not going to be slowed down by the method I talked about
of course you then have to read about how to add indexes to mongo 👍 (it's not hard)
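For reference, a sketch of what the index for the "filter by author and publish date" query might look like. authorId and createdAt are assumed field names, and the commented calls show where the definition would go with the Node.js driver or with mongoose (on a hypothetical storySchema):

```javascript
// Compound index keys: the equality field (author) first, then the date field
// used for the "last 24h" range. -1 keeps the newest stories first.
const storyIndexKeys = { authorId: 1, createdAt: -1 };

// With the Node.js driver, given a connected `db` handle:
//   await db.collection("stories").createIndex(storyIndexKeys);
// With mongoose, on a hypothetical storySchema:
//   storySchema.index(storyIndexKeys);

console.log(Object.keys(storyIndexKeys)); // [ 'authorId', 'createdAt' ]
```

Putting the author first means the database can jump straight to one author's stories and then read only the recent ones, instead of scanning the whole collection.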