❔ EFCore Sqlite Async: RAM and CPU Problem
So my code does the following:
30 targets are being processed at once via async operations. Each second, for each target, I create 20 new children of the target. So per second I have 20 * 30 new children. Every 20 children I call the "SaveAsync" method.
However before I add a child to the database I want to ensure that the child does not exist yet, if it exists already I update its properties.
I'm doing it with this code. I am assuming that EF Core loads all the children objects into memory, which leads to very high memory usage, how can I prevent this? It should operate on the database level each time and not load all objects into RAM.
It is important because I have millions of child objects.
The second issue is CPU. Without the database actions, such as SaveAsync and FirstOrDefaultAsync, the CPU is around 1% (normal for my task), however with those methods after around 2 mins the CPU is over 90%.
249 Replies
No, EF does not load the whole database into memory
It generates an SQL query with a WHERE and LIMIT 1 in this case
Hmmm whats the issue with the memory leaks then? Without DB usage I have consistently around 400 MB
and with it just goes up and up
I can send all operations that I do on Database side
Well basically I only do these two that I just said
Why not try profiling it to see which objects are doing the damage
Huh, well, the only way I can see it being an issue is if maybe EF cannot translate that
.Equals()
You should just take a look at what query gets generated
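One way to do that (a sketch, not from the thread: BotContext is the context name used later here, the file name and logging setup are assumptions) is EF Core's simple logging:

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

public class BotContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options
            .UseSqlite("Data Source=bot.db")                 // hypothetical file name
            .LogTo(Console.WriteLine, LogLevel.Information); // prints every generated SQL statement
}
```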
so this code looks fine at first glance?
Yep
so the DBSet "Children" is not loaded into memory?
Could probably use ExecuteUpdateAsync() instead, but this is valid
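For reference, a hedged sketch of the ExecuteUpdateAsync shape (EF Core 7+; ScrapedUsers and RestID are names that appear later in this thread, the updated property is hypothetical):

```csharp
// Issues a single UPDATE ... WHERE statement; nothing is loaded or tracked.
var rowsAffected = await ctx.ScrapedUsers
    .Where(x => x.RestID == scrapedUser.RestID)
    .ExecuteUpdateAsync(s => s
        .SetProperty(u => u.FollowerCount, scrapedUser.FollowerCount)); // hypothetical property
```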
Nope, it should not be loaded unless you explicitly load it all
I only do that + SaveChangesAsync after each 20 new children that I add to the DbSet
nothing else
so whats my best approach to solve this ?
Well, at a glance, there's... nothing to solve, everything seems fine
Unless you're doing something really stupid like new-ing up a new DbContext with every loop
I dont actually, I have 30 DbContexts as otherwise I get an exception for concurrency.
One BotContext per async task that generates new children for a target
How do you get those contexts?
or what do you mean?
Yeah, that seems fine-ish
_db is path to sqlite
I thought you might be not disposing of them, hence the memory issues
Huh
I mean they stay alive, these tasks that I was speaking of do not exit for hours
so this instance stays alive for hours
30 instances
I'm out of ideas, I'm afraid
Try asking in #database maybe? Link this thread there, provide a short summary
Ok thanks for trying to help.
if you dont dispose the dbcontext then each added Child/User will continue to be tracked. meaning memory usage will increase
same for the updated user, since you seem to be tracking entities by default
so its possible, with your current approach, to load the entire database into memory multiple times (as many times as you have semi persistent dbcontext instances)
the solution is to dispose the dbcontext earlier
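A minimal sketch of what "dispose earlier" could look like here, assuming the per-target loop shape described above (names are from this thread, the loop body is invented):

```csharp
while (stillScraping)
{
    // One short-lived context per batch: its change tracker never grows past one batch.
    await using var ctx = new BotContext();

    foreach (var child in nextTwentyChildren)  // the ~20 new children per second per target
        ctx.Children.Add(child);

    await ctx.SaveChangesAsync();
    // ctx is disposed at the end of each iteration, tracked entities become collectable
}
```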
makes sense
I will defo try that
what about the CPU usage? Might it be because I save too frequently?
30 Tasks save it each 20 new children
so basically each second it saves 30 times
or isnt that a problem
saving and writing to the database in itself isnt cpu intensive, its IO intensive
but your queries can become expensive, given enough data to process
e.g. by having no or outdated indexes
but then I guess I have the issue that I create them too frequently. Like I would have a new context per task each second, is that whats suggested?
dbcontext is pretty cheap to create
Or a lot of indexes...
so its fine if I have it in a loop but dispose after each iteration
it kinda depends, this seems to enter the realm of mass data processing, which is fine for ef core, it becomes troublesome, when you have to update stuff frequently
consider using executeupdate
or switch to something else like linq2db, that supports merge statements
the updates are very very rare
its just about the amount of data that is a lot
i will try now with the new approach for the context
to see if it fixed both issues
my personal threshold is 1k tracked entities per dbcontext, i will try to use a new one, when i surpass that limit
ok got it
because 1k-2k entities can be processed by ef core with ms sql server in a single savechanges call in a reasonable timeframe
its less for sqlite iirc
this is not a limitation of ef core, but of the db driver only supporting a limited number of query parameters per query
sadly that didnt solve it, the CPU gets to 40% after 5mins
and ram still explodes
its probably time to post some actual code, not just the query bit
is vc possible too or no? because i cant actually share the code
sadly not
This is the critical code that does DB operations
in parameterless ScrapeAsync method there is max 30 tasks happening
the dbcontext isnt being disposed there
in the method I mentioned?
the code, you posted, yes
the one in parameterless scrapeasync needs to be disposed after getting the targets from it?
I thought its not as critical because from that context I just get the targets, nothing else. Its not used for anything else
change tracking works for every entity
unless you disable it
change tracking is whats causing your increasing memory
how do i disable it?
also is this better?
not sure if you meant that
you need change tracking for updates, inserts and deletes
so you cant disable it
bruh
but you can reset the changetracker by creating a new dbcontext
or calling a method on the changetracker itself, but thats slower in my experience than just creating a new dbcontext
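Both options sketched (ChangeTracker.Clear() is real EF Core API since 5.0; which one is faster is the experience claim above, not a guarantee):

```csharp
// Option 1: replace the context entirely (fresh, empty change tracker).
await ctx.DisposeAsync();
ctx = new BotContext();

// Option 2: keep the context, detach every tracked entity.
ctx.ChangeTracker.Clear();
```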
am i not doing it properly right now?
creating new db context each time
yes, i dont know what you changed, but nothing relevant seems to have changed in your code, after i explained your problem and why you have it
I moved the using in scrapeasync with parameters
down into the loop
so it does not stay for long
after each iteration its being disposed
now
or im wrong?
you havent shared code, so i cant comment
I have
here?
or did I miss something
I honestly hope you're not making something nasty here ...
i dont see a using there
no no
Seems like a Discord bot that scrapes stuff somewhere, and considering that you're scraping a large amount of data, I'd first like to know a bit more what you're trying to do.
First, your issue is that you're creating the context within the loop.
You should batch your operations.
Context should be created outside the loop and used all the way until SaveChanges.
I had that before
Typically, you wrap your context within a using block.
people told me not to
they had a bunch of semi persistent instances before
no you did not
i had the contexts before the while(true) loop before
the using statements
Also, the fact that you're using SQLite could be part of your issues.
Especially if there's multiple instances of your app/bot accessing same data/database.
batching with ef core assumes, you create a new dbcontext per batch, anything else is pointless
Because SQLite, even though it does support concurrency, it's not meant to be used for large number of connections.
30 is not rlly large ig
30 what?
30 tasks that do DB operations
30 dbcontext instances that stay around
As long as that amounts to a single DB connection, it's not an issue.
Single file
yeah
You're missing the point.
ok tell me
Number of concurrent connections on your database.
which is another issue, but not too relevant atm
single
sqlite file db does not care about the amount of connections you use, because its single threaded internally anyway
you can have 100 connections to it, but only one will write at any given time
Confused right now. I got told that I should dispose the contexts more frequently and not leave them open for hours, right? I closed them at method end but the method ends after hours/days
yeah thought so
Creating the context should be outside the while loop here.
Ok
the issue still there, I tested with both
after 2-3 mins
the CPU is going nuts
and RAM is just increasing and increasing
in first 1-2 minutes CPU is 1-5%
Without the DB operations, it stays at 1-5%
but never goes up
There can be a lot of reasons for both CPU and RAM usage to keep going up.
yeah but its DB Operations
since without them
no issues
i still suspect the change tracker being at fault. i dont see how keeping the dbcontext around will help with that
Because you're batching stuff and it's kind of normal for RAM usage to keep increasing until a context is actually disposed.
What amount of RAM are we talking about?
the amount just increases and increases, never finds an end.
ok lets forget RAM for now
but what about CPU
Can you tell me what amount of RAM are we talking about, Megabytes, Gigabytes, Terabytes?
ok so without DB Operations it was max 400 Megabyte, never increases as GC did its job.
With DB Operations after 3mins I was at 800MB and it was just constantly increasing
That's really not a lot.
yeah but I plan to run the program 24/7 and since it constantly goes up it will fill the RAM completely within a few hours
As long as that gets disposed after SaveChanges, that's fine.
its not
I call savechanges after each iteration in loop
thats the issue
Well, you don't know that, you're only guessing based on current behavior, which is not a lot of RAM.
in the while true
Just because you dispose an object doesn't mean it gets cleared from ram
yeah im assuming it
It will be cleared when GC kicks in
dispose is usually just for unmanaged memory
that the GC cant collect
Not always...
Yeah, that's part of the problem. You can call SaveChanges after the while loop and EF will batch-process everything at once.
there is no after
its a
while(true)
I mean there is
it exits after hours
via
break
hours / days
Then use a counter and call SaveChanges after a certain number of tasks / whatever have been processed.
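The counter idea, sketched with this thread's names (the batch size and the NextScrapedUser() stand-in are assumptions):

```csharp
const int batchSize = 1000;  // tune this; ~1k tracked entities per context is a common rule of thumb
var pending = 0;
var ctx = new BotContext();

while (true)
{
    ctx.ScrapedUsers.Add(NextScrapedUser());  // stand-in for the scraping step

    if (++pending >= batchSize)
    {
        await ctx.SaveChangesAsync();   // flush the batch
        await ctx.DisposeAsync();       // drop the tracked entities
        ctx = new BotContext();
        pending = 0;
    }
}
```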
after certain number of users added?
if it actually exits, then i say, let it run, until you confirmed memory is an issue
Yeah, I agree with Insire.
ok I can do that for the RAM problem. The CPU problem is like much more of a problem for me
You have to test it for a longer time before making an assumption that there is an actual memory leak
it takes after 5mins 90%+
Then profiling your code is the way to go.
You have to figure out exactly what does that.
Perhaps that is an expected behavior, since you're continuously running the whole thing.
So it's going to run as fast as it can, which means it will use as much of available processing power it has.
im pretty sure you can write this a lot more efficiently with ef core
im new to EF so no clue tbh
I freestyled this
And using the .Equals() is totally wrong here.
Use a normal == comparison.
Unless you're comparing complex objects.
doesnt matter if == or .Equals. I compare two strings
does that actually matter for ef core?
== is overridden in string
it does an equals check
Yes, because it will make it run client-side and will not fully translate to SQL, unless EF Core is clever enough to figure that out.
Always use == unless there's a very specific reason to use .Equals().
hm, i'd expect ef core to generate a warning during runtime for that
Yeah, I would expect the same, but this is SQLite, perhaps the provider does something to mitigate this particular case.
i'll have to check that at work some time
.Equals() is typically used to compare complex objects, not simple data types like strings.
sure
Yes, you can override it to do whatever you want, but that's not the point here.
What exactly do you mean by this?
string type overrides == operator
So?
why would it matter if == or not
same operation
What matters is the .Equals(). EF Core may not be able to properly translate it to a SQL expression that gets run on SQLite.
ah yeah im dumb
its not evaluating on the C# side but generates queries
changed it now 👍🏻
Exactly.
It's called "server-side vs client-side evaluation"
Speaking of connection lifetime,
since this is SQLite, you don't really have to worry about it.
unless its the inmemory variant, then you do
i will do tests and let you guys know the results
This is one simple way to batch 100 tasks at a time.
GC should keep up over time.
You may see an increased memory usage, but it should not go beyond certain threshold.
There are ways to make the GC run more frequently or even at specific times.
There's also a project directive which changes this behavior.
once the entire db is in memory, the memory usage increase will stop :P
You could also not use the change tracker, and instead of loading the entry, editing it, saving changes... use
.ExecuteUpdateAsync()
Try adding it in your .csproj file, in the top PropertyGroup section.
If the EF version used allows it
they do inserts as well, so they are not getting away from the changetracker
and inserts seem to be majority of actions
i will try this
thanks
yeah
should I dispose DbContext too and create a new one?
because currently my contexts are open for hours before disposal
around 30 contexts
Are you saying that you have 30 contexts running in parallel?
And to answer your question, no, you should not dispose of DbContexts manually; that's why you're wrapping them in a using() block, to have that done for you.
Best I can say right now - I think you have to rethink your whole workflow.
And before you do that, try to learn a bit more about a few other things
- How SQLite works
- How EF Core DbContext works
- DbContext lifetime
- How things work when using await / async
for async things im pretty good at it already.
But yeah for EF im missing exp
i will solve it and post solution here
this is the critical part
even without savechangesasync it does go up to 100% cpu
after a while
Ideally, if you can, share a GitHub repo for your project. That way we can have a better picture of what's going on in your entire app, because the fact that you're experiencing this when adding database operations still doesn't mean it's EF Core's issue.
I would but I cannot as its NDA partially
if you are down, im open to VC
thats the part where change tracking happens
yeah. So I excluded basically everything, even savechanges.
its this part that causes trouble
if (existingUser != null)
thats the only line where nothing is tracked
how does it track property changes? im curious. Did it inject some kind of CIL code into the setter method?
The moment you get a reference to existingUser it becomes tracked.
Change tracking is the default behavior of EF Core.
yeah I got that
but I just want to know how it tracks the changes
internally
not that its important here
not really
For that, you have to read the docs.
just curious
kk
yeah so this is fucked but i dunno how
and why
how can I see sql queries generates from efcore?
maybe its doing dumb stuff
Change Tracking - EF Core
Overview of change tracking for EF Core
ok just tested, even without the if(existingUser != null) case
its still going nuts
so its this
var existingUser = await ctx
.ScrapedUsers
.FirstOrDefaultAsync(x => x.RestID == scrapedUser.RestID);
or
ctx.ScrapedUsers.Add(scrapedUser);
for CPU im guessing first
Add does nothing particularly useful
all it does is change the entry state to "Added"
so its this
var existingUser = await ctx
.ScrapedUsers
.FirstOrDefaultAsync(x => x.RestID == scrapedUser.RestID);
(i mean its still useful, just not important in terms of cpu time)
well, it also needs to check whether that entity is already being tracked
wouldn't account for a 90% cpu jump 😆
all it does is look at the primary key to see if its set
no database roundtrips
is that an assumption, or did you actually go and look at the code?
It says it in the docs, give me a minute
its like after 5min
because that explanation does not explain the error ef core throws, when you try to track another entity with the same primary key
in one minute it goes like 3% then 11% then 9% then 3% but after sometime it just explodes
just hotspot profile your application
with a snapshot
i will check, never done profiling tbh
DbContext.Add and DbSet<TEntity>.Add do not normally access the database, since these methods inherently just start tracking entities. However, some forms of value generation may access the database in order to generate a key value. The only value generator that does this and ships with EF Core is HiLoValueGenerator<TValue>. Using this generator is uncommon; it is never configured by default. This means that the vast majority of applications should use Add, and not AddAsync. Other similar methods like Update, Attach, and Remove do not have async overloads because they never generate new key values, and hence never need to access the database.
thats not really an answer to my question. it says start tracking, it does not confirm, whether they only check the primary key for it holding the default value
It doesn't. You can easily test it
if you create a new model, set an id to the existing one and "add" then "savechangesasync" it will throw exceptions at you
(at the savechangesasync line)
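That experiment, sketched (assuming an Id key on ScrapedUser; where exactly it throws depends on whether the first instance is still tracked):

```csharp
var existing = await ctx.ScrapedUsers.FirstAsync();

var duplicate = new ScrapedUser { Id = existing.Id };  // same key, new instance
ctx.ScrapedUsers.Add(duplicate);
// If 'existing' is still tracked, Add throws InvalidOperationException here;
// otherwise SaveChangesAsync fails on the database's key/unique constraint.
await ctx.SaveChangesAsync();
```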
i know it does. but you claimed otherwise before and i was asking whether that was an assumption, or if you did look at the code
I didn't claim otherwise..? I said it shouldn't account for a 90% cpu jump because it literally doesn't check the database
idk, it sounds like you did
especially the last part does
i meant
it looks inside the class
not inside the database
my bad
i mean sure, db lookups are IO work, not CPU work
it looks if the property that you have set as a key (with fluent api or annotation) has a value or not
anyways
at the end of the day
it doesn't do DB Trips
in general it doesnt, you are right. there are exceptions tho
add just checks inside its change tracker dictionary if there's another entity with the same id (in which case it throws an exception), otherwise it marks the object as "Added" and it will be inserted at "SaveChangesAsync()"
however
but i digress
if you do this:
if the item and item2 coincidentally end up with the same ID
its gonna throw
yea
having a single context for him isn't a big issue
however yeah ofc the RAM will keep growing
the change-tracker is gonna grow as long as theres new "users" (or whatever) being added
I'd definitely change ScrapeFollowersFromEntries to not receive a dbcontext, and instead create one inside
then all I can recommend is hotspot profiling
thats what i have been saying, but
¯\_(ツ)_/¯
create and dispose a dbcontext every 1k entries being added
why not every function call?
its still gonna throw exceptions
because while dbcontext creation is cheap, its still not free
at this point if you got that much function calls, just switch to plain SQL with Dapper
i dont think dapper is an alternative to ef core
or execute plainsql with efcore and forget about change-tracker
dapper is a micro orm, not a full orm
if you wanted a micro orm, you wouldnt choose ef core
idk man I got like 3 diff opinions on this. Some said its fine to let it just dispose after the loop that takes hours /days
and you say I should dispose it after each 1k
¯\_(ツ)_/¯
there's no answer, you have to test it
it depends on a lot of factors
you get to choose
if you keep 1, you gonna have all the objects in memory
(not instantly, but as you query and add items, they're kept in memory)
all i know is that i have been doing pretty much exactly what you are doing for the past 1-2 years at work, and i am pretty confident that i know the limits of ef core in this regard by now
its the firstordefault
and if you actually want a replacement for ef core, then you can look at something like linq2db for mass updates and inserts - although i just recently started using that, so im not an expert there
i love efcore
but this kinda got annoying
do you still have the equals or did you change it?
changed
var existingUser = ctx
.ScrapedUsers
.FirstOrDefault(x => x.RestID == scrapedUser.RestID);
otherwise, is your sqlite table optimized?
is the RestID indexed?
can you elaborate?
what does it mean to be indexed
did you add an index to the database column, for that table
in sqlite
EXPLAIN QUERY PLAN SELECT * FROM ScrapedUsers WHERE RestId = ?;
run this manually
with a valid restid
and show the output
sqlite creates the table for me
efcore*
bruh
you can still define indexes
basically if you dont have an index
and you request an object from sql
it will scan through the whole table
to find what you want
an "index" works like a hashmap (not exactly)
which is way faster
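An index on RestID can be declared in the model (a sketch using this thread's names; applying it needs a migration or recreating the database):

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Index so the FirstOrDefault lookup on RestID stops scanning the whole table.
    modelBuilder.Entity<ScrapedUser>()
        .HasIndex(u => u.RestID)
        .IsUnique();  // assumption: RestID should be unique, per the later discussion
}
```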
ah bro
that might be the issue then
yeah you're putting too much pressure on SQLite
this seems fine then now?
yeah
is the RestID the primary key for ScrapedUser?
if so, its already indexed
its not actually
I should have done that as primary key
as it should be unique
that's another discussion lol
yeah better to have a separate PK
anyways, just add the index for now just to test if it's going to work
yea, just because a column should only contain unique values, does not mean it has to be a primary key column (although its a strong indicator)
testing rn
after 5min
5% cpu
thats very good
will keep it running
cpu is fixed
now checking RAM
so less of an ef-core issue and more of a database table optimization issue 😆
i'd advise you to read up on those topics, as they are very important
😄
also congrats on learning how to profile
there's also profiling for RAM usage.. to see which objects take memory
yeah right, missing exp from my side lol
but guys
im somehow still leaking memory
even though im disposing the ctx after each 10th iteration
what can be the issue?
like after 1hr approx its 5GB
and it increases
Also I get a unique constraint failed on RestID here?
how is it even possible
do you still have 30 dbcontext instances?
yeah
then how do you make sure that they dont try to insert a new user for the same id?
i need them due to concurrency
otherwise exception
hm fuk
yes
do I need to semaphore lock?
this region
you could, but you'll also kill whatever advantage you gained by having 30 dbcontext instances
i need them otherwise I get exceptions
concurrency exceptions
then maybe have one dbcontext instead
then you dont need locks
but then I have ram issue as I cannot dispose it
?
one at a time
yeah
still even though I made a simple lock here
or, well you could just accept that, and add retry logic
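The retry approach, sketched (names from this thread; the detach-and-update fallback is one possible shape, and the exception inspection is simplified):

```csharp
try
{
    ctx.ScrapedUsers.Add(scrapedUser);
    await ctx.SaveChangesAsync();
}
catch (DbUpdateException)  // e.g. UNIQUE constraint failed: ScrapedUsers.RestID
{
    // Another context won the race: stop tracking our copy and update the winner instead.
    ctx.Entry(scrapedUser).State = EntityState.Detached;

    var existing = await ctx.ScrapedUsers
        .FirstAsync(x => x.RestID == scrapedUser.RestID);
    // copy the scraped properties onto 'existing' here, then:
    await ctx.SaveChangesAsync();
}
```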
nah im not that dirty
choices 💜
efcore gives me cancer
slowly
thats how it is for any framework, that you dont know
you'll have the same problem with any relational database
and maybe nosql databases too
this seems so illogical to me
xd
ok I understood the problem I guess. In another context I add the scrapeduser but I didnt save it yet, meaning in another context I cannot find the entry as its not in the db and not in that context yet
so yeah one context is the way
imagine having 2 dbcontext instances at the same time, both are asked for the same user id, both say they aint have it
then you try to write both of them to the db
i said, that only one thread writes at any time to the db
so one write succeeds
the other then fails, because the user already exists in the db
pretty simple imo
yeah i solved it
this explains it basically
and the statement with the thread, dont take it too seriously. there is still synchronization, but not necessarily on the same thread
not after 18hours work
my brain is on 30%
¯\_(ツ)_/¯
take a break
get some sleep
yeah now since it works
haha
Was this issue resolved? If so, run
/close
- otherwise I will mark this as stale and this post will be archived until there is new activity.