❔ Optimizing EF query
I'm learning ASP.NET and C# and playing around with a "Spreadsheet" application. I have the following models:
The short of it is that a
Cell
belongs to a Row
and a Column
.
The Cell
also contains a string Value
and a method GetValueAsType
which looks up the Column.Type
and tries to parse the string into the correct value and returns it.
When I'm querying all my Rows
and Include
ing the Cells
, I get an Object reference error when I try to Cell.GetValueAsType
, I assume because the Column
isn't in memory to reference.
I could ThenInclude
the Column
when getting the Cells
, but it makes the query extremely inefficient because I'm fetching the same few columns from the database over and over for every cell.
Anyway here's my question:21 Replies
I found that if I just fetch the Spreadsheet's columns before performing my
Row
query, I do not get the object reference error. This seems like a hacky workaround. Is it?
I tried a bunch of ways to include the Columns in my Rows query but I wasn't sure how to get it to work properly (I kept converting the query in a Column or Cell query, or I'd lose my pagination based on rows (it'd paginate based on columns or cells) , etc)
That's where I'm at right now
So for example if I try:
It looks like I convert them and paginate Rows
correctly, but start getting the object reference error again
I know this is long winded, but I suppose to summarize: I do not want to get the Column
for every single Cell
. I want to just get the Columns
one time up front so that I can call the Cell.GetValueAsType()
which needs the Column
in memory to know what type the column isso, we're saying that
is inefficient?
I'm guessing it's doing N+1 querying for the columns?
have you verified this via logs, or are you just assuming?
like, your end goal here is that you want one EF query that will translate to
(order notwithstanding)
and then everything gets collated together in .NET
yes?
i would expect it to just join, the query is dead simple
my question is why does it sound like you're using a database to create a database?
if all cells in a column have the same type that's just a database table
uhh
that's not how JOIN works in EF Core
.Include()
does not translate to a JOIN
for one-to-many relationships
it translates to 2 queries
one for the parents, one for the children
and collates them in .NETi don't have logs handy but that doesn't sound like the queries i get for what i'm doing at work
could be thinking of something else, it's friday evening and my brain is done for the week
fair enough
lemme see if I can confirm
alright, let's see if I remember how to use the ef designer tools
damn, probably need to start with re-installing postgres
alright, so fixed is correct
at least for some scenarios
maybe it's a versioning difference too, I've been out of the loop on EF Core since 5.0
or maybe it depends on schema complexity
but for the following schema
this query
compiles to
and the inverse query
compiles to
so, bottom line
what is your issue @schwartzmj ?
Yes, the query is 350ms by doing
.Include(cell => cell.Column)
and ~16ms if I load the columns upfront and do not do the include
it's more like the user is creating their own columns of their own types. but yes i had the same thought about "using a database to create a database" 😆i mean, if that's the goal of the application you should ditch EF and use something like dapper because you don't have a well defined (as in, entirely user defined) model
I can certainly look into that. I'm very green to C# and of course ASP.NET
dapper is essentially raw SQL queries with some minimal mapping functionality
while EF gets upset easily if you don't follow its conventions
sounds like what you might want instead is a projection
if you wanna stick with EF
you don't need to pull the ENTIRE column entity, just the one Type value
for each cell
if your query is just compiling to one giant JOINed SQL query, then the difference in time is probably just all the duplicated data
limit that to just the Type column with a projection
the
type
is on the Column, so what i want to do is just grab the columns upfront and ignore them on the Cell level. i guess my current method of "get columns, ignore columns when getting rows/cells" works, it just seemed odd that it did workit's working because you're doing those all as tracking queries
so all those column entities are being tracked in memory
when a Cell comes in, EF recognizes that it has the Column for that in memory already and fills it in, even if you didn't explicitly request it on that query
ok cool. that's what i assumed and what i half-confirmed in my head reading some docs. just didnt want some bug or edge case to pop up. also was very curious on other ways of doing it
I'm not at my work computer right now, but later I plan on reading through these comments more in depth. I very much appreciate the help given so far
also was very curious on other ways of doing itprojected queries
have some of the docs up for that right now 🙂
unless you do
AsSplitQuery()
it will translate into joinswell, there you go
that could be an easy fix
force EF to do the multiple-queries approach, and eliminate redundant data in the transfer
do you mean the opposite of what you said?
cause testing seems to indicate the default behavior is to use JOINs
and the name
.AsSplitQuery()
would suggest to me that it's going to... split the query
docs seem to support this
or
okay, I see what you meandoes adding a comma after
AsSplitQuery()
clear up the message? Yes it will not do joins adding it inI misread that a bit
yeah
Was this issue resolved? If so, run
/close
- otherwise I will mark this as stale and this post will be archived until there is new activity.