❔ Is this usage of Tasks equivalent to Threads?
I have implemented this using Tasks.
My goal is to load the sessions in parallel since they are not dependent on each other. I could have used threads, but I came up with this solution instead. See how I do not await CreateSessionAsync, I await all tasks at the end. I could have also used Threads to split the session loading.
Question is now, is there a major difference?
Or does this all take place in one thread?
logically, they are equivalent, yes
you can think of async/await code as executing on a logical thread
it behaves, logically, like a thread
one statement follows the next, sequentially, and the caller doesn't continue until you return
the difference is that your logical thread is actually going to be executing piecemeal, upon any number of physical threads
awaiting is the equivalent to blocking, on a logical thread
the logic can't continue until the Task that you're awaiting completes
but unlike physical blocking, await does not block any physical threads
it takes the current "state" of the method, bundles it up in a box, and dumps it on the heap, so it can be retrieved later
it "suspends" the method so it can later be "resumed"
that way, the physical thread that was being used to execute that method can now be freed up to do something else
that's the other magic to realize
async/await usually operates under the umbrella of a SynchronizationContext, which can be thought of as a type of scheduler (if there isn't one, continuations go to the default TaskScheduler, i.e. the thread pool)
when an awaited Task completes, it grabs the state of the suspended method off the heap and sends it to the SynchronizationContext to be resumed
the context will pick an available thread to execute it on
or whatever
Yeah I think I get the concept pretty much
Thanks for the write-up
it'll schedule the method to start back up again based on whatever criteria it wants and whatever resources it has access to
all that to say
the advantage of async/await is to give you all the logical benefits of creating and executing many threads at once, without the large overhead that physical threads have
rather, without the large overhead of frequently creating and destroying physical threads
this can get you both logical and actual parallelism
depending on the context
your approach of using Task.WhenAll() looks spot-on
although it's quite tough to read on mobile
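The code from the original post did not survive in this log, so here is a minimal sketch of the pattern being described: start every call without awaiting it, then await them all once with Task.WhenAll(). The Session class, the integer ids, and the Task.Delay body are assumptions made up for illustration; only the names CreateSessionAsync and Task.WhenAll come from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical session type, only here so the sketch compiles.
public sealed class Session
{
    public int Id { get; }
    public Session(int id) => Id = id;
}

public static class SessionLoader
{
    // Hypothetical stand-in for the real CreateSessionAsync: the delay simulates I/O.
    public static async Task<Session> CreateSessionAsync(int id)
    {
        await Task.Delay(100);
        return new Session(id);
    }

    public static async Task<Session[]> LoadAllAsync(IReadOnlyList<int> ids)
    {
        // Start every call without awaiting it; each call hands back a Task.
        var tasks = new Task<Session>[ids.Count];
        for (var i = 0; i < ids.Count; i++)
            tasks[i] = CreateSessionAsync(ids[i]);

        // Await once, at the end. Results are returned in the same order as the tasks.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var sessions = await LoadAllAsync(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", sessions.Select(s => s.Id))); // prints 1, 2, 3
    }
}
```

Because the three delays overlap, the whole load should take roughly as long as one call rather than three, without creating any threads by hand.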
when you call an async method, that method executes synchronously, on your thread, until it hits an await on something that hasn't already completed
at that point it suspends and returns a Task for completion, and you move on to your next method call
in a logical sense, the method you called is still "running" in that it hasn't completed, so you logically have it running in parallel with your calling method
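A small sketch of that behaviour, under made-up names (SuspendDemo, WorkAsync, and the gate are all hypothetical): each call runs synchronously up to its first incomplete await, returns a Task, and only resumes once the awaited task completes. A TaskCompletionSource is used instead of a timer so the ordering of the first three log lines does not depend on timing.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SuspendDemo
{
    private static readonly object Sync = new object();
    public static readonly List<string> Log = new List<string>();

    private static void Write(string line)
    {
        lock (Sync) Log.Add(line);
    }

    public static async Task WorkAsync(string name, Task gate)
    {
        Write($"{name}: started");  // runs synchronously, on the caller's thread
        await gate;                 // gate is not complete yet, so the method suspends here
        Write($"{name}: resumed");  // runs later, on whatever thread the scheduler picks
    }

    public static async Task RunAsync()
    {
        var gate = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        Task a = WorkAsync("A", gate.Task);   // returns as soon as it hits its await
        Task b = WorkAsync("B", gate.Task);
        Write("caller: both calls returned");

        gate.SetResult(true);                 // let both suspended methods resume
        await Task.WhenAll(a, b);
    }

    public static async Task Main()
    {
        await RunAsync();
        foreach (var line in Log) Console.WriteLine(line);
    }
}
```

The first three lines are always "A: started", "B: started", "caller: both calls returned". The two "resumed" lines follow in whichever order the thread pool happens to run them.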
but physically, the other method isn't running, it's suspended
I'm not sure but I think you can improve your code by removing the ToList() call. You don't gotta materialize everything in memory before you do the work tho @Kroks Just use a list instead of that array so you don't need the Count property.
does that difference matter at all? probably not
was that directed towards me?
no
kk
but kinda, though?
it depends what GetAllSessionsAsync() returns, but yeah, I agree that the .ToList() looks sussy
and also, it probably doesn't make any practical difference
I am aware of the List allocation, but I rlly dont care because I call it only once so should not make a difference imo. Issue is that I am using Dapper, Dapper returns an IEnumerable, and I think the .Count() method is slow since it needs to iterate over everything (but you can correct me)
I expect GetAllSessionsAsync() to return IQueryable<T> here
doesn't look like it, not if it's being awaited
oh right
might be IAsyncEnumerable then. But especially then he should be processing one item at a time
and not load all into memory
you're correct that you should materialize an IEnumerable if you intend to refer to its length, as well as iterate it
what do you mean by materialize?
.ToList()
Ah ok
or .ToArray()
or whatever
and many more
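To illustrate why materializing matters, here is a minimal sketch with a hypothetical lazy source (LazyRows and the RowsProduced counter are invented for the demo). Calling Count() and then iterating walks a lazy sequence twice, while ToList() walks it once and keeps the items. Note that Enumerable.Count() skips the walk when the sequence is already a collection (implements ICollection<T>), so the cost only shows up for genuinely lazy sources.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    public static int RowsProduced;

    // Hypothetical lazy source: every enumeration re-runs this iterator from the top.
    public static IEnumerable<int> LazyRows()
    {
        for (var i = 1; i <= 3; i++)
        {
            RowsProduced++;
            yield return i;
        }
    }

    public static void Main()
    {
        IEnumerable<int> lazy = LazyRows();
        int count = lazy.Count();              // walks the whole sequence once
        foreach (var row in lazy) { }          // walks it a second time
        Console.WriteLine($"{count} rows, produced {RowsProduced} times"); // 3 rows, produced 6 times

        RowsProduced = 0;
        List<int> rows = LazyRows().ToList();  // walks it once and keeps the items
        int cached = rows.Count;               // property read, no enumeration
        foreach (var row in rows) { }
        Console.WriteLine($"{cached} rows, produced {RowsProduced} times"); // 3 rows, produced 3 times
    }
}
```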
I need the length in order to allocate the Task array, but I could use a list of tasks too. But at the end it shouldnt rlly matter, that function gets called only once
Flroian is also correct that the ideal should be to have your data access layer return IAsyncEnumerable<T>
then you can just do...
never used AsyncEnumerable
im like coding the first time rlly with async stuff
or something like that
not sure if the usage of an AsyncEnumerable is really possible (or I am misunderstanding the concept behind it), the DB.GetAllSessionsAsync fetches data from a SQL database
what does that return
IEnumerable
no
is that an EF/EFCore dbcontext?
it can't possibly return that if you're awaiting it
have it return IQueryable<T> then
he already said it's dapper
oh
never used dapper, dont know it
me neither
sec
Ah
yeah its a task
of course
yeah
so
this is saying "I'm going to retrieve all the records from the database, and I'll hand them all to you when I'm done"
which is a design flaw on the part of Dapper, as far as I'm concerned
Dapper HAS to build a List<T> or T[] internally, it shouldn't then hide that from you
unless Dapper is doing something WAY worse, and mixing async and blocking
so that they DON'T actually wait for all the records to be retrieved before returning,
they return as soon as they can, and then make the IEnumerable block on .MoveNext()
for my own sanity, I have to assume not
having to wait for all of the records to be pulled into memory before you can start your own processing is inefficient
they should have used IAsyncEnumerator<T> and MoveNextAsync() tho
uhm
yes
that's what IAsyncEnumerable has
if that were to return just IAsyncEnumerable, that would be saying "I am going to retrieve all these records for you, and I'll hand over each one as soon as I get it"
so, with the right usage of that construct, you could be having your CreateSessionAsync() calls starting right-away, even while more records are still coming from the database
and then, same deal, you would just not await those calls, and instead collect all those Tasks to pass to Task.WhenAll() at the end
essentially
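A rough sketch of that idea, assuming a data source that exposes IAsyncEnumerable<T>. EnumerateSessionIdsAsync, the integer ids, and the string results are hypothetical stand-ins, and the delays only simulate a database read and session creation. Each row starts its CreateSessionAsync call as soon as it arrives, and the tasks are awaited together at the end.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamingLoader
{
    // Hypothetical data source: yields each row as soon as it has been "read".
    public static async IAsyncEnumerable<int> EnumerateSessionIdsAsync()
    {
        for (var id = 1; id <= 3; id++)
        {
            await Task.Delay(20);   // stands in for reading one row from the database
            yield return id;
        }
    }

    // Hypothetical stand-in for CreateSessionAsync.
    public static async Task<string> CreateSessionAsync(int id)
    {
        await Task.Delay(100);
        return $"session-{id}";
    }

    public static async Task<string[]> LoadAllAsync()
    {
        var tasks = new List<Task<string>>();

        await foreach (var id in EnumerateSessionIdsAsync())
        {
            // Start the work for this row right away, but do not await it here,
            // so the next row can be fetched while this session is being created.
            tasks.Add(CreateSessionAsync(id));
        }

        // Await everything once all rows have been handed over.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var sessions = await LoadAllAsync();
        Console.WriteLine(string.Join(", ", sessions)); // prints session-1, session-2, session-3
    }
}
```

Collecting into a List<Task<T>> also removes the need to know the row count up front, which was the reason for materializing the result earlier in the thread.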
Yeah I see, weird then
That dapper does not use asyncenumerables
maybe there is a better alternative to dapper then, but dapper is convenient af
there's definitely good alternatives to Dapper
Dapper has the advantage of being simple
bit confused on why dapper does it like this tho
first pic is in QueryAsync method
thats true
oh dear lord
it is, in fact, returning an IEnumerable<T> that blocks
at least, in some scenario
if you specify that the command should not be buffered
ugh
that's such an awful thing to allow consumers to do
Can you explain why they cannot use ReadAsync when using an IEnumerable?
dont get it
i think thats default xd
actually its buffered by default
otherwise that would be horrible wtf
yeah
so
ReadAsync() is an async method
IEnumerator.MoveNext() is not
so
you can't call async methods in sync code, without blocking
at least
not if you care about the return value from the method
guess AsyncEnumerables werent a thing when Dapper was created, seems like it is not rlly maintained anymore
I will search for an alternative maybe
yes, they're relatively new
I mean, there's nothing wrong with using it
even the blocking isn't REALLY all that bad
it's a performance issue, like any other
and it's one of those that can have a significant impact, and its very easy to avoid, so general wisdom is to just avoid it
but general wisdom for any performance issue is also "build now, optimize later"
and only when you have a real, demonstrable, performance issue, not just a worry
if you're just making dumb little stuff for yourself, buffering all the records into memory is about as significant of a performance hit as a blocking IEnumerable<>
which is to say, negligible
might have some own implementation now, tho not sure if this is optimal
nah, that looks right
as far as the async use, anyway
Connection.OpenAsync() could be a problem, if the connection is already open
I close it after each query, using should dispose it
but
since im working async
it might be an issue yeah
and I mean, me personally, I would name it AsyncEnumerateEntries(), but yeah
or at least EnumerateEntriesAsync()
not sure if its a bad practice to just let it be opened
the fact that you close the connection after every single query is probably an issue in itself
it depends how long it stays open
until the application is closed
by the user
that would be bad
general practice is
create a new connection per "high-level action"
if you want something a little more specific, maybe
per "time period when the user is not in control"
so, like
if the user clicks a button
that might trigger one database query
that might trigger several
before the user gets control back to click the next button
maybe the button was an "Open" button
which means you might do a query to save some existing records first, then another to retrieve new records
or maybe it's a "Create" button, which involves inserting records into multiple tables
now, on top of that
it's also common to implement connection pooling, under the hood
where you create a new Connection() object each time you need one, as described above
and that may or may not reuse an actual connection to the database that was already open
and that's usually configurable, to say things like
"never open more than X connections at once"
and "close pooled connections after Y minutes of idle time"
I see. Makes sense
btw this is the preferred way now?
pretty much, yeah
you could be clever and do that with Linq extensions
but functionally, that's about as optimal as you can get with this
I see, good to know. Pretty useful to do things just in time instead of waiting for all rows to be parsed and then reiterate the entire list again
No useless allocation
@AnievNekaj I came up with this:
I wonder if we can combine the pros of your code (processing one item from db at a time) with the pro of my code (parallelizing the creation of sessions) and how that would look?
actually cant we just do this and it does the same thing?
What is the advantage here? I know you said "parallelizing the creation of sessions" but since im not awaiting the Tasks sequentially isn't it some kind of parallelism already?
ah nvm I see now
So basically the loop is not sequential in your example
but yeah then I guess you cannot combine it with AsyncEnumerable properly