C#15mo ago
Kroks

❔ Is this usage of Tasks equivalent to Threads?

I have implemented this using Tasks. My goal is to load the sessions in parallel since they are not dependent on each other. I could have used threads, but I came up with this solution instead. See how I do not await CreateSessionAsync, I await all tasks at the end. I could have also used Threads to split the session loading. Question is now, is there a major difference?
65 Replies
Kroks
Kroks15mo ago
private static async Task CreateSessionAsync(SessionModel session)
{
    var sess = new Session();
    await Task.WhenAll(sess.LoadAccountsAsync(session.Accounts), sess.LoadProxiesAsync(session.Proxies));
    sess.DelaySettings = session.DelaySettings;
    _sessions[session.Name] = sess;
}

public static async Task<ICollection<Session>> LoadSessionsAsync()
{
    var sessions = (await DB.GetAllSessionsAsync()).ToList();

    var tasks = new Task[sessions.Count];
    for (var i = 0; i < sessions.Count; i++)
    {
        var session = sessions[i];
        tasks[i] = CreateSessionAsync(session);
    }

    await Task.WhenAll(tasks);
    return _sessions.Values;
}

Or does this all take place in one thread?
JakenVeina
JakenVeina15mo ago
Logically, they are equivalent, yes. You can think of async/await code as executing on a logical thread. It behaves, logically, like a thread: one statement follows the next, sequentially, and the caller doesn't continue until you return. The difference is that your logical thread is actually going to be executing piecemeal, upon any number of physical threads.
Awaiting is the equivalent of blocking, on a logical thread: the logic can't continue until the Task that you're awaiting completes. But unlike physical blocking, await does not block any physical threads. It takes the current "state" of the method, bundles it up in a box, and dumps it on the heap, so it can be retrieved later. It "suspends" the method so it can later be "resumed". That way, the physical thread that was being used to execute that method can now be freed up to do something else.
That's the other magic to realize: async/await operates under the umbrella of a SynchronizationContext (or, when there is none, the default thread-pool scheduler), which can be thought of as a type of scheduler. When an awaited Task completes, it grabs the state of the suspended method off the heap and sends it to that context to be resumed. The context will pick an available thread to execute it on, or whatever.
Kroks
Kroks15mo ago
Yeah I think I get the concept pretty much Thanks for the write-up
JakenVeina
JakenVeina15mo ago
It'll schedule the method to start back up again based on whatever criteria it wants and whatever resources it has access to. All that to say: the advantage of async/await is to give you all the logical benefits of creating and executing many threads at once, without the large overhead that physical threads have. Or rather, without the large overhead of frequently creating and destroying physical threads. This can get you both logical and actual parallelism, depending on the context.
Your approach of using Task.WhenAll() looks spot-on, although it's quite tough to read on mobile. When you call an async method, that method executes synchronously, on your thread, until it hits an await (of something that hasn't already completed). At that point it suspends and returns a Task for completion, and you move on to your next method call. In a logical sense, the method you called is still "running", in that it hasn't completed, so you logically have it running in parallel with your calling method. But physically, the other method isn't running, it's suspended.
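A minimal sketch of the "runs synchronously until the first await, then suspends" behaviour described above. The names SuspendDemo, WorkAsync and gate are made up for illustration, and the TaskCompletionSource only stands in for some I/O that has not finished yet.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SuspendDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        // Hypothetical stand-in for pending I/O: it completes only when we say so.
        var gate = new TaskCompletionSource<bool>();

        async Task WorkAsync()
        {
            // Runs on the caller's thread, before WorkAsync() returns its Task.
            log.Add("work: started synchronously");
            await gate.Task; // not complete yet, so the method suspends here
            log.Add("work: resumed");
        }

        Task work = WorkAsync(); // not awaited: we get the Task back at the first await
        log.Add("caller: continues while work is suspended");

        gate.SetResult(true); // the "I/O" finishes, so WorkAsync can be resumed
        await work;
        return log;
    }
}
```

The log always ends up in the order started, caller continues, resumed. No extra thread was created to get that interleaving.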
Florian Voß
Florian Voß15mo ago
I'm not sure but I think you can improve your code by removing the ToList() call. You don't gotta materialize everything in memory before you do the work tho @Kroks. Just use a list instead of that array so you don't need the Count property.
JakenVeina
JakenVeina15mo ago
does that difference matter at all? probably not
Florian Voß
Florian Voß15mo ago
was that directed towards me?
JakenVeina
JakenVeina15mo ago
no
Florian Voß
Florian Voß15mo ago
kk
JakenVeina
JakenVeina15mo ago
But kinda, though? It depends what GetAllSessionsAsync() returns, but yeah, I agree that the .ToList() looks sussy. And also, it probably doesn't make any practical difference.
Kroks
Kroks15mo ago
I am aware of the List allocation, but I rlly dont care because I call it only once so should not make a difference imo. Issue is that I am using Dapper, Dapper returns an IEnumerable, and I think the .Count() method is slow since it needs to iterate over everything (but you can correct me)
Florian Voß
Florian Voß15mo ago
I expect GetAllSessionsAsync() to return IQueryable<T> here
JakenVeina
JakenVeina15mo ago
doesn't look like it, not if it's being awaited
Florian Voß
Florian Voß15mo ago
oh right might be IAsyncEnumerable then. But especially then he should be processing one item at a time and not load all into memory
JakenVeina
JakenVeina15mo ago
you're correct that you should materialize an IEnumerable if you intend to refer to its length, as well as iterate it
Kroks
Kroks15mo ago
what do you mean by materialize?
JakenVeina
JakenVeina15mo ago
.ToList()
Kroks
Kroks15mo ago
Ah ok
JakenVeina
JakenVeina15mo ago
or .ToArray() or whatever
Florian Voß
Florian Voß15mo ago
and many more
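To illustrate why materializing matters when you need both the length and the items, here is a small sketch. Rows() is a hypothetical deferred sequence (not Dapper's actual result type) that counts how many times a row is produced.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    public static (int DeferredReads, int MaterializedReads) Run()
    {
        var reads = 0;

        // Hypothetical lazy source: every enumeration produces the rows again.
        IEnumerable<int> Rows()
        {
            for (var i = 0; i < 3; i++)
            {
                reads++;
                yield return i;
            }
        }

        // Deferred: Count() walks the whole sequence, then foreach walks it again.
        var deferred = Rows();
        var total = deferred.Count();
        foreach (var row in deferred) total += row;
        var deferredReads = reads; // rows were produced twice

        // Materialized: ToList() walks it once; Count and foreach use the list.
        reads = 0;
        var list = Rows().ToList();
        total = list.Count;
        foreach (var row in list) total += row;
        var materializedReads = reads; // rows were produced once

        return (deferredReads, materializedReads);
    }
}
```

With three rows, the deferred path produces six reads and the materialized path three. If the source is already a List<T> or an array under the hood, Count() is cheap and the difference disappears.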
Kroks
Kroks15mo ago
I need the length in order to allocate the Task array, but I could use a list of tasks too. But at the end it shouldnt rlly matter, that function gets called only once
JakenVeina
JakenVeina15mo ago
Florian is also correct that the ideal should be to have your data access layer return IAsyncEnumerable<T>. Then you can just do...
Kroks
Kroks15mo ago
never used AsyncEnumerable im like coding the first time rlly with async stuff
JakenVeina
JakenVeina15mo ago
Task.WhenAll(myAsyncEnumerable
    .SelectMany(...))
or something like that
Kroks
Kroks15mo ago
not sure if the usage of an AsyncEnumerable is really possible (or I am misunderstanding the concept behind it), the DB.GetAllSessionsAsync fetches data from a SQL database
JakenVeina
JakenVeina15mo ago
what does that return
Kroks
Kroks15mo ago
IEnumerable
JakenVeina
JakenVeina15mo ago
no
Florian Voß
Florian Voß15mo ago
is that an EF/EFCore dbcontext?
JakenVeina
JakenVeina15mo ago
it can't possibly return that if you're awaiting it
Florian Voß
Florian Voß15mo ago
have it return IQueryable<T> then
JakenVeina
JakenVeina15mo ago
he already said it's dapper
Florian Voß
Florian Voß15mo ago
oh never used dapper, dont know it
JakenVeina
JakenVeina15mo ago
me neither
Kroks
Kroks15mo ago
sec
Kroks
Kroks15mo ago
Kroks
Kroks15mo ago
Ah yeah its a task of course
JakenVeina
JakenVeina15mo ago
Yeah, so this is saying "I'm going to retrieve all the records from the database, and I'll hand them all to you when I'm done". Which is a design flaw on the part of Dapper, as far as I'm concerned. Dapper HAS to build a List<T> or T[] internally, it shouldn't then hide that from you. Unless Dapper is doing something WAY worse, and mixing async and blocking, so that they DON'T actually wait for all the records to be retrieved before returning: they return as soon as they can, and then make the IEnumerable block on .MoveNext(). For my own sanity, I have to assume not.
Having to wait for all of the records to be pulled into memory before you can start your own processing is inefficient.
Florian Voß
Florian Voß15mo ago
they should have used IAsyncEnumerator<T> and MoveNextAsync() tho
JakenVeina
JakenVeina15mo ago
Uhm, yes, that's what IAsyncEnumerable has. If that were to return just IAsyncEnumerable, that would be saying "I am going to retrieve all these records for you, and I'll hand over each one as soon as I get it". So, with the right usage of that construct, you could have your CreateSessionAsync() calls starting right away, even while more records are still coming from the database. And then, same deal, you would just not await those calls, and instead collect all those Tasks to pass to Task.WhenAll() at the end. Essentially:
var tasks = new List<Task>();
await foreach (var session in AsyncEnumerateAllSessions())
    tasks.Add(CreateSessionAsync(session));
await Task.WhenAll(tasks);
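A self-contained sketch of the same pattern, for anyone who wants to run it. EnumerateNamesAsync and ProcessAsync are hypothetical stand-ins for the database stream and for CreateSessionAsync. The ConcurrentDictionary is an assumption on my part: continuations may resume on different thread-pool threads, so a shared plain Dictionary could be written to concurrently.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Hypothetical stand-in for a data layer that hands over rows one at a time.
    private static async IAsyncEnumerable<string> EnumerateNamesAsync()
    {
        foreach (var name in new[] { "alpha", "beta", "gamma" })
        {
            await Task.Delay(10); // pretend the next row is still arriving
            yield return name;
        }
    }

    // Hypothetical stand-in for CreateSessionAsync.
    private static async Task ProcessAsync(string name, ConcurrentDictionary<string, int> results)
    {
        await Task.Delay(20); // pretend to load accounts and proxies
        results[name] = name.Length;
    }

    public static async Task<ConcurrentDictionary<string, int>> RunAsync()
    {
        var results = new ConcurrentDictionary<string, int>();
        var tasks = new List<Task>();

        // Each row starts its processing as soon as it arrives; nothing is awaited per row.
        await foreach (var name in EnumerateNamesAsync())
            tasks.Add(ProcessAsync(name, results));

        // Only here do we wait for all the in-flight work to finish.
        await Task.WhenAll(tasks);
        return results;
    }
}
```

Processing of "alpha" is already under way while "beta" and "gamma" are still being fetched, which is the whole point of streaming the rows.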
Kroks
Kroks15mo ago
Yeah I see, weird then that dapper does not use asyncenumerables. Maybe there is a better alternative to dapper then, but dapper is convenient af
JakenVeina
JakenVeina15mo ago
There's definitely good alternatives to Dapper. Dapper has the advantage of being simple.
Kroks
Kroks15mo ago
bit confused on why dapper does it like this tho
Kroks
Kroks15mo ago
Kroks
Kroks15mo ago
Kroks
Kroks15mo ago
first pic is in QueryAsync method thats true
JakenVeina
JakenVeina15mo ago
Oh dear lord. It is, in fact, returning an IEnumerable<T> that blocks. At least, in some scenario: if you specify that the command should not be buffered. Ugh, that's such an awful thing to allow consumers to do.
Kroks
Kroks15mo ago
Can you explain why they cannot use ReadAsync when using an IEnumerable? Dont get it. I think thats default xd. Actually its buffered by default, otherwise that would be horrible wtf
JakenVeina
JakenVeina15mo ago
Yeah, so ReadAsync() is an async method. IEnumerable.MoveNext() is not. So you can't call async methods in sync code without blocking, at least not if you care about the return value from the method.
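A rough sketch of that constraint. ReadRowAsync is a hypothetical stand-in for an async reader call. A synchronous iterator has to block to get the value out of the Task, while an async iterator can simply await it.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BlockingVsAsync
{
    // Hypothetical stand-in for an async read from a data reader.
    private static async Task<int> ReadRowAsync(int index)
    {
        await Task.Delay(5);
        return index;
    }

    // MoveNext() is synchronous, so the only way to get the row is to block
    // the calling thread until the Task completes.
    public static IEnumerable<int> BlockingRows()
    {
        for (var i = 0; i < 3; i++)
            yield return ReadRowAsync(i).GetAwaiter().GetResult();
    }

    // MoveNextAsync() returns a ValueTask, so the iterator can await instead.
    public static async IAsyncEnumerable<int> AsyncRows()
    {
        for (var i = 0; i < 3; i++)
            yield return await ReadRowAsync(i);
    }

    public static async Task<int> SumAsyncRows()
    {
        var sum = 0;
        await foreach (var row in AsyncRows())
            sum += row;
        return sum;
    }
}
```

Both versions yield the same rows. The blocking one ties up a thread for every read, and under a single-threaded SynchronizationContext (a UI thread, for example) blocking on async code like this can deadlock.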
Kroks
Kroks15mo ago
Guess AsyncEnumerables werent a thing when Dapper was created, seems like it is not rlly maintained anymore. I will search for an alternative maybe
JakenVeina
JakenVeina15mo ago
Yes, they're relatively new. I mean, there's nothing wrong with using it. Even the blocking isn't REALLY all that bad. It's a performance issue, like any other. And it's one of those that can have a significant impact, and it's very easy to avoid, so general wisdom is to just avoid it. But general wisdom for any performance issue is also "build now, optimize later", and only when you have a real, demonstrable performance issue, not just a worry. If you're just making dumb little stuff for yourself, buffering all the records into memory is about as significant of a performance hit as a blocking IEnumerable<>. Which is to say, negligible.
Kroks
Kroks15mo ago
private static async IAsyncEnumerable<T> GetEntriesAsync<T>(string query, DynamicParameters parameters)
{
    await using (Connection)
    {
        await Connection.OpenAsync();
        await using var reader = await Connection.ExecuteReaderAsync(new CommandDefinition(query, parameters));

        var parse = reader.GetRowParser<T>();

        while (await reader.ReadAsync())
        {
            yield return parse(reader);
        }
    }
}
might have some own implementation now, tho not sure if this is optimal
JakenVeina
JakenVeina15mo ago
Nah, that looks right. As far as the async use, anyway. Connection.OpenAsync() could be a problem, if the connection is already open.
Kroks
Kroks15mo ago
I close it after each query, using should dispose it. But since im working async it might be an issue, yeah
JakenVeina
JakenVeina15mo ago
And I mean, me personally, I would name it AsyncEnumerateEntries(), but yeah. Or at least EnumerateEntriesAsync().
Kroks
Kroks15mo ago
not sure if its a bad practice to just let it be opened
JakenVeina
JakenVeina15mo ago
The fact that you close the connection after every single query is probably an issue in itself. It depends how long it stays open.
Kroks
Kroks15mo ago
until the application is closed by the user
JakenVeina
JakenVeina15mo ago
That would be bad. General practice is to create a new connection per "high-level action". If you want something a little more specific, maybe per "time period when the user is not in control". So, like, if the user clicks a button, that might trigger one database query, or that might trigger several before the user gets control back to click the next button. Maybe the button was an "Open" button, which means you might do a query to save some existing records first, then another to retrieve new records. Or maybe it's a "Create" button, which involves inserting records into multiple tables.
Now, on top of that, it's also common to implement connection pooling under the hood, where you create a new Connection() object each time you need one, as described above, and that may or may not reuse an actual connection to the database that was already open. And that's usually configurable, to say things like "never open more than X connections at once" and "close pooled connections after Y minutes of idle time".
Kroks
Kroks15mo ago
I see. Makes sense
var tasks = new List<Task>();

await foreach (var session in DB.GetAllSessionsAsync())
    tasks.Add(CreateSessionAsync(session));

await Task.WhenAll(tasks);
btw this is the preferred way now?
JakenVeina
JakenVeina15mo ago
Pretty much, yeah. You could be clever and do that with Linq extensions, but functionally, that's about as optimal as you can get with this.
Kroks
Kroks15mo ago
I see, good to know. Pretty useful to do things just in time instead of waiting for all rows to be parsed and then reiterate the entire list again. No useless allocation.
Florian Voß
Florian Voß15mo ago
@AnievNekaj I came up with this:
var sessions = await DB.GetAllSessionsAsync();
// Parallel.ForEachAsync awaits every body itself, so no separate task list or Task.WhenAll is needed
// (and List<Task>.Add would not be safe to call from concurrent bodies).
await Parallel.ForEachAsync(sessions, async (session, cancellationToken) =>
{
    await CreateSessionAsync(session);
});
I wonder if we can combine the pros of your code (processing one item from the db at a time) with the pro of my code (parallelizing the creation of sessions), and how that would look? Actually, can't we just do this, and it does the same thing?
var tasks = (await DB.GetAllSessionsAsync())
    .Select(s => CreateSessionAsync(s))
    .ToList();
await Task.WhenAll(tasks);
Kroks
Kroks15mo ago
What is the advantage here? I know you said parallelizing the creation of sessions, but since im not awaiting the Tasks sequentially, isn't it some kind of parallelism already? Ah nvm, I see now. So basically the loop is not sequential in your example. But yeah, then I guess you cannot combine it with AsyncEnumerable properly.
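On combining the two: a minimal sketch, assuming .NET 6 or later, where Parallel.ForEachAsync has an overload that takes an IAsyncEnumerable<T> directly. EnumerateIdsAsync and the squaring body are hypothetical stand-ins for the session stream and for CreateSessionAsync. MaxDegreeOfParallelism = 2 is an arbitrary choice for illustration.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedStreamingDemo
{
    // Hypothetical stand-in for rows streaming in from a database.
    private static async IAsyncEnumerable<int> EnumerateIdsAsync()
    {
        for (var id = 1; id <= 5; id++)
        {
            await Task.Delay(5);
            yield return id;
        }
    }

    public static async Task<ConcurrentDictionary<int, int>> RunAsync()
    {
        // Bodies run concurrently, so the shared collection must be thread-safe.
        var results = new ConcurrentDictionary<int, int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

        // Rows are pulled from the stream as capacity frees up, and at most
        // two bodies are in flight at any moment.
        await Parallel.ForEachAsync(EnumerateIdsAsync(), options, async (id, cancellationToken) =>
        {
            await Task.Delay(10, cancellationToken); // pretend to create a session
            results[id] = id * id;
        });

        return results;
    }
}
```

Compared with collecting Tasks in a list and calling Task.WhenAll, the main practical difference is the cap on how much work is in flight at once. For a handful of sessions either approach should behave much the same.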