✅ async/await and parallelism
the problem that spawned this question is that i get a "collection was modified" exception and i don't know why that is
the code is a client for a device, the
i know i can add a
_message
linked list is being accessed in three ways: from a heartbeat loop, from the i/o eventhandler, and from the public methods of the commands -- after an await (also _messages
is never enumerated with a for/foreach)
so i understand that asynchronous model has following properties:
- multithreading: yes, easily verifiable (thread pool, background threads, and all that)
- concurrency: yes, that's the whole point of async/await
- parallelism: no, probably?
and yet in some way in a method that is not even async i get the exception above
so then my question is: considering i don't (explicitly) create ither threads and i don't use Task.Run, is it possible that the use of await method()
could cause some sort of parallelism? or could code get parallel execution by some other practice? (again, excluding creating other threads)i know i can add a
lock
or use a concurrent collection or use a channel, but i am given this tool, async/await, i have a mental model of it, but evidently something is wrong, and i cannot use it if i don't know how it works55 Replies
Can you share code?
code is not that much, but it's still company's code and i don't have a sample
Well the "collection was modified" exception (afaik) occurs due to concurrent enumeration & modification. Are you sure the list is not being enumerated anywhere?
The issue would be accessing the list concurrently. That doesn't necessarily require parallelism, as you say.
So if your code is asynchronous but sequential, i.e. you don't manually initiate parallelism by e.g. kicking of multiple tasks and then calling
Task.WhenXYZ
, you should be fine.
The exception indicates to me that somewhere you access it outside of the sequential code.yes, exception is thrown while this is being executed:
waited = _messages.Any(_ => _ is Type31Message86 or Type30Message86);
but concurrency cannot "interrupt" .Any
i could understand if it was like foreach(_messages) { await something...
, but i don't have thatAh.
.Any
enumerates the list internallysure, but that would not be an issue by my understanding
See here:
Notice the enumerator being accessed and
MoveNext
being called. That likely throws for the linked list enumerator if you add/remove elements between the enumerator being created and MoveNext
being called
If the exception is thrown while enumerating, then that indicates you are concurrently modifying it.yes, but i the only way this could cause problems is if there was another thread accessing _
messages
.Any doesn't return control until it's finished
other parts of the code have to add and remove items from the collection
the matter i don't understand is how can they be running while .Any is being calledOkay
Are the other parts of the code perhaps being executed asynchronously, and a task is in flight when
Any
is called? In that case, it's possible you now have parallel access to the list (the task was scheduled to another thread)so i was trying to understand: can async/await have some sort of parallelism outside of new Thread()?
Yes
but the fact is: every access to _messages is after an await
it has to be async machine that does this
even if a command is called, then the execution goes into a background thread because of await
Is all access to _messages synchronous, even indirectly?
await
doesn't necessarily schedule the continuation to a "background" thread. Consider the case that the await
ed method returns synchronously, then no scheduling will occur.that's true
That's why I was asking about in flight tasks accessing the list, they would by definition be in a bg thread while
.Any
ran.
Did you try setting a breakpoint in a try catch around the Any call and inspect the Parallel Stacks
window for other threads running while Any
excecutes?i'm thinking
i can tell you _messages is never accessed by main thread, i've debugged that
How did you do that?
debug.print thread id
and also _messages is accessed after sending/receiving, so i doubt await would solve synchronously, although i guess i can't exclude it 100%
sure we can say there always two threads around (plus main)
I'd try to make out threads that're accessing the list by checking if their call stack starts in my code somewhere and then see if it modifies the list somewhere. Then I'd have proof of the parallel access.
In the screenshot I broke right on
Main
so I only have one thread.yes bug does not occur often so im trying to think if i recall seeing that
but i think root was a background thread "wake up" method, i don't remember the name now
Maybe you're missing an
await
somewhere and a task thrown away; that could maybe sporadically cause the issuesure, i have to run heartbeat; i either use
_ = HeartBeatLoopAsync();
or i use a timer, but in a way i have to have a task running that (task, not thread)
in bug situation i can see in parallel stacks that there are two threads pointing at the code
one of those is main thread, but i don't trust it because exception has been thrown
i trust what id is being printed by debug.printdoes
_ = HeartBeatLoopAsync();
modify _messages
? And do you access _messages
outside HeartBeatLoopAsync()
? If so, then this is the problem.HeartbeatLoopAsync
awaits HeartbeatAsync
command:
so 1) send request [tcp/websocket] 2) wait for messages to arrive 3) PopMessage
PopMessage
removes the <T> from _message
Which methods would invoke
_message.Any
?it's another command, like HeartbeatAsync
send the request, wait that _messages contains at least a message, call _messages.Any to find if any interesting messages have been parsed (if not, wait again)
So there is a chance for a race condition here. While the other command invokes
_message.Any
(which may enumerate the _messages
, in your case it seems like it does), PopMessage
will try to mutate _message
. It is important to remember that C# is not single threaded, async continuation may happen on a different thread (default thread pool, unless configured otherwise). Also note that waited = _messages.Any(_ => _ is Type31Message86 or Type30Message86);
may iterate over multiple items before the condition is satisfied. The quickest solution is to use a concurrent data structure provided in the language.
from your description and code snippets this is not sequential async.i have to think about it -- tomorrow, after i have slept
in the meantime i just put a lock() around _messages usage because i had to demo it
Yeah if you
_ = HeartBeatLoopAsync();
and then _messages.Any
, the continuation after await SendSimpleAsync(new HeartbeatRequest(), cancellation);
will likely be scheduled to a bg thread since it involves guaranteed IO. So HeartBeatLoopAsync
will return asynchronously when that happens, you'll reach the .Any
call and bang, depending on whether or not .Any
finishes before youcall PopMessage
, you see the excpetion.i'm trying to prove that main thread and whatever the thread of HeartbeatLoopAsync is have nothing to do with this
problem is the bug seem to manifest only the "first time" i start this
Can you post the parallel stacks window?
eh, adding debug stuff now bug doesn't manifest
if/when i get one i'll post it
trying to prove that main thread and whatever the thread of HeartbeatLoopAsync is have nothing to do with thisit seems pretty clear cut to me: 1. guaranteed async return from your socket interaction 2. ignoring the task that represents that 3. continue to
.Any
on thread that called heartbeat function
4. meanwhile the continuation after socket interaction is scheduled to another thread
5. other thread manipulates message list
6. race condition with .Any
since you're working in two threads now
7. exceptioni would argue against 2
the not awaited method starts a loop which sends an heartbeat command which is awaited
but since it's a valid doubt, if i show that the loop thread is not used after the await where the command is sent then it's not its fault
also because it's a task, not a thread, so it still has to respect the thread pool mechanisms
tasks in exception condition (release build)
threads in exception condition (release)
or to be more clear HeartbeatLoopAsync returns at the end of the program
_messages is enumerated here
and manipulated here
i would argue against 2 the not awaited method starts a loop which sends an heartbeat command which is awaitedThe loop method returns asynchronously as soon as you hit the socket, and you throw away the task, so the rest of the loop including any further iterations are executed on a bg thread
in the doubt i don't trust this is main thread because before that i got the thread and it was not main (although it will return to main), maybe it's the exception state that alter this or maybe it's showing the effective thread that started the command
yes, i meant the task returns at the end of the program
i may have kinda understood what's happening, and it's this practically, i believe
although it strikes me as weird
It doesn't matter what thread it is, the fact is that due to the guaranteed async IO in your socket communication, and you throwing away the task that represents the continuation of that, you end up with parallel access to the list.
It's not weird at all, this is the expected behaviour. If you want to sequentialize async operations, you have to
await
them
aka don't throw away the task
or use a synchronization mechanism like lock
or a concurrent collection
if read/write accesses are short, a spinlock could do.
Else, I'd probably look to SemaphoreSlimyes but just because heartbeat loop ends in a background thread i didn't expect this to be executed independently (of the main threadpool)
i don't think this should need a sync primitive, although i may understand why framework does this
The thread that calls the heartbeat loop keeps running synchronously after the loop method returns asynchronously, so it's not available for scheduling continuations when the socket IO is done. So the continuation will be scheduled on another thread from the pool. But all this is also kinda race-y.
i think i can just adopt the functional programming way and copy _messages for heartbeat loop with events so that it has his owned little space to work with, but i would really prefer that this task was on the main thread pool
hmm i can try to await it, tbh (although how would framework know that thread will be awaited, it's not a quantum async machine or a schrodinger's task, so something's not right still, and i don't think gc knowing of a variable with a reference to it would make a difference) (ps: in fact it doesn't)
use 'Lock' to prevent collection modification ..
I don't know why you still think it's a mystery, you have an in-flight task that you ignore that is manipulating the list while the main thread is reading from it at the same time. You have to synchronize both operations, either by sequentializing via
await
or any of the other synchronization mechanisms. Your continuation cannot be scheduled to the main thread because it is busy executing everything after calling the heartbeat loop function, i.e. enumerating the list. Therefore, the async continuation is scheduled on a bg thread.
You could also look at a concurrent producer-consumer setup, with the heartbeat producing messages and some consumer popping them from the collection.you cannot await a loop that doesn't return
i did that temporarily
I was reading again your question.. you should use concorrent ques. if there is multi threaded env. and async/await moreover
I mean list or collection can be of COncurrent type
sure you can
it just won't terminate.
it's not just that it won't terminate, it won't get past the start phase of the device, so the rest of the program will never be exeuted
i want to try to first find a solution that doesn't require synchronization primitives (or collections with synch. primitives)
because i would say this is quite a common pattern and i find it really strange that there is not an already estabilished way to have a timer or an alternative that doesn't cause race conditions
if sadly that fails, which it may be, then there still should be a way to do it so that heartbeat loop and commands remain isolated
I mean
you want to read and write from the list concurrently (possibly in parallel), there is (afaik) no way to do that properly without using a synchronization mechanism of some sort, be it a concurrent collection, p/c pattern, sync mechanism, atomic exchange etc.
Seems to me you're going on a painful path with questionable benefits? What's the issue with using a concurrent collection here? Or better yet, a prod/con implementation like channels?
per my current understading it's really unsatisfying and disappointing that i would have to resort this, because i feel like it should be a much simpler mechanism, and as said before it's a problem so common (having a keep alive while other stuff is running) that i would have expected to find a well established pattern (also tbh concurrency is the greatest of pains, and possibly slow when there's to synchronize stuff, although in this specific case those are not actual issues)
now, i am going through the pain of mental cogs grinding, but i want to do things my way; if in this process of researching i learn that i'm wrong because i was missing something (absolutely possible), then i'll adapt and change
or at least i want to get to the bottom of my reasoning to understand if all the pieces fall together because evidently there's at least one piece on which i have not thought enough
i think im clearing my head