❔ Multithreaded application blocked a lot (no locking) [Answered]
I have a program that loads data from disk into a byte array, so the IO operation happens just once. Then I deserialize this data using all of my CPU cores/threads (12). This deserialization involves a lot of heap allocation. All of my worker threads operate in isolation (almost zero locking), but all of them read from the same single byte array I mentioned to deserialize it.
I was expecting my program to use the full power of my CPU, but it didn't. So I profiled it with Concurrency Visualizer and discovered large patches of blocking in all of my worker threads. I also discovered that while all of my worker threads are blocked, there is always a single random worker thread executing alone. I don't know why, since, as I previously said, my program operates in isolation with no locking.
9 Replies
Since I still got no answer after a few hours, I posted this question on Stack Overflow if you want to check it out:
https://stackoverflow.com/questions/74460873/multithreaded-deserializing-bottlenecked-by-gc
Have you run a memory profiler to see where all the allocations and collections are coming from?
Run the program, take a snapshot, let it run longer, then take another snapshot. It'll show you allocation diffs and such.
What I'm a bit confused about is why you are multithreading deserialization. It will be I/O bound, so if you just stream the deserialization it'll go just as fast single-threaded as it would MT.
Actually, it should be faster than preloading all the data and then MT deserializing it, because while you preload you're wasting CPU cycles doing nothing between I/O waits.
Also, did you try server GC?
You said:
"The memory subsystem is working really, really hard to allocate memory for all the NbtXxx wrapper objects, and then to garbage-collect them."
But wait, my NbtXXX objects are needed; they are referenced and stored somewhere (e.g. in a list), and heap allocation during deserialization is optimized (e.g. I prefer Span over new byte[] whenever possible). So yes, there is some unknown garbage allocation somewhere, but never mind. The problem was solved by following one of the answers to this question, which is to change the GC mode to server: https://stackoverflow.com/questions/10292807/multi-thread-cpu-usage-in-c-sharp?noredirect=1&lq=1 Of course it doesn't come free of cost: now my program uses too much memory, but it is blazingly fast, using 100% of my machine's CPU power. Now I need to worry about running short on memory and make clever use of available resources (it doesn't matter on my machine, but it does on one with 4 GB of RAM, for example)... or I could just move all my critical code (like this) to C++, do manual memory management myself, do P/Invoke interop, etc.
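For anyone finding this later, here's a minimal sketch of how to opt into server GC in a modern .NET (SDK-style) project; the project file name is hypothetical, but `ServerGarbageCollection` and `ConcurrentGarbageCollection` are the documented MSBuild properties:

```xml
<!-- MyApp.csproj (hypothetical project) -->
<PropertyGroup>
  <!-- One GC heap + dedicated GC thread per core instead of workstation GC -->
  <ServerGarbageCollection>true</ServerGarbageCollection>
  <!-- Background (concurrent) collection; true is the default -->
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
</PropertyGroup>
```

The same switch can be set via `"System.GC.Server": true` under `configProperties` in runtimeconfig.json, or `<gcServer enabled="true"/>` in App.config on .NET Framework, and you can verify it took effect at runtime with `System.Runtime.GCSettings.IsServerGC`.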
I think you didn't read the question description thoroughly: the IO operation happens just once, so no, IO is not the bottleneck. But whatever, I will try what you suggested, and thank you for taking your time here: use a memory profiler, take a snapshot, let it run longer, repeat.
No, I read it. Still not sure why you are reading it all first and then multithreading the deserialization. What I'm telling you is that single-threaded streaming deserialization will be faster.
✅ This post has been marked as answered!
Someone just had a similar problem and said that using server GC made a big difference, so you should try that if you haven't.
Was this issue resolved? If so, run
/close
- otherwise I will mark this as stale and this post will be archived until there is new activity.