C# • 11mo ago
ethanrox

IO threads - why need them?

Does anybody have a blog post/article that explains in depth how IO threads work (a .NET Framework resource is fine)? Can they mix with worker threads in the execution of a single task that has both IO calls and compute-bound work?
79 Replies
jcotton42 • 11mo ago
$nothread perhaps? @tinmanjk
MODiX • 11mo ago
There Is No Thread
This is an essential truth of async in its purest form: There is no thread.
abyssptr • 11mo ago
depends on the platform 😄 I believe on Linux, for async IO that works on top of epoll, there's a separate thread (or threads) that does the polling. otherwise there should be no .NET thread, yes
ethanrox OP • 11mo ago
I've read it... not really satisfied with it. Maybe somebody knows how the internal implementation of the ThreadPool class actually works? It's a bit above my paygrade to get into the internal workings - I tried a bit of dnSpy to follow the flow of method calls, but that's still not enough to understand what happens.
abyssptr • 11mo ago
Another way to do it is to clone the dotnet/runtime repo and browse ThreadPool.cs and adjacent files. If you have Rider, you can also step through the implementation in quite a detailed way when you do F12 on things like File.ReadAllBytesAsync() and whatnot.
There you will see that on Unix the asynchronous read/write will submit an operation that will then be handled by IO thread(s)(?) working with epoll. On Windows it will instead be dispatched with overlapped IO, which is the native async IO API on Windows (but Unix is way faster with NVMe haha).
In the case of synchronous operations, most of the time the p/invoke calls to kernel space are made directly, the same as you would do from C. In that case it does not really matter whether this is executed by a threadpool worker thread or not: for all intents and purposes the threadpool just executes some code, whether it does IO or not.
ethanrox OP • 11mo ago
I think that touches on my initial lack of understanding very well. I tried simulating some IO from within a QueueUserWorkItem and it was executed on the worker thread.
    ThreadPool.QueueUserWorkItem(_ =>
    {
        //...
        string content = System.IO.File.ReadAllText(path);
        // .. compute bound
    });
I was wondering how the ThreadPool actually decides to employ IO threads at all given a callback such as this. A comment on my Stack Overflow question about this (https://stackoverflow.com/questions/78066998/how-does-the-threadpool-decide-which-type-of-thread-to-use-for-a-work-item?noredirect=1#comment137626444_78066998) by Hans Passant said that it is always a worker thread with QueueUserWorkItem. So I don't know what criteria are used to classify work items as IO items, since in my case the IO work is done on the worker thread. Who employs the IO thread in the first place, and to do what work, is my actual question I guess.
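A minimal sketch of how the observation above could be checked, assuming a readable file at `path`. The class and method names (`SyncIoOnWorkerDemo`, `ReadOnPoolThread`) are made up for illustration; only `ThreadPool.QueueUserWorkItem`, `File.ReadAllText` and `Thread.IsThreadPoolThread` are real APIs.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class SyncIoOnWorkerDemo
{
    // Queues a work item that performs a blocking file read and reports
    // whether the thread that executed it is a thread pool thread.
    public static Task<bool> ReadOnPoolThread(string path)
    {
        var tcs = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        ThreadPool.QueueUserWorkItem(state =>
        {
            try
            {
                // Blocking call: the thread running this callback waits for the read.
                File.ReadAllText(path);
                tcs.SetResult(Thread.CurrentThread.IsThreadPoolThread);
            }
            catch (Exception ex)
            {
                // Surface IO errors to the caller instead of crashing the process.
                tcs.SetException(ex);
            }
        });

        return tcs.Task;
    }
}
```

Awaiting `SyncIoOnWorkerDemo.ReadOnPoolThread(path)` should be consistent with the comment quoted above: the pool simply runs the callback, blocking read included, on one of its threads.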
abyssptr • 11mo ago
> Who employs the IO thread in the first place to do what work is my actual question I guess.
The (platform-specific) implementation of IO. By IO here I mean a very specific thing: File API, socket and all kinds of networking API, possibly even IO done entirely inside a user-implemented third-party library.
ethanrox OP • 11mo ago
yes but they do that by using the ThreadPool either directly or via TPL, so the ThreadPool has to make the final decision or am I mistaken?
abyssptr • 11mo ago
There's no magic "this thread does IO" vs "this thread does not do IO" thing. It's an umbrella term for what a particular implementation ends up doing. The thread pool does not know or decide that. The threadpool is just a, well, pool of worker threads that execute work items. TPL works on top of that with the default scheduler and is somewhat tangential to this.

Let's consider overlapped IO, which is the API used on Windows. When you issue an asynchronous read of a particular file, it creates an overlapped IO request for a read that will be done asynchronously. What this means is that Windows sees the exact parameters of your read request, including that it is asynchronous, so it queues the read and then immediately returns to you. Then the method doing this IO just returns (yields back) to the threadpool, since it can no longer make progress. Once Windows finishes the read, it calls the callback .NET provided within the overlapped IO request, which submits a new work item to the threadpool to continue execution of your code that was awaiting the read. Whether Windows has its own pool of IO threads or some other way of conducting the read is an implementation detail.

In comparison, on Linux you will have separate thread(s) (disclaimer: I read the code briefly so it is best you double-check) that exist outside of the threadpool worker threads and perform the polling for the read operation. So when you do an asynchronous read on Linux, your code, executed by, let's say, a .NET threadpool worker thread, will submit the read to the separate IO thread(s) the .NET runtime keeps around to work with epoll, and then immediately return to the threadpool since the execution of your task cannot proceed. Then, once the IO thread(s) have finished reading, somewhat similar to how overlapped IO worked on Windows, a new work item is queued on the threadpool that holds a reference to your task's state machine, to continue the execution of your task.
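The "yield back, then resume via a new work item" flow described here can be observed from user code. A rough sketch, assuming a small existing file; `AsyncReadDemo` and its tuple field names are hypothetical, and whether the two thread ids differ depends on the platform and on whether the read happens to complete synchronously.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class AsyncReadDemo
{
    // Reads from a file asynchronously and returns the managed thread ids seen
    // before and after the await, plus the number of bytes read.
    public static async Task<(int Before, int After, int BytesRead)> ReadAsync(string path)
    {
        // FileOptions.Asynchronous asks for async IO (overlapped IO on Windows).
        await using var fs = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            4096, FileOptions.Asynchronous);

        var buffer = new byte[4096];
        int before = Environment.CurrentManagedThreadId;

        // If the read does not complete synchronously, the method returns to its
        // caller here and the current thread is free to pick up other work items.
        int bytesRead = await fs.ReadAsync(buffer.AsMemory()).ConfigureAwait(false);

        // The continuation runs wherever it was scheduled, typically a pool thread.
        int after = Environment.CurrentManagedThreadId;
        return (before, after, bytesRead);
    }
}
```

Nothing in this code names an "IO thread": the code only sees the await and the thread its continuation resumes on, which matches the point that the distinction lives inside the IO implementation.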
ethanrox OP • 11mo ago
thanks a lot for the explanation. I was expecting something like this to be happening: worker threads offload work to IO threads to wait/poll, and then, when triggered by the OS, the work is handed back to worker threads
abyssptr • 11mo ago
this is just a (very rough) example of how the File API works - networking may or may not use a completely different implementation. after all, it is all object references, threads and method calls all the way down
ethanrox OP • 11mo ago
I am not quite sure I understand really - where is this File API implementation? Or internal thread-IO work with files - the CLR source code?
abyssptr • 11mo ago
1 2
ethanrox OP • 11mo ago
yeah, can do this with File.ReadAllBytesAsync(). I prefer dnSpy for some reason, but might need to learn to do this properly with the runtime repo
abyssptr • 11mo ago
I never used dnSpy but would expect it to be a bit difficult to read, because the RandomAccess IO implementation (the public RandomAccess class that takes file handles and byte buffers) uses callbacks and all kinds of object wrappers for IO operations. it is kinda hard to read, but I cannot stress this enough - there is no magic "IO" flag or one-API-to-rule-them-all that the threadpool uses - it is literally just executing opaque code 99% of the time
ethanrox OP • 11mo ago
yeah it is, but you can set breakpoints if you know which methods are used, and there is a multi-thread debugger, so... it can be done
abyssptr • 11mo ago
so when you have to do some quote-unquote IO work - it comes down to the implementation of that specific """IO""" thing doing that
ethanrox OP • 11mo ago
I know... but if we dig into the File class's InternalReadAllBytesAsync implementation https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/IO/File.cs,f266db58b0db4c05,references it is indeed int n = await RandomAccess.ReadAtOffsetAsync, but I doubt that ReadAtOffsetAsync uses the Thread API directly
abyssptr • 11mo ago
like submitting a request to some other separate IO threads that may or may not be implemented in C#, or giving the kernel or some unmanaged library a callback that will queue the execution of your task's continuation on the threadpool when the operation is done. the kernel or library may do anything internally - that's just the start of the operation.
you have to use Rider (which has a nice decompiler) or clone dotnet/runtime and browse through the implementation to get an idea of how it works (the latter option is better). it does very roughly what was said above (I may be wrong, because I literally just browsed through the code to understand more or less how it works)
ethanrox OP • 11mo ago
I am reading the source now at https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/Microsoft/Win32/SafeHandles/SafeFileHandle.ThreadPoolValueTaskSource.cs,74b2b442e61a54df,references and it comes down to
    private void QueueToThreadPool()
    {
        _context = ExecutionContext.Capture();
        ThreadPool.UnsafeQueueUserWorkItem(this, preferLocal: true);
    }

That's the IO from ReadAllBytes going to SafeFileHandle's methods QueueRead and QueueToThreadPool. So the further "IO vs worker" thread distinction needs to happen inside the ThreadPool's private code somehow: File API -> SafeHandle API -> ThreadPool API. Or am I mistaken?
abyssptr • 11mo ago
clone dotnet/runtime. the implementation is different between Unix and Windows
ethanrox OP • 11mo ago
okay, no way around that I guess :))
abyssptr • 11mo ago
you can browse on github too, personally, source.dot.net annoys the shit out of me because you don't navigate files and can't see platform-specific code in a sane way
ethanrox OP • 11mo ago
but you can still navigate...not sure I get the navigation right on github
abyssptr • 11mo ago
GitHub
runtime/src/libraries/System.Private.CoreLib/src/System/IO at main ...
.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps. - dotnet/runtime
abyssptr • 11mo ago
I'm surprised this implementation of IThreadPoolWorkItem looks like it's performing a synchronous read on Unix
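To make the "work item that performs a synchronous read" idea concrete, here is a simplified sketch. It is not the runtime's ThreadPoolValueTaskSource; `BlockingReadWorkItem` is a hypothetical class that only mirrors the shape of it, and it assumes .NET 6 or newer for `RandomAccess.Read` and `File.OpenHandle`.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Win32.SafeHandles;

// Hypothetical illustration: an "async" read that is really a blocking read
// executed by a thread pool worker thread.
public sealed class BlockingReadWorkItem : IThreadPoolWorkItem
{
    private readonly SafeFileHandle _handle;
    private readonly byte[] _buffer;
    private readonly long _offset;
    private readonly TaskCompletionSource<int> _tcs =
        new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

    private BlockingReadWorkItem(SafeFileHandle handle, byte[] buffer, long offset)
    {
        _handle = handle;
        _buffer = buffer;
        _offset = offset;
    }

    // Queues the read to the thread pool and returns a task for its result.
    public static Task<int> Queue(SafeFileHandle handle, byte[] buffer, long offset)
    {
        var item = new BlockingReadWorkItem(handle, buffer, offset);
        ThreadPool.UnsafeQueueUserWorkItem(item, preferLocal: false);
        return item._tcs.Task;
    }

    // Called by a worker thread; the thread blocks until the read finishes.
    void IThreadPoolWorkItem.Execute()
    {
        try
        {
            _tcs.SetResult(RandomAccess.Read(_handle, _buffer, _offset));
        }
        catch (Exception ex)
        {
            _tcs.SetException(ex);
        }
    }
}
```

A caller would open a handle with `File.OpenHandle(path)` and await `BlockingReadWorkItem.Queue(handle, buffer, 0)`. From the caller's side this looks asynchronous, but a worker thread is occupied for the whole duration of the read.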
ethanrox OP • 11mo ago
actually, you can navigate...thanks for the link.
abyssptr • 11mo ago
(technically speaking, depending on the number of concurrent long-waiting read operations, this should not be an issue - the hill-climbing algorithm will take care of that by scaling up the thread count)
(which means it does not use epoll for the file API, my mistake. I was remembering it being mentioned in one of the devblogs, maybe it was just an ASP.NET Core one haha)
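One way to watch the pool react to blocked workers is to sample `ThreadPool.ThreadCount` (available since .NET Core 3.0) while work items are blocked. This is only a sketch: `BlockedWorkersDemo` is a hypothetical name, `Thread.Sleep` stands in for a blocking read, and how quickly threads are added, if at all, depends on the runtime version, the core count and the pool's configured minimum.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class BlockedWorkersDemo
{
    // Starts `items` blocking work items and returns the pool's thread count
    // sampled before queuing and again while the items are still blocked.
    public static (int Before, int During) Run(int items, int blockMilliseconds)
    {
        int before = ThreadPool.ThreadCount;

        var tasks = new Task[items];
        for (int i = 0; i < items; i++)
        {
            // Thread.Sleep stands in for a blocking IO call on a worker thread.
            tasks[i] = Task.Run(() => Thread.Sleep(blockMilliseconds));
        }

        // Sample roughly halfway through the blocking period.
        Thread.Sleep(blockMilliseconds / 2);
        int during = ThreadPool.ThreadCount;

        Task.WaitAll(tasks);
        return (before, during);
    }
}
```

Calling `BlockedWorkersDemo.Run(64, 2000)` and comparing the two numbers gives a rough feel for how the pool compensates for blocked threads; the exact values are not predictable.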
ethanrox OP • 11mo ago
yeah, I remember the same blog maybe :)) I guess I'd need to read this: https://github.com/dotnet/runtime/blob/6b8d34b9954fabf311594a0ac511a917947f1c92/src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs#L272
    private static unsafe (SafeFileHandle.OverlappedValueTaskSource? vts, int errorCode) QueueAsyncReadFile(SafeFileHandle handle, Memory<byte> buffer, long fileOffset,
        CancellationToken cancellationToken, OSFileStreamStrategy? strategy)
abyssptr • 11mo ago
yes, this should be doing:
"BeginInvoke" (anyone remember? :)): create an overlapped IO request + assign a completion callback -> call the kernel API -> the kernel API returns immediately after issuing the read -> yield back to the threadpool
"EndInvoke": the kernel calls the completion callback, which submits a new work item to the threadpool, which resumes the execution of your task
the old APM meme hopefully helps with the mental model lol
in which case there will be no blocked thread, or any thread waiting for IO or doing IO work - it is handled inside the kernel in an opaque way. the heap simply holds your task's state machine object, which waits for something to call a callback that will continue its execution (I assume the overlapped IO interop creates a pinned handle for the callback, which is then unpinned when it returns)
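The Begin/End mental model can be written out with the old APM methods that `FileStream` still exposes. A small sketch, assuming the caller keeps the stream open until the returned task completes; `ApmReadDemo` is a hypothetical name, and this illustrates the callback shape only, not what the runtime does internally.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class ApmReadDemo
{
    // Starts a read with BeginRead and completes the returned task from the
    // completion callback, where EndRead collects the result.
    public static Task<int> ReadWithCallback(FileStream fs, byte[] buffer)
    {
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // "Begin": issue the read and hand over a completion callback, then return.
        fs.BeginRead(buffer, 0, buffer.Length, asyncResult =>
        {
            try
            {
                // "End": runs once the read has finished.
                tcs.SetResult(fs.EndRead(asyncResult));
            }
            catch (Exception ex)
            {
                tcs.SetException(ex);
            }
        }, null);

        return tcs.Task;
    }
}
```

Between the Begin call returning and the callback firing, no user code is running for this operation; only the task object on the heap is waiting to be completed.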
ethanrox OP • 11mo ago
I realized I am 5% as knowledgeable as u 😄 thanks a lot, but yeah... this should be what's going on. so in this case we circumvent IO threads altogether?
abyssptr • 11mo ago
not really, the difference is probably only in this specific area and can be measured in around 3 hours :kekw:
most likely the kernel has its own IO threadpool
ethanrox OP • 11mo ago
so back to square 1 for me at least hah 🙂
abyssptr • 11mo ago
or whatever, after all there is also a concern of a driver implementation and whatnot
ethanrox OP • 11mo ago
I found some DequeueIO methods in another namespace...wonder how they are called
abyssptr • 11mo ago
like, nothing stops you from having a magic userspace implementation that, let's say, copies data from NVMe to a GPU using some low-level controller API that bypasses the CPU completely and then just raises a hardware interrupt, which calls an interrupt handler on some core, which then may even directly schedule a continuation on the threadpool (but interrupt handlers are very limited in what they can do, so more likely it wakes up some special handler thread first that does the enqueueing)
(this is also only the case for Windows. Unix/Linux has its own set of APIs for reads (which work on file descriptors, regardless of whether it's a file or a socket or anything), then epoll and now also io_uring, all of which can be made to work with async with varying degrees of efficiency)
ethanrox OP • 11mo ago
yeah looking for the implementation of the OverlappedValueTaskSource, somehow can't find it in source to see the difference ...but might need to wait a bit for more brain power
abyssptr • 11mo ago
git clone https://github.com/dotnet/runtime -o dotnet-runtime 🤷‍♂️
abyssptr • 11mo ago
and then you just open in Visual Studio
ethanrox OP • 11mo ago
I know, I know....just don't want to :)) but I might have to
abyssptr • 11mo ago
why? it's easier. I keep a copy around; it's also useful for examining codegen for arm64, because disasmo needs a built checked variant of the runtime for that. it's on the better end of the spectrum of "how easy is it to build the runtime for a particular language"
ethanrox OP • 11mo ago
I've got an older laptop and Visual Studio is crazy
abyssptr • 11mo ago
docs may take a bit to find but it's okay otherwise
ethanrox OP • 11mo ago
but I think I'll do the setup on a remote machine where I don't care. just strange that nobody has really done the digging and explained it in a blog post. I am not that familiar with .NET threading tbh, just picked up "Pro Asynchronous Programming with .NET" and am trying to follow closely
abyssptr • 11mo ago
there probably is some information on devblogs
ethanrox OP • 11mo ago
ChatGPT was no help either...
abyssptr • 11mo ago
blog posts exist for that, it's just that they were written some time ago
ethanrox OP • 11mo ago
maybe I'll just leave the topic as a black box for the moment
abyssptr • 11mo ago
since the implementation isn't super new (there isn't that much to change unless a rewrite)
ethanrox OP • 11mo ago
can u point me to some of those? I find a lot of useful information in older 2001 books for .NET 1.0 - not too much has changed wrt the core stuff
abyssptr • 11mo ago
Adam Sitnik
.NET Blog
File IO improvements in .NET 6 - .NET Blog
Learn about high-performance file IO features in NET 6, like concurrent reads and writes, scatter/gather IO and many more.
abyssptr • 11mo ago
yes it is a better idea to look at code or make sure the material references at least .NET Core 3.1 or newer
ethanrox OP • 11mo ago
obv, I always experiment
abyssptr • 11mo ago
but even then - .NET 6+ switched to the PortableThreadPool implementation written in pure C#, only to then reintroduce WindowsThreadPool, which is opt-in except for NativeAOT targets when you build for Windows
ethanrox OP • 11mo ago
so that's why the class is named like this...before it resided more in the CLR?
abyssptr • 11mo ago
(but you probably shouldn't care too much - it's an internal abstraction that for all intents and purposes acts the same)
it may be useful to not bother with old terminology. the runtime is the clr
ethanrox OP • 11mo ago
like the CLR :))
abyssptr • 11mo ago
well, coreclr
ethanrox OP • 11mo ago
yeah, runtime it is
abyssptr • 11mo ago
today people may refer to it as just runtime, or as clr/coreclr to emphasize the difference from the mono flavours (be it the one that also lives in dotnet/runtime or the custom fork used by unity)
ethanrox OP • 11mo ago
I really hope somebody updates "CLR via C#" for modern .NET ... so BOTR is not that, but is advertised as that
abyssptr • 11mo ago
plus clr vs mono - these control the compiler being used and the GC being used
ethanrox OP • 11mo ago
got it
abyssptr • 11mo ago
because both would use the same PortableThreadPool implementation on a given platform, so you probably can't say that PortableThreadPool is a definitive part of the "CLR".
either way, you should let the old idea of the CLR as it was envisioned during the .NET Framework days die, because the only thing that matters is what we have today and the way it works
ethanrox OP • 11mo ago
what I meant was more that it used to be part of the runtime in native code, but now it's managed. seems like I'm trying to make sense of the "Portable" in the name - maybe I am wrong
abyssptr • 11mo ago
mostly that, and also making the implementation unified across windows and unix.
there is no clear split of "managed bits are C# and strictly non-runtime" vs "unmanaged bits are strictly runtime". some facilities that enable .NET to work as a VM are written in C++, like the GC and JIT. some are split, like the type system facilities, where parts are in C++ and parts are in C#, or odd things like threadlocals, which are tied in to platform-specific parts written in C++ but also have C# implementation details.
old books may provide a misleading mental model
ethanrox OP • 11mo ago
I know, but when I am looking at BCL code there are a lot of calls into unmanaged code ...
abyssptr • 11mo ago
most calls in CoreLib for things like reading files are just calling kernel api
ethanrox OP • 11mo ago
is all unmanaged code in one file or is it split? must be split
abyssptr • 11mo ago
depends on organization in a particular folder/project
ethanrox OP • 11mo ago
I mean the "runtime"
abyssptr • 11mo ago
personally I have a disdain for a superstition-like approach to things, it's not productive. there is no magic - it is all implementation details
ethanrox OP • 11mo ago
i.e. C:\Program Files\dotnet\shared\Microsoft.NETCore.App\6.0.13
abyssptr • 11mo ago
the bundling is completely orthogonal to the organization in the runtime. well, it is usually organized by assemblies, but you are trying to tackle too many unrelated topics at once. anyway, I explained what I knew here; if you want to know more - browse the code 🤷‍♂️
ethanrox OP • 11mo ago
yeah it's offtopic 🙂 just wondered. thanks for the help. been doing that a lot, need to continue
ethanrox OP • 11mo ago
thanks again to @abyssptr for the great pointers. I think anybody interested in the original question needs to read this file from the source code (Windows case), and the IOCompletionPoller class in particular, which is what the IO thread really is:
"// Poller threads are typically expected to be few in number and have to compete for time slices with all // other threads that are scheduled to run. They do only a small amount of work and don't run any user code."
(this behavior can be changed, and they can run continuation code if UnsafeInlineIOCompletionCallbacks is set to true via an environment variable... but that is not typically the case)
https://github.com/dotnet/runtime/blob/0a5418d7e7ae7f7653a32004c808881af5275469/src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.IO.Windows.cs#L141
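The public ThreadPool API still reports the two categories separately, which may be where the "worker vs IO thread" vocabulary comes from. A small sketch; `PoolLimitsDemo` is a hypothetical name, and how the "completion port" numbers map onto actual poller threads is a runtime implementation detail that this code does not reveal.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of any library.
public static class PoolLimitsDemo
{
    // Returns the configured minimum and maximum thread counts for both
    // categories that the ThreadPool API exposes.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minCompletionPort);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxCompletionPort);

        return $"worker: min={minWorker}, max={maxWorker}; " +
               $"completion port: min={minCompletionPort}, max={maxCompletionPort}";
    }
}
```

Printing `PoolLimitsDemo.Describe()` shows only the limits; which thread ends up running a given continuation is still decided by the IO implementation and the pool, as discussed in the thread.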
