io_uring

Hi all, I just published my implementation of the io_uring library in pure Mojo: https://github.com/dmitry-salin/io_uring The design is similar to https://github.com/bytecodealliance/rustix Currently there is only a linux_raw backend, which could potentially be used to create binaries that do not depend on libc. The library is at an early stage: it lacks documentation, tests, and many operations, but the basic functionality is implemented.
toasty (2mo ago)
@Darkmatter I've seen you reference io_uring a few times, this might be of interest for you!
Darkmatter (2mo ago)
I’ve seen it but was waiting to collect my thoughts. Jens Axboe (io_uring maintainer) recommends going through liburing because of how tricky the memory ordering is. I’m not sure this is fully sound on ARM due to the weaker memory model there.
Darkmatter (2mo ago)
I was waiting for C interop to use that method to make my own set of bindings that are a bit higher level.
Dmitry Salin (2mo ago)
Logically, the implementation mostly matches liburing; to match it at the instruction level we need this: https://github.com/modularml/mojo/issues/3162 I have a similar library implemented in Rust, and it doesn't cause any problems for my workloads. Most Rust and Zig projects I've seen use native implementations rather than bindings. I profiled with valgrind and the native implementation was better.
GitHub
[Feature Request] Add basic atomic instructions: atomic_load, `at...
Review Mojo's priorities I have read the roadmap and priorities and I believe this request falls within the priorities. What is your request? There is already __mlir_op.pop.load[alignment](addr...
Darkmatter (2mo ago)
The lack of fences was what I was commenting on. ARM’s new “big machine” extensions do fun things on multi-socket systems if you don’t fence properly and have a queue that goes across NUMA domains. You might be able to guess why they’re on my mind 😭 I ran into this issue with Rustix on an ARM server last week. One hell of a debugging session. It’s probably fine on x86, ARM v8 and single-socket ARM v9. Which covers most people.
Dmitry Salin (2mo ago)
If I understand correctly, Rust relies on LLVM for all atomic instructions, so any problem there should affect more than just io_uring. An explicit fence is required for SQPOLL mode, which is a pretty specialized thing. Jens always said that you have to think before you use it.
Darkmatter (2mo ago)
SQPOLL has its share of footguns, but allowing the user to avoid syscalls entirely is very powerful. I tend to use SQPOLL in most of what I write with io_uring because it makes benchmark numbers look good. It’s a bit harder in less controlled environments, but for managed services it’s also very useful.
Dmitry Salin (2mo ago)
My Rust library has a fence in the same place as liburing, but I haven't tested it on ARM.
Darkmatter (2mo ago)
I’ll double check, phone code review may have failed me.
Dmitry Salin (2mo ago)
GitHub
Fix memory ordering in sq_ring_needs_enter · axboe/liburing@744f415
A full memory barrier is required between the store to the SQ tail in __io_uring_flush_sq and the load of the flags in sq_ring_needs_enter to prevent a situation where the kernel thread goes to sle...
Darkmatter (2mo ago)
Yes, that fence. Until you can put it there in Mojo, I'd comment out the SQPOLL flag and leave a comment that links to the PR and explains that SQPOLL is unsound until fences can be added. On x86 it might happen to work because of how the ring is designed. But you're at the mercy of the cache coherence algorithm deciding it has nothing better to do than sync that cache line.
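For readers following along, here is a rough sketch of the ordering being discussed, written in Rust rather than Mojo. It only illustrates the pattern from the liburing commit (store the SQ tail, full barrier, then load the flags). `SqRing` and `publish_and_check_wakeup` are made-up names, and the two atomics are plain in-process stand-ins for the fields that a real ring shares with the kernel through mmap.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

// Value of IORING_SQ_NEED_WAKEUP in the Linux io_uring UAPI header.
const IORING_SQ_NEED_WAKEUP: u32 = 1 << 0;

/// Hypothetical stand-in for the two shared SQ ring fields involved.
/// In a real ring both fields live in memory mapped from the kernel.
pub struct SqRing {
    pub tail: AtomicU32,
    pub flags: AtomicU32,
}

impl SqRing {
    /// Publishes a new tail, then reports whether the SQPOLL thread
    /// would need to be woken up with io_uring_enter.
    pub fn publish_and_check_wakeup(&self, new_tail: u32) -> bool {
        // Release store: earlier SQE writes become visible before the tail.
        self.tail.store(new_tail, Ordering::Release);

        // Full barrier between the tail store and the flags load.
        // Without it the load may be satisfied before the store is
        // visible, so the poller can go to sleep unnoticed.
        fence(Ordering::SeqCst);

        let flags = self.flags.load(Ordering::Relaxed);
        flags & IORING_SQ_NEED_WAKEUP != 0
    }
}
```

A release store alone does not order a later load after it, which is why the sketch uses a sequentially consistent fence instead of relying on the store's ordering.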
Dmitry Salin (2mo ago)
Yes, that makes sense. I think I will add a `constrained` check that prevents compilation in SQPOLL mode.
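As an illustration of that idea (rejecting SQPOLL at compile time), here is a small Rust analogue. It is not the Mojo library's code: `Ring`, the `SQPOLL` const parameter, and `SQPOLL_UNSUPPORTED` are hypothetical names, and the check is expressed with a const assertion because that is the closest Rust mechanism I'm sure of.

```rust
/// Hypothetical ring type that carries the SQPOLL choice as a
/// compile-time parameter.
pub struct Ring<const SQPOLL: bool>;

impl<const SQPOLL: bool> Ring<SQPOLL> {
    // Evaluated at compile time for every instantiation that uses it.
    // The assertion fails the build when SQPOLL is true.
    const SQPOLL_UNSUPPORTED: () = assert!(
        !SQPOLL,
        "SQPOLL is disabled until a full memory fence is available"
    );

    pub fn new() -> Self {
        // Referencing the constant forces the check for this SQPOLL value.
        let () = Self::SQPOLL_UNSUPPORTED;
        Self
    }

    pub fn sqpoll_enabled(&self) -> bool {
        SQPOLL
    }
}
```

With this shape, `Ring::<false>::new()` compiles normally, while `Ring::<true>::new()` should be rejected by the compiler with the message above, so the unsound mode cannot be selected by accident.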
Darkmatter (2mo ago)
I think there will need to be later work on a high-level API on top of io_uring for general use, since we need something that expresses the ownership transfer of the buffers properly. io_uring as a primary io API is something I want to push for, but it will mean convincing people to break from the traditional io model of "you provide the buffer" for everything.
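To make the ownership-transfer point concrete, here is a toy Rust sketch. No kernel or real ring is involved: `FakeRing`, `submit_read`, and `complete` are invented for illustration, and the completion data is passed in by hand. The only thing it shows is the API shape, where the buffer is moved into the ring on submit and handed back on completion.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a ring. It only tracks which
/// buffers are currently owned by in-flight operations.
pub struct FakeRing {
    next_id: u64,
    in_flight: HashMap<u64, Vec<u8>>,
}

impl FakeRing {
    pub fn new() -> Self {
        FakeRing { next_id: 0, in_flight: HashMap::new() }
    }

    /// Takes the buffer by value, so the caller cannot read or free it
    /// while the operation is pending. Returns an operation id.
    pub fn submit_read(&mut self, buf: Vec<u8>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.in_flight.insert(id, buf);
        id
    }

    /// Simulates a completion: copies `data` into the stored buffer and
    /// returns the byte count together with the buffer itself.
    /// Returns None if the id is unknown or already completed.
    pub fn complete(&mut self, id: u64, data: &[u8]) -> Option<(usize, Vec<u8>)> {
        let mut buf = self.in_flight.remove(&id)?;
        let n = data.len().min(buf.len());
        buf[..n].copy_from_slice(&data[..n]);
        Some((n, buf))
    }
}
```

Compared with the traditional "you provide the buffer" model of passing a borrowed slice, moving the buffer in and out lets the type system rule out touching memory that the kernel may still be writing to.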
Dmitry Salin (2mo ago)
Yeah, that is the hardest part. One example of a higher-level abstraction: https://tigerbeetle.com/blog/a-friendly-abstraction-over-iouring-and-kqueue
A Programmer-Friendly I/O Abstraction Over io_uring and kqueue
The financial transactions database to power the next 30 years of Online Transaction Processing.