Copy-and-Patch compilation in Mojo
Python is going to implement copy-and-patch compilation in 3.13 (https://fredrikbk.com/publications/copy-and-patch.pdf). It appears to be faster than, and aims to greatly reduce startup time over, LLVM -O0. What is Mojo's approach to this? Could the technique be applied somehow, or are the benefits not as high?
16 Replies
Python's JIT compiler will be specifically designed to trade low-ish performance for fast startup and simple code to maintain (mentioned in this talk: https://youtu.be/HxSHIpEQRjs?si=a0zTAb2xlI8XZ13D).
Since Mojo aims to reach the execution speeds of C++ and Rust, I think the Modular team will take a different approach to the Mojo JIT compiler.
Brandt Bucher – A JIT Compiler for CPython
From the 2023 CPython Core Developer Sprint
The QA section is hard to understand; turn on subtitles for our best-effort transcription. (PRs welcome: https://github.com/encukou/subtitles-for-brandts-talk/blob/main/jit.en.vtt)
Links & bibliography:
Slides: https://github.com/brandtbucher/brandtbucher/blob/master/2023/10/10/a_jit_compiler_for_cp...
The main challenge is startup time: JIT-compiled code can take much longer to get going than an interpreter. I am wondering if Mojo's current approach addresses this issue.
After a rough read of the paper, I don’t think Mojo uses or will use similar techniques in its JIT compiler.
Do you see flaws?
Not really flaws. That compilation model stitches existing codelets together (going from a very high level directly to "binary"), while Mojo embraces progressive lowering through MLIR; the two approaches are incompatible IMO.
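For context, here's a toy sketch of the copy-and-patch idea in Python. This is purely conceptual: the real technique copies pre-compiled machine-code *stencils* (built ahead of time by a C compiler) and patches "holes" in them with concrete operands, whereas this sketch just patches byte strings. All names here are hypothetical.

```python
# Toy illustration of copy-and-patch (conceptual only; real implementations
# patch machine-code stencils, not byte templates like these).

HOLE = b"\xde\xad\xbe\xef"  # 4-byte placeholder baked into each stencil

# Pretend "stencils": pre-built code templates with holes for operands.
# The opcode bytes below are just illustrative, not real encodings.
STENCILS = {
    "LOAD_CONST": b"\x48\xb8" + HOLE,
    "ADD":        b"\x48\x05" + HOLE,
}

def patch(stencil: bytes, value: int) -> bytes:
    """Copy the stencil and fill its hole with a concrete operand."""
    return stencil.replace(HOLE, value.to_bytes(4, "little"))

def jit(ops: list[tuple[str, int]]) -> bytes:
    """Concatenate patched stencils -- no IR, no optimization passes."""
    return b"".join(patch(STENCILS[name], arg) for name, arg in ops)

code = jit([("LOAD_CONST", 2), ("ADD", 40)])
```

This also illustrates the incompatibility mentioned above: the output is final code the moment the stencils are stitched, so there is no intermediate representation left for further optimization passes to work on, whereas MLIR's progressive lowering keeps optimizable IR around at every stage.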
Binary size could greatly expand with copy-and-patch?
Not just that: the result is no longer in a form that is fit for further optimisation. In other words, you can't spend more compile time on it to achieve better runtime performance, which is also crucial for Mojo.
I see! Thanks for clarification 😉
Bigger gains could come from, e.g., rewriting the parser from C++ to Mojo and making better use of the hardware during compilation.
Something like that, I guess. Mojo kinda has to go with JIT techniques that don't hinder AOT compilation in any way. I would expect those two flows to share quite a bit of infrastructure.
Seems very reasonable. Jeff mentioned in his LLVM talk that the Mojo compiler compiles every code object in parallel, which also looks very promising 😉
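A rough sketch of what compiling independent code objects in parallel looks like (not Mojo's actual implementation; `compile_unit` is a hypothetical stand-in for real per-function codegen):

```python
# Conceptual sketch: independent compilation units have no compile-time
# dependency on each other's generated code, so they can be lowered
# concurrently by a worker pool.
from concurrent.futures import ThreadPoolExecutor

def compile_unit(name: str) -> str:
    # Stand-in for real per-function codegen work.
    return f"compiled:{name}"

def compile_module(units: list[str]) -> dict[str, str]:
    with ThreadPoolExecutor() as pool:
        return dict(zip(units, pool.map(compile_unit, units)))

artifacts = compile_module(["main", "helper", "util"])
```

The design point is that parallelism attacks compile latency without constraining code quality, so it composes with both the JIT and AOT flows, unlike copy-and-patch, which trades code quality for speed.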
Oddly similar to PyTorch eager mode without JIT
If Mojo could reduce the startup time of JIT-compiled code to unnoticeable levels, it would be a clear winner among languages with a JIT 🙂
I don’t find Mojo's startup time to be slow (given my limited time using it on toy-size codebases). Do you find the current startup time to be annoying?
It depends 😉 Without optimizations related to strings, compilation + execution can take much longer than in Python 🙂 I've reported an issue on this
Ah, I remembered your unrolled print loop slowdown issue.
Yeah, this one. I will check the timings again as soon as it's fixed