Mojo for-loop performance
https://github.com/sstadick/rust-vs-mojo-loop
While profiling other code and trying to get my perf to match Rust, I noticed that a plain for-loop seemed to account for a large part of the difference. I'm not great with assembly, but from what was generated it looked like Rust was able to skip bounds checks when indexing into the arrays, since the length of the array is given in the range.
Has anyone else run into this? While this is a toy example, I've run into it in more complex scenarios and with real data as well.
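To make the bounds-check point concrete, here is a minimal Rust sketch of the pattern I mean. It is not the code from the repo; `dot_indexed` and `dot_iter` are hypothetical names used only for illustration. Whether the checks are actually removed depends on the compiler version and optimization level, so treat it as an assumption to verify in the generated assembly.

```rust
/// Sums element-wise products by indexing both slices.
/// Because `i` ranges over `0..a.len()`, the optimizer can usually prove
/// that `a[i]` is in bounds; the up-front assert gives it the same fact
/// for `b[i]`. (Assumption: this holds in optimized builds; check the asm.)
fn dot_indexed(a: &[u64], b: &[u64]) -> u64 {
    assert_eq!(a.len(), b.len());
    let mut total = 0u64;
    for i in 0..a.len() {
        total = total.wrapping_add(a[i].wrapping_mul(b[i]));
    }
    total
}

/// Same computation with iterators, which never index the slices,
/// so there are no bounds checks to remove in the first place.
fn dot_iter(a: &[u64], b: &[u64]) -> u64 {
    a.iter()
        .zip(b)
        .fold(0u64, |acc, (x, y)| acc.wrapping_add(x.wrapping_mul(*y)))
}
```

Both functions return the same value for equal-length inputs, so comparing their generated assembly is one way to see whether the indexed version still carries checks.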
The two programs in question are in the repo linked above, which also has a benchmarking script.
2 Replies
Possibly related: I posted a bug demonstrating the difference in perf when using range(start, end) vs. just range(end): https://github.com/modularml/mojo/issues/3931
Is range somehow getting in the way of optimizations?
Possibly largely answered by this: https://discord.com/channels/1087530497313357884/1151418895417233429/1326217184963334217
Since it's comparing two binaries running, if Mojo has a slow startup time, that could be it.
Startup is slower, but it doesn't explain the full delta between Rust and Mojo in the above programs, especially when the loops are large.
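One way to take startup out of the comparison is to time the loop inside the process rather than timing the whole binary. A rough sketch on the Rust side, assuming a hypothetical `sum_indexed` as a stand-in for the real loop and a hypothetical `time_it` helper (neither is from the repo):

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `work` once and returns its result together with the elapsed
/// wall-clock time. `black_box` keeps the optimizer from discarding
/// the result of the work being measured.
fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = black_box(work());
    (out, start.elapsed())
}

/// Hypothetical stand-in for the loop being benchmarked.
fn sum_indexed(data: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..data.len() {
        total = total.wrapping_add(data[i]);
    }
    total
}
```

Calling `time_it(|| sum_indexed(&data))` and printing the returned `Duration` gives a number that excludes process startup; the Mojo program would need an equivalent in-process timer for the two figures to be comparable.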