jon-chuang
Modular
Created by Kyle Hassold on 10/28/2023 in #questions
Fastest Matrix Multiplication
Is all of the computation done as pure codegen through MLIR?
To my understanding, Mojo takes a compiler-first approach, i.e. no hand-tuned kernel lowering: what you code is what gets compiled and codegen-ed.
20 replies
Modular
Created by Kyle Hassold on 10/28/2023 in #questions
Fastest Matrix Multiplication
@Alex Kirchhoff, how might I get access to the AI Engine to run an apples-to-apples comparison? 🙂
20 replies
Modular
Created by Kyle Hassold on 10/28/2023 in #questions
Fastest Matrix Multiplication
Hi Alex, I improved the matmul example's perf by 2.3x by implementing tile swizzling (for L3 cache locality), better unrolling, and better loop ordering. Should I make a PR to improve the example? Would be keen to see how close it's gotten to AI Engine perf 🙂
20 replies
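For readers unfamiliar with the terms in the last message, here is a rough sketch of what tile swizzling and loop reordering can look like. It is not the code from the thread and it is plain Python rather than Mojo; the function names, the `tile` and `group` parameters, and the grouped traversal order are all assumptions made for illustration. Pure Python lists will not reproduce the 2.3x speedup; the sketch only shows the iteration pattern.

```python
def matmul_naive(A, B):
    """Reference i-j-k triple loop over lists of lists."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            C[i][j] = acc
    return C


def swizzled_tiles(n_tile_rows, n_tile_cols, group):
    """Yield (tile_row, tile_col) pairs in a grouped order (hypothetical helper).

    Instead of sweeping one full tile-row at a time, `group` tile-rows are
    visited together for each tile-column, so the same column block of B is
    reused by neighbouring tiles while it is likely still in cache.
    """
    for row0 in range(0, n_tile_rows, group):
        rows = range(row0, min(row0 + group, n_tile_rows))
        for tc in range(n_tile_cols):
            for tr in rows:
                yield tr, tc


def matmul_tiled(A, B, tile=4, group=2):
    """Tiled matmul using the swizzled tile order and an i-p-j loop order."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    n_tr = (n + tile - 1) // tile  # number of tile-rows in C
    n_tc = (m + tile - 1) // tile  # number of tile-columns in C
    for tr, tc in swizzled_tiles(n_tr, n_tc, group):
        i0, j0 = tr * tile, tc * tile
        i1, j1 = min(i0 + tile, n), min(j0 + tile, m)
        for i in range(i0, i1):
            Ci, Ai = C[i], A[i]
            for p in range(k):
                # i-p-j order: the innermost loop walks B[p] and C[i]
                # contiguously instead of striding down a column of B.
                a, Bp = Ai[p], B[p]
                for j in range(j0, j1):
                    Ci[j] += a * Bp[j]
    return C
```

Calling `matmul_tiled(A, B)` should give the same result as `matmul_naive(A, B)`; only the order in which output tiles and inner products are visited changes. In a compiled language the inner `j` loop is also the natural target for unrolling and vectorization.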