jon-chuang Comments - Answer Overflow

jon-chuang

•Created by Kyle H on 10/28/2023 in #questions

Fastest Matrix Multiplication

Is all of the computation done as pure codegen through MLIR?

To my understanding, Mojo takes a compiler-first approach, i.e. there is no lowering to hand-tuned kernels: what you code is what gets compiled and codegen-ed.

20 replies

•Created by Kyle H on 10/28/2023 in #questions

Fastest Matrix Multiplication

@Alex Kirchhoff how might I get access to the AI Engine to perform an apples-to-apples comparison? 🙂

20 replies

•Created by Kyle H on 10/28/2023 in #questions

Fastest Matrix Multiplication

PR here: https://github.com/modularml/mojo/pull/1280

20 replies

•Created by Kyle H on 10/28/2023 in #questions

Fastest Matrix Multiplication

Hi Alex, I improved the matmul example's perf by 2.3x by implementing tile swizzling (for the L3 cache), better unrolling, and better loop ordering. Should I make a PR to improve the example? I'd be keen to see how close it has gotten to AI Engine perf 🙂

20 replies
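
For readers unfamiliar with the terms in the last comment, here is a minimal Python sketch of two of the ideas: visiting output tiles in a "swizzled" (grouped) order so that consecutive tiles reuse the same input data, and using an i-k-j loop order so the innermost loop walks contiguous rows. This is not the code from the PR. The function names, the `tile` and `group` parameters, and the specific grouping scheme are all assumptions made for illustration, and plain Python will not show the speedup; it only shows the traversal pattern.

```python
def swizzled_tile_order(tiles_m, tiles_n, group):
    """Return (tile_row, tile_col) pairs in a grouped order.

    Hypothetical scheme: visit `group` tile rows for each tile column
    before moving to the next column, so the same columns of B are
    reused by several consecutive tiles.
    """
    order = []
    for row_start in range(0, tiles_m, group):
        row_end = min(row_start + group, tiles_m)
        for col in range(tiles_n):
            for row in range(row_start, row_end):
                order.append((row, col))
    return order


def matmul_tiled(a, b, tile=2, group=2):
    """Multiply list-of-lists matrices a (m x k) and b (k x n) tile by tile."""
    m, k = len(a), len(a[0])
    n = len(b[0])
    c = [[0.0] * n for _ in range(m)]
    tiles_m = (m + tile - 1) // tile  # ceiling division
    tiles_n = (n + tile - 1) // tile
    for ti, tj in swizzled_tile_order(tiles_m, tiles_n, group):
        i0, j0 = ti * tile, tj * tile
        i1, j1 = min(i0 + tile, m), min(j0 + tile, n)
        for i in range(i0, i1):
            row_c = c[i]
            # i-k-j order: the innermost loop reads b[p] and writes c[i]
            # along contiguous rows instead of striding down a column.
            for p in range(k):
                a_ip = a[i][p]
                row_b = b[p]
                for j in range(j0, j1):
                    row_c[j] += a_ip * row_b[j]
    return c


def matmul_naive(a, b):
    """Reference i-j-k triple loop used to check the tiled version."""
    m, k, n = len(a), len(a[0]), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]
```

In a compiled language the tile size would be chosen to fit the target cache level, and the inner `j` loop is where unrolling and vectorization would apply; neither is modeled here.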