Marvee Amasi Comments - Answer Overflow

Marvee Amasi

•Created by Marvee Amasi on 7/31/2024 in #✅-code-review

How can I optimize matrix multiplication performance and reduce L3 cache misses in my C++ library?

Thanks guys. My matrix multiplication suffered from poor cache locality: it was accessing the elements of the matrices in a scattered manner. The fix didn't need anything complex 👌. I divided the matrices into smaller blocks with a blocking function, so more of the data is likely to already be in the processor's cache when it's needed, which reduces the number of costly memory accesses.
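For anyone landing here later, this is a minimal sketch of what loop blocking (tiling) can look like. It is not the actual code from the library. It assumes square, row-major matrices stored in flat `std::vector<double>` buffers, and the name `matmul_blocked` and the default `block` size of 64 are made up for illustration. The right block size depends on the cache sizes of the target CPU, so treat it as a tuning parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original library): computes C += A * B
// for row-major n x n matrices stored in flat vectors.
// `block` is the tile edge length; pick it so a few tiles fit in cache together.
void matmul_blocked(const std::vector<double>& A,
                    const std::vector<double>& B,
                    std::vector<double>& C,
                    std::size_t n,
                    std::size_t block = 64) {
    assert(block > 0);
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);

    // Outer three loops walk over tiles; std::min handles n not divisible by block.
    for (std::size_t ii = 0; ii < n; ii += block) {
        const std::size_t i_end = std::min(ii + block, n);
        for (std::size_t kk = 0; kk < n; kk += block) {
            const std::size_t k_end = std::min(kk + block, n);
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t j_end = std::min(jj + block, n);

                // Inner three loops multiply one tile of A by one tile of B.
                // The i-k-j order keeps the innermost accesses to B and C contiguous.
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Note that `C` has to be zero-initialized before the call if you want a plain product rather than an accumulation. It is worth benchmarking a few block sizes (for example 32, 64, 128) on the real workload, since the best value varies between machines.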

10 replies
