Tutorial / Example of targeting GPU with llvm_intrinsic or __mlir_op etc?
Does anyone know of any resources (other than https://docs.modular.com/mojo/notebooks/BoolMLIR.html#adding-functionality-with-mlir) that show examples of, or the methodology behind, targeting the GPU from Mojo code?
@benny the only thing public on that at the moment is this from the ModCon keynote: https://www.youtube.com/watch?v=uCVdX5oS34U&t=9s
And the Breakout session here: https://www.youtube.com/watch?v=QD-svwZistc&list=PLh0S94-sJw_4YdNBF998xF7NiaEgbMLh3&index=2
There'll be more coming, but it's still a work in progress
Is the code referenced in the keynote / breakout public or is it just what is included in the video?
That's right, it's not public yet sorry
All good :) Are the features still accessible though, things like __mlir_op?
I'm not sure if that'd work for the SDK yet
from what i’ve tried it works but error messages are pretty off and the type checking is hard to get around without a bunch of extra calls
does llvm_intrinsic work?
Potentially, you could experiment with https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html
Great blog post by the way!
EXACTLY what I was looking for, thanks Jack :)
thank you!
Would you be able to share how you would call something like fsub? I have tried llvm.fsub, llvm.fsub.f32, llvm.operations.binary.fsub, etc, nothing seems to be working
Perhaps llvm.nvvm.fsub, not sure if it'll work for you though
What about llvm.vp.fsub? llvm.nvvm.fsub says not found, but I'm getting this issue with fneg:
let neg_x = llvm_intrinsic["llvm.vp.fneg", SIMD[DType.float32, nelts]](x, SIMD[DType.bool, nelts].splat(True), nelts)
call intrinsic signature float (float, i1, i64) to overloaded intrinsic "llvm.vp.fneg" does not match any of the overloads
I assume this is an issue with SIMD, but I am not sure
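One possible reading of the error above: the LLVM Language Reference declares the vector-predicated intrinsics only for vector operands, with an i32 explicit vector length, for example declare <16 x float> @llvm.vp.fneg.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>). The reported signature float (float, i1, i64) is scalar with an i64 length, which would suggest nelts was 1 and the length was passed as a Mojo Int. The sketch below is untested and uses the Mojo syntax of this thread's era; the width of 4, the Int32 cast, and the function names vp_neg and plain_neg are assumptions for illustration, not something confirmed in the thread.

```mojo
from sys.intrinsics import llvm_intrinsic

# Assumed width for illustration; it must be greater than 1 so that the
# SIMD value lowers to an LLVM vector rather than a scalar float.
alias nelts = 4

# Hypothetical helper: negate via the vector-predicated intrinsic.
fn vp_neg(x: SIMD[DType.float32, nelts]) -> SIMD[DType.float32, nelts]:
    # All-true mask: every lane is active.
    var mask = SIMD[DType.bool, nelts].splat(True)
    # Int32 (not Int) so the explicit vector length is passed as i32.
    return llvm_intrinsic["llvm.vp.fneg", SIMD[DType.float32, nelts]](
        x, mask, Int32(nelts)
    )

# Hypothetical helper: negate with the ordinary SIMD operator.
fn plain_neg(x: SIMD[DType.float32, nelts]) -> SIMD[DType.float32, nelts]:
    # fneg, fsub and fadd are LLVM instructions, not intrinsics, so there is
    # no "llvm.*" intrinsic name for them to look up.
    return -x
```

If the intrinsic call still fails to match an overload, the plain operators on SIMD (-x, x - y, x + y) are the more direct way to express the same arithmetic.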
Okay, I've gotten it working, but I'm having a few issues:
1. It is slower than the native CPU implementation (this could be an issue with the code structure, though).
2. The most basic operations, like fneg, sub, and add, are not prefixed with llvm, so they are blocked by the llvm_intrinsic command. Is there an alternative?
Not that I'm aware of, I haven't ventured down that path yet. A GPU-related module for Mojo is coming in the future though, so you can use the language itself. It's just not ready yet.
Perfect Jack, thanks 🔥