Will mojo support JIT compilation?

From what I understand, Mojo already has some JIT capabilities (e.g. in notebooks), but I'm talking about things like being able to pass runtime variables to parameters (not arguments) in functions.
18 Replies
Stijn
Stijn4mo ago
+1
Three chickens in the green bag
I’m pretty sure it already does, just through the REPL
Stijn
Stijn4mo ago
Could you share an example?
gryznar
gryznar4mo ago
JIT compilation could be really useful for handling computations on floats. During compilation, Mojo uses infinite precision, so being able to mark some runtime code to be JITted could be awesome 🙂
Three chickens in the green bag
I imagine it would significantly increase the binary size though, because the compiler has to ship along with the binary in order to JIT, so it would come with a trade-off
NKspartan
NKspartan4mo ago
Yeah, but I'm talking about things like being able to pass runtime variables to parameters (not arguments) in functions. Right now there isn't a way to do that kind of thing, from what I know
sora
sora4mo ago
Mojo supports just-in-time compilation in the sense that only compiled code runs, even in the REPL. But PyPy-style "run time specialisation" is fundamentally incompatible with Mojo's compilation strategy, I think.
NKspartan
NKspartan4mo ago
I'm asking about this because if Mojo wants to be an AI language, I think things like passing runtime values to parameters should be possible. Isn't that the reason JAX, PyTorch and others can do so much while the code stays easy and simple to write: using Python as the runtime and compiling from there to create efficient code? But maybe I'm wrong, because I don't understand how the MAX engine works; since I don't know how they do things in MAX, maybe that's why I think it's necessary.
sora
sora4mo ago
@NKspartan Ah, I see. If what you have in mind is JAX, then it's a bit different: JAX is a domain-specific compiler, and MAX is like that too; it's essentially a separate compiler. JAX uses XLA as its backend compiler, which is written in C++, but it's clear that the JIT-related features are not built into C++ the language, right? Similarly, a domain-specific compiler can be built with Mojo, but I think it's best not to see it as part of "the Mojo compiler".
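To make that concrete, here's a rough Python sketch (hypothetical, not real JAX or MAX internals) of the "specialise on a runtime value, then cache the compiled result" pattern that jax.jit-style systems implement; the `specialise` and `power` names are made up for illustration:

```python
# Hypothetical sketch of runtime specialisation, in the spirit of jax.jit.
# A "compiler" turns a function plus a concrete runtime value into a
# specialised callable; results are cached so each value compiles once.

_cache = {}
compile_count = 0

def specialise(f, n):
    """Return a version of f specialised for the runtime value n."""
    global compile_count
    key = (f, n)
    if key not in _cache:
        compile_count += 1  # stands in for an expensive compile step
        # A real system would trace f, bake n in as a constant, and emit
        # optimised code; here we just close over n.
        _cache[key] = lambda x: f(x, n)
    return _cache[key]

def power(x, n):
    return x ** n

cube = specialise(power, 3)   # "compiles" power for n=3
print(cube(2))                # 8
specialise(power, 3)          # cache hit: no recompilation
print(compile_count)          # 1
```

The point being: the specialisation machinery lives in a library/runtime, not in the host language's compiler.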
NKspartan
NKspartan4mo ago
Okay, so you're saying MAX is also a separate compiler (that's what I was somewhat thinking), in the same way JAX uses XLA. So now my question is how I could write that in Mojo hahah. But thank you for your help sora, as always.
sora
sora4mo ago
IIUC, MAX has an additional advantage: it can compile/fuse your ML model code at the lower levels as well. Consider this code:
fn f(x: Float64) -> Float64:
    return 2 * sin(x) + cos(x)
the front end might lower it to something like this
module {
  func @f(%arg0: tensor) -> tensor {
    %sin = math.sin %arg0
    %cos = math.cos %arg0
    %double_sin = arith.mulf %sin, constant(2.0 : tensor)
    %result = arith.addf %double_sin, %cos
    return %result : tensor
  }
}
An ML compiler might simplify it further into something like
module {
  func @f(%arg0: tensor) -> tensor {
    %sin = math.sin %arg0
    %cos = math.cos %arg0
    %double_sin = arith.addf %sin, %sin // this may or may not be beneficial
    %result = arith.addf %double_sin, %cos
    return %result : tensor
  }
}
Now, the runtime provided by XLA will lower this further and call the corresponding prewritten kernels (in C++). One can imagine the C++ implementations of sin and cos sharing some intermediate results. However, since the DSL compiler cannot see through their implementations, it cannot reason across the kernel boundaries, nor can it fuse them further. MAX, on the other hand, does not have this limitation: it benefits from techniques such as Common Subexpression Elimination (CSE) even at the lowest level, because the implementations of the kernels themselves go through MLIR
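As a toy illustration of that kind of sharing (plain Python, hypothetical, nothing like what MAX actually emits): sin(x) and cos(x) can be computed from one shared intermediate, e^{ix}, and 2*sin(x) can be rewritten as sin(x)+sin(x), mirroring the IR rewrite above:

```python
import cmath
import math

def f_naive(x):
    # Two independent "kernel" calls plus a multiply.
    return 2 * math.sin(x) + math.cos(x)

def f_fused(x):
    # One shared intermediate: e^{ix} = cos(x) + i*sin(x),
    # standing in for kernel-level CSE across sin and cos.
    e = cmath.exp(1j * x)
    s, c = e.imag, e.real
    return (s + s) + c  # 2*sin(x) rewritten as sin(x)+sin(x)

x = 1.2345
assert abs(f_naive(x) - f_fused(x)) < 1e-12
```

A compiler that only sees opaque `sin`/`cos` calls can never discover the shared `e^{ix}`; one that sees their implementations can.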
NKspartan
NKspartan4mo ago
Okay, so MAX can find even more optimizations thanks to also lowering the kernels to MLIR. One question: the first MLIR code, did you write it manually or did you get it from running the function in Mojo? I'd like to know how to see the compiled MLIR code from Mojo
sora
sora4mo ago
As far as I know, it's not a feature we can use yet. Sad.