Will mojo support JIT compilation?

From what I understand, Mojo already has some JIT capabilities (e.g. in notebooks), but I'm talking about things like being able to pass runtime variables to parameters (not arguments) in functions.
18 Replies
Stijn
Stijn4mo ago
+1
Three chickens in the green bag
I’m pretty sure it already does, just through the REPL
Stijn
Stijn4mo ago
Could you share an example?
gryznar
gryznar4mo ago
JIT compilation could be really useful for handling computations on floats. During compilation, Mojo uses infinite precision, so being able to mark some runtime code to be JITted could be awesome 🙂
Three chickens in the green bag
I imagine it would significantly increase the binary size though, because the compiler has to ship along with the binary in order to JIT, so it would come with a trade-off
NKspartan
NKspartan4mo ago
Yeah, but I'm talking about things like being able to pass runtime variables to parameters (not arguments) in functions. Right now there isn't a way to do that kind of thing, from what I know
sora
sora4mo ago
Mojo supports just-in-time compilation in the sense that only compiled code runs, even in the REPL. But PyPy-style "run time specialisation" is fundamentally incompatible with Mojo's compilation strategy, I think.
NKspartan
NKspartan4mo ago
I'm asking about this because if Mojo wants to be an AI language, I think things like passing runtime values to parameters should be possible. Isn't that the reason JAX, PyTorch and others can do so much while the code stays easy and simple to write: using Python as the runtime and compiling from there to create efficient code? But maybe I'm wrong, because I don't understand how the MAX engine works; since I don't know how they do things in MAX, maybe that's why I think it's necessary.
sora
sora4mo ago
@NKspartan Ah, I see. If what you have in mind is JAX, then it's a bit different: JAX is a domain-specific compiler, and MAX is like that too; it's essentially a separate compiler. JAX uses XLA as its backend compiler, which is written in C++, but it's clear that the JIT-related features are not built into C++ the language, right? Similarly, a domain-specific compiler can be built with Mojo, but I think it's best not to see it as part of "the Mojo compiler".
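To make that concrete, here's a rough Python sketch (hypothetical, not real JAX or MAX internals) of the "specialise on a runtime value, then cache the compiled result" pattern that jax.jit-style systems implement; the `specialise` and `power` names are made up for illustration:

```python
# Hypothetical sketch of runtime specialisation, in the spirit of jax.jit.
# A "compiler" turns a function plus a concrete runtime value into a
# specialised callable; results are cached so each value compiles once.

_cache = {}
compile_count = 0

def specialise(f, n):
    """Return a version of f specialised for the runtime value n."""
    global compile_count
    key = (f, n)
    if key not in _cache:
        compile_count += 1  # stands in for an expensive compile step
        # A real system would trace f, bake n in as a constant, and emit
        # optimised code; here we just close over n.
        _cache[key] = lambda x: f(x, n)
    return _cache[key]

def power(x, n):
    return x ** n

cube = specialise(power, 3)   # "compiles" power for n=3
print(cube(2))                # 8
specialise(power, 3)          # cache hit: no recompilation
print(compile_count)          # 1
```

The point being: the specialisation machinery lives in a library/runtime, not in the host language's compiler.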
NKspartan
NKspartan4mo ago
Okay, so you're saying MAX is also a separate compiler (that's what I was somewhat thinking), in the same way JAX uses XLA. So now my question is how I could write that in Mojo hahah. But thank you for your help sora, as always.
sora
sora4mo ago
IIUC, MAX has an additional advantage: it can compile/fuse your ML model code at the lower levels as well. Consider this code:
fn f(x: Float64) -> Float64:
    return 2 * sin(x) + cos(x)
the front end might lower it to something like this
module {
  func @f(%arg0: tensor) -> tensor {
    %sin = math.sin %arg0
    %cos = math.cos %arg0
    %double_sin = arith.mulf %sin, constant(2.0 : tensor)
    %result = arith.addf %double_sin, %cos
    return %result : tensor
  }
}
An ML compiler might simplify it further into something like
module {
  func @f(%arg0: tensor) -> tensor {
    %sin = math.sin %arg0
    %cos = math.cos %arg0
    %double_sin = arith.addf %sin, %sin // this may or may not be beneficial
    %result = arith.addf %double_sin, %cos
    return %result : tensor
  }
}
Now, the runtime provided by XLA will lower this further and call the corresponding prewritten kernels (in C++). One can imagine the C++ implementations of sin and cos sharing some intermediate results. However, since the DSL compiler cannot see through their implementations, it cannot reason across the kernel boundaries, nor can it fuse them further. MAX, on the other hand, does not have this limitation: it benefits from techniques such as Common Subexpression Elimination (CSE) even at the lowest level, because the implementations of the kernels themselves go through MLIR
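As a toy illustration of that kind of sharing (plain Python, hypothetical, nothing like what MAX actually emits): sin(x) and cos(x) can be computed from one shared intermediate, e^{ix}, and 2*sin(x) can be rewritten as sin(x)+sin(x), mirroring the IR rewrite above:

```python
import cmath
import math

def f_naive(x):
    # Two independent "kernel" calls plus a multiply.
    return 2 * math.sin(x) + math.cos(x)

def f_fused(x):
    # One shared intermediate: e^{ix} = cos(x) + i*sin(x),
    # standing in for kernel-level CSE across sin and cos.
    e = cmath.exp(1j * x)
    s, c = e.imag, e.real
    return (s + s) + c  # 2*sin(x) rewritten as sin(x)+sin(x)

x = 1.2345
assert abs(f_naive(x) - f_fused(x)) < 1e-12
```

A compiler that only sees opaque `sin`/`cos` calls can never discover the shared `e^{ix}`; one that sees their implementations can.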
NKspartan
NKspartan4mo ago
Okay, so MAX can find even more optimizations thanks to also lowering the kernels to MLIR. One question: the first MLIR code, did you write it manually or did you get it from running the function in Mojo? I'd like to know how to see the compiled MLIR code from Mojo
sora
sora4mo ago
As far as I know, it's not a feature we can use yet. Sad.