Henk-Jan Lebbink Posts - Answer Overflow

Henk-Jan Lebbink

•Created by Henk-Jan Lebbink on 2/29/2024 in #questions

How to tell Mojo that something is intended to be constant?

In mojo 24.1.0 (55ec12d6) I get a new warning:

'let' is being removed, please use 'var' instead

I think it is good practice to make explicit my intention that something is constant. How do I do that (in the future), now that let is deprecated?

71 replies

Modular

•Created by Henk-Jan Lebbink on 2/10/2024 in #questions

Unnecessary nan-checks: performance issue or missing compile options.

I'm not sure whether this is a performance issue or a feature request, so I figured I'd ask here first. The issue is a performance regression caused by an unnecessary NaN check for (e.g.) max and min operations.

from random import random_ui64
from time import now

fn gen_random_SIMD[T: DType, width: Int]() -> SIMD[T, width]:
    var result = SIMD[T, width]()
    for i in range(width):
        result[i] = random_ui64(0, 100).cast[T]()
    return result

fn main():
    let data0 = gen_random_SIMD[DType.float64, 8]()
    let data1 = gen_random_SIMD[DType.float64, 8]()
    
    let start_time_ns = now()
    let data2 = data0.max(data1)  # we are interested in how max is handled.
    let elapsed_time_ns = now() - start_time_ns

    print(data2)
    print("Elapsed time " + str(elapsed_time_ns) + " ns")

<+278>:   call   0x5470 <clock_gettime@plt>
<+283>:   mov    rbx,QWORD PTR [rsp+0x40]
<+288>:   mov    rax,QWORD PTR [rsp+0x48]
<+293>:   mov    QWORD PTR [rsp+0x70],rax
<+298>:   vmovapd zmm0,ZMMWORD PTR [rsp+0xc0]
<+306>:   vmovapd zmm2,ZMMWORD PTR [rsp+0x100]
<+314>:   vmaxpd zmm1,zmm2,zmm0
<+320>:   vcmpunordpd k1,zmm0,zmm0
<+327>:   vmovapd zmm1{k1},zmm2
<+333>:   vmovapd ZMMWORD PTR [rsp+0xc0],zmm1
...
<+364>:   call   0x5470 <clock_gettime@plt>

+298 and +306 load data0 and data1.
+314 calculates the maximum of zmm0 and zmm2 and stores the result in zmm1.
+320 sets mask register k1 for the lanes where zmm0 (data0) contains NaN values.
+327 overwrites the result (zmm1) with the value of data1 (zmm2) in the lanes where zmm0 was NaN.
+333 writes the result back to memory.

If data0 could contain NaN values, the above assembly would be correct. But when data0 is known not to contain such values, the code has a performance regression, because a NaN check is performed for every float min/max operation. This is something I would like to control in HPC AI workloads.

Q: Is this a regression bug or something else (for which I need to make a feature request)?
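
The instruction sequence walked through above can be mimicked lane by lane in plain Python. This is only a sketch, not Mojo code and not a claim about Mojo's standard library: the helper names `vmaxpd_lane` and `max_with_nan_fixup` are made up for illustration, and the NaN behaviour of `vmaxpd` (it returns its second source operand whenever the comparison is false, which includes every comparison involving NaN) is assumed from the documented x86 semantics.

```python
import math


def vmaxpd_lane(src1, src2):
    # x86 max rule for one lane: src1 is returned only if src1 > src2.
    # Any comparison with NaN is false, so src2 is returned in that case.
    return src1 if src1 > src2 else src2


def max_with_nan_fixup(data0, data1):
    # Emulates the three instructions at +314, +320 and +327 per lane.
    result = []
    for zmm0, zmm2 in zip(data0, data1):
        zmm1 = vmaxpd_lane(zmm2, zmm0)  # +314: vmaxpd zmm1,zmm2,zmm0
        k1 = math.isnan(zmm0)           # +320: vcmpunordpd k1,zmm0,zmm0
        if k1:
            zmm1 = zmm2                 # +327: vmovapd zmm1{k1},zmm2
        result.append(zmm1)
    return result


def max_without_fixup(data0, data1):
    # What remains if the compare and the masked move are dropped.
    return [vmaxpd_lane(zmm2, zmm0) for zmm0, zmm2 in zip(data0, data1)]
```

With the fix-up, a NaN in either input lane yields the other operand; without it, a NaN in data0 propagates into the result. The two variants agree whenever data0 is NaN-free, which is the case the question is about.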

6 replies

Modular

•Created by Henk-Jan Lebbink on 2/10/2024 in #questions

How to rewrite this code into something not ugly

I need to call shuffle with a parameter mask of different lengths; the following code is the shortest that I could make. Please fill in the dots to appreciate what would happen with width 1024. Any thoughts are appreciated.

fn my_shuffle[T: DType, width: Int, p: StaticIntTuple[width]](v: SIMD[T, width]) -> SIMD[T, width]:
    @parameter
    if width == 8:
        return v.shuffle[
            p[0],
            p[1],
            p[2],
            p[3],
            p[4],
            p[5],
            p[6],
            p[7],
        ]()
    elif width == 16:
        return v.shuffle[
            p[0],
...
            p[15],
        ]()
    elif width == 32:
        return v.shuffle[
            p[0],
...
            p[31],
        ]()
    elif width == 64:
        return v.shuffle[
            p[0],
...   
            p[63],
        ]()
    elif width == 128:
        return v.shuffle[
            p[0],
...
            p[127],
        ]()
    elif width == 256:
        return v.shuffle[
            p[0],
...
            p[255],
        ]()
    else:
        constrained[False]()
        return v
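
For reference, what all the width-specific branches compute can be written once in Python. This is a sketch under the assumption that shuffle places input lane `p[i]` in output lane `i`; it is a runtime analogue and does not answer how to expand the mask at compile time in Mojo.

```python
def my_shuffle(v, p):
    # v: list of lane values; p: permutation mask of the same length.
    # Assumed semantics: output lane i takes the value of input lane p[i].
    if len(p) != len(v):
        raise ValueError("mask length must equal the vector width")
    return [v[i] for i in p]
```

For example, `my_shuffle([10, 20, 30, 40], [3, 2, 1, 0])` reverses the four lanes, independent of the width.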

10 replies

Modular

•Created by Henk-Jan Lebbink on 2/4/2024 in #questions

Prevent inline

Does anyone know a trick or official means to prevent a function from being inlined? Any ugly trick would be fine.

9 replies

Modular

•Created by Henk-Jan Lebbink on 1/4/2024 in #questions

Counter-intuitive operator precedence

While debugging, I spotted a very counter-intuitive operator precedence evaluation. Question: is it just me, or does the following code seem highly susceptible to bugs? If the code is indeed correct, does it serve as a nice example of why we should be cautious about omitting parentheses? A warning would have saved me several hours of debugging...

fn test() -> Bool:
    return True

fn main():
    let A: SIMD[DType.int32, 1] = 10
    let B: SIMD[DType.int32, 1] = 10

    if A != B & test(): # assumed operator precedence
        print("A: not expected but observed")
    else: 
        print("A: expected but not observed")


    if (A != B) & test(): # explicit
        print("B: not expected and not observed")
    else: 
        print("B: expected and observed")
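
The same grouping can be reproduced in Python, whose grammar gives `&` a higher precedence than `!=`. The sketch below uses plain Python integers rather than Mojo SIMD values, so it only illustrates the parse, not Mojo's type behaviour.

```python
import ast


def test():
    return True


A = 10
B = 10

# Without parentheses the expression groups as A != (B & test()).
implicit = A != B & test()
# With parentheses the comparison is evaluated first.
explicit = (A != B) & test()

# The parse tree confirms the grouping: a comparison whose right-hand
# side is a bitwise-and expression.
tree = ast.parse("A != B & test()", mode="eval").body
right_side = tree.comparators[0]
```

Here `B & test()` is `10 & 1`, which is `0`, so the unparenthesized form compares `10 != 0` and takes the first branch, matching the "not expected but observed" output above.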

4 replies

Modular

•Created by Henk-Jan Lebbink on 12/20/2023 in #questions

Does a pragma exist to switch off `mojo format`

I love the convention part of source code formatting rules: teams should not argue (too much) about formatting. Have this discussion once, agree on something that works for everyone (convention), and let the standard tooling handle it. However, teams often agree that it needs to be switched off sometimes. Question: Is there a way to switch off formatting for sections of code that should not be formatted? A @format on/off would do.

5 replies

Modular

•Created by Henk-Jan Lebbink on 12/17/2023 in #questions

How to cast SIMD when the size is known at compile time

In the following minimal code snippet v cannot be assigned to v2. The error message is: cannot implicitly convert 'SIMD[si32, size]' value to 'SIMD[si32, 32]' in 'let' initializer. But size is known to be equal to 32. Question: how do I convince the compiler that I'm a good citizen: trust me, this is ok, everything will be fine...

fn x(v: SIMD[DType.int32, 32]):
    pass

fn howto[size: Int](v: SIMD[DType.int32, size]) -> SIMD[DType.int32, size]:
    @parameter
    if size == 32:
        let v2: SIMD[DType.int32, 32] = v  # compile error here
        x(v2)

3 replies

Modular

•Created by Henk-Jan Lebbink on 12/17/2023 in #questions

Help with sort: it does not seem to do anything.

I'm a bit confused: I have a very simple sort example, but it does not do anything at all. I must be doing something wrong, because I cannot imagine that I'm the first to touch this code. What is going on here? The following code gives as output:

mojo bug1.mojo
before: 0 13 76 46 53 22 4 68 68 94 38 52 83 3 5 53 67 0 38 6 42 69 59 93 85 53 9 66 42 70 91 76
after:  0 13 76 46 53 22 4 68 68 94 38 52 83 3 5 53 67 0 38 6 42 69 59 93 85 53 9 66 42 70 91 76

code:

from random import random_ui64
from algorithm.sort import sort

fn main():
    alias size = 32

    var data_vec = DynamicVector[SIMD[DType.uint32, 1]](size)
    for i in range(size):
        data_vec[i] = random_ui64(0, 100).cast[DType.uint32]()

    print_no_newline("before: ")
    for i in range(size):
        print_no_newline(str(data_vec[i]) + " ")
    print("")

    sort[DType.uint32](data_vec) # inplace sorting does not seem to work

    print_no_newline("after:  ")
    for i in range(size):
        print_no_newline(str(data_vec[i]) + " ")
    print("")

5 replies

Modular

•Created by Henk-Jan Lebbink on 12/16/2023 in #questions

What to use for the permutation mask in SIMD.shuffle?

The following code snippet works; it is however not very convenient, especially when the SIMD width is a parameter.

fn swap[T: DType, s: StaticIntTuple[16]](v: SIMD[T, 16]) -> SIMD[T, 16]:
    let x = v.shuffle[s[0],s[1],s[2],s[3],s[4],s[5],s[6],s[7],s[8],s[9],s[10],s[11],s[12],s[13],s[14],s[15]]()
    ...

Question: does someone know what I'm supposed to pass as mask in:

 shuffle[*mask: Int](self: Self) -> Self

What other type could I use as *mask?

2 replies

Modular

•Created by Henk-Jan Lebbink on 12/12/2023 in #questions

Compile Time Binary Tree in Mojo

I'm new to Mojo (who isn't), and I'm trying the language for something other than ML: metaprogramming. So I made a binary tree type that is generated at compile time. Here is a gist with 72 lines. Question: does such an approach make any sense? Am I pushing the language too far?

1 reply

Modular

•Created by Henk-Jan Lebbink on 12/8/2023 in #questions

What would be the target-triple to cross-compile for Graviton.

From the MAX Engine FAQ

Does MAX Engine support generic ARM architectures? Yes, both Mojo and MAX Engine support generic ARM architectures like Apple ARM chips. We formally benchmark ourselves on Graviton because it’s the most commonly used ARM chip for server deployments, and our benchmarks are designed to match what users use most often in production.

What would be the target triple to cross-compile my Mojo sources to ARM for AWS Graviton? I would like to decompile the binaries and evaluate the quality of the assembly.

3 replies

Modular

•Created by Henk-Jan Lebbink on 12/5/2023 in #questions

Compilation does not terminate for recursive function

I'm porting some C++ with templates, and I realize that there is no such thing as if constexpr in Mojo. I'm trying to write something similar to the following test. However, compilation (not surprisingly) does not terminate. Question: how do other people handle these straightforward recursive functions?

fn test[N: Int]() -> InlinedFixedVector[Int, 1<<N]:
    if N == 0: 
        var result = InlinedFixedVector[Int, 1<<N](1<<N)
        result[0] = 0
        return result
    else:
        alias N2: Int = N-1
        let sub: InlinedFixedVector[Int, 1<<N2] = test[N2]()
        let subsize = 1<<N2

        var result = InlinedFixedVector[Int, 1<<N](1<<N)
        for i in range(subsize):
            result[i] = sub[i]
            result[i + subsize] = sub[i] + subsize
        return result
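
As a reference for what the recursion is meant to produce, here is a runtime analogue in Python (the name `build` is made up for illustration). In Python the `n == 0` test is an ordinary runtime branch, so the recursion stops there; in the snippet above the recursive call carries a compile-time parameter, and a plain `if` presumably does not stop the compiler from instantiating `test[N-1]` again and again. The sketch shows the intended result only and says nothing about how to express the base case in Mojo.

```python
def build(n):
    # Base case: a single element, mirroring the N == 0 branch.
    if n == 0:
        return [0]
    # Recursive case: copy the smaller result and append a shifted copy.
    sub = build(n - 1)
    subsize = 1 << (n - 1)
    result = [0] * (1 << n)
    for i in range(subsize):
        result[i] = sub[i]
        result[i + subsize] = sub[i] + subsize
    return result
```

Each level doubles the length, so `build(n)` has `1 << n` elements.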

3 replies

Modular

•Created by Henk-Jan Lebbink on 11/7/2023 in #questions

How to implement the following simple comparison function

I would like to implement the following simple function, but I cannot find an alternative for AnyType (interface?) that defines comparison operators. In the following code a __ne__ is needed for ai != bi. It's as if I'm missing a manual. If someone knows how to do this, or can help me cope with the fact that it cannot be done...

fn equal_vector[T: AnyType](a: DynamicVector[T], b: DynamicVector[T]) -> Bool:
    # assumed a and b are sorted
    if a.__len__() != b.__len__():
        return False
    for i in range (a.__len__()):
        let ai: T = a.__getitem__(i)
        let bi: T = b.__getitem__(i)
        if ai != bi:  ## compilation error HERE
            return False
    return True

3 replies

Modular

•Created by Henk-Jan Lebbink on 10/19/2023 in #questions

Non-trivial parameters are not matched

What is the expected behaviour while matching parameters? Should they be structurally equal, or is some rewriting on their AST expected? For example, the following code gives this error on the return of doubleCube:

cannot implicitly convert 'SIMD[T, __lshift__(1, __add__(__sub__(N, 1), 1))]' value to 'SIMD[T, __lshift__(1, N)]' in return value

because 1<<((N-1)+1) is not considered equal to 1<<N, which suggests there is no rewriting of parameters.

fn doubleCube[T: DType, N: Int]() -> SIMD[T, (1 << (N+1))]:
    return SIMD[T, (1 << (N+1))](0) #TODO

fn reflectCubeX[T: DType, N: Int]() -> SIMD[T, (1 << N)]:
    if N == 2:
        return doubleCube[T, N-1]()
    else:
        print("ERROR: return identity cube")
        return SIMD[T, 1<<N](0)

mojo 0.4.0 (9e33b013)

Question: is it reasonable to expect that this non-trivial (but not ridiculously complex) parameter 1<<N is matched? Or will this type of template programming be beyond the reach of Mojo?
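
The difference between structural equality and equality after rewriting can be illustrated with a toy expression tree in Python. This is purely a sketch: the tuple encoding and the single rewrite rule `(x - c) + c -> x` are assumptions made for illustration and say nothing about how the Mojo compiler represents or compares parameter expressions.

```python
def simplify(expr):
    # Leaves (parameter names or integer literals) are returned unchanged.
    if not isinstance(expr, tuple):
        return expr
    op = expr[0]
    left = simplify(expr[1])
    right = simplify(expr[2])
    # Single rewrite rule: (x - c) + c  ->  x
    if op == "add" and isinstance(left, tuple) and left[0] == "sub" and left[2] == right:
        return left[1]
    return (op, left, right)


# 1 << ((N - 1) + 1), as reported in the error message
returned_width = ("lshift", 1, ("add", ("sub", "N", 1), 1))
# 1 << N, as declared in the return type
declared_width = ("lshift", 1, "N")

structurally_equal = returned_width == declared_width
equal_after_rewrite = simplify(returned_width) == simplify(declared_width)
```

Compared node by node the two trees differ, which matches the reported error; after the one rewrite they coincide.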

1 reply

Modular

•Created by Henk-Jan Lebbink on 10/19/2023 in #questions

Type system does not recognize that a literal int is equal to an Int alias

I have a simple struct called Cube:

struct Cube[T: DType, N: Int]:
    var value: SIMD[T, (1 << N)]

I've simplified the code; what remains is the following issue. (Obviously, in this contrived example N can just be replaced by 1, but in real-world code I cannot do that.)

fn reflectN1[T: DType]() -> Cube[T, 1] :
    alias N: Int = 1
    var x : Cube[T, N] = Cube[T, N]()
    x.value[0] = 1
    x.value[1] = 0
    return x

Error: cannot implicitly convert 'Cube[T, N]' value to 'Cube[T, 1]' in return value

mojo 0.4.0 (9e33b013)

Question: Does anyone know how to cast my x (of type Cube[T, N]) to Cube[T, 1]? Or is there another way to get the type system to accept this?

5 replies
