Compile time facilities

I am trying to write a function that compares a SIMD variable against a sequence of other SIMD vectors that are determined by a function parameter, and I want the function to do as much as possible at compile time. Ideally, at runtime the function would just have a structure like this:
if variable == reference1:
    return True
if variable == reference2:
    return True
...
return False
Currently I have it like this:
fn _is_winner[size: Int, //, player: UInt8](board: SIMD[DType.uint8, size*size]) -> Bool:
    var reference: SIMD[DType.uint8, board.size]
    @parameter
    for row in range(size):
        reference = SIMD[DType.uint8, board.size](0)
        @parameter
        for col in range(size):
            reference[row * size + col] = player
        if (board & reference).reduce_bit_count() == size:
            return True
    ...
    return False
But I am not sure whether this does everything I want at compile time, or whether there are other things I can do to achieve that.
5 Replies
Darkmatter, 7d ago
@parameter does loop unrolling, so you've accidentally made this function very expensive, and then Mojo is saving you from yourself. This is what you actually want:
fn _is_winner[size: Int, //, player: UInt8](board: SIMD[DType.uint8, size*size]) -> Bool:
    # compile-time splat
    alias reference = SIMD[DType.uint8, size*size](player)
    if (board & reference).reduce_bit_count() == size:
        return True

    ...
Now, if you had intended to compare each row, it would be like this:
fn _is_winner[size: Int, //, player: UInt8](board: SIMD[DType.uint8, size*size]) -> Bool:
    # compile-time splat
    alias reference = SIMD[DType.uint8, size](player)
    @parameter
    for row in range(size):
        var row_vector = board.slice[size, offset=size*row]()
        if (row_vector & reference).reduce_bit_count() == size:
            return True

    ...
JanEric1 (OP), 6d ago
So what I have above should be functionally equivalent to the start of this, for size = 3 and player = 1:
# Columns
if (board & (1, 0, 0, 1, 0, 0, 1, 0, 0)).reduce_bit_count() == 3:
    return True
if (board & (0, 1, 0, 0, 1, 0, 0, 1, 0)).reduce_bit_count() == 3:
    return True
if (board & (0, 0, 1, 0, 0, 1, 0, 0, 1)).reduce_bit_count() == 3:
    return True
# Rows
if (board & (1, 1, 1, 0, 0, 0, 0, 0, 0)).reduce_bit_count() == 3:
    return True
if (board & (0, 0, 0, 1, 1, 1, 0, 0, 0)).reduce_bit_count() == 3:
    return True
if (board & (0, 0, 0, 0, 0, 0, 1, 1, 1)).reduce_bit_count() == 3:
    return True
# Diagonals
if (board & (1, 0, 0, 0, 1, 0, 0, 0, 1)).reduce_bit_count() == 3:
    return True
if (board & (0, 0, 1, 0, 1, 0, 1, 0, 0)).reduce_bit_count() == 3:
    return True
return False
I just don't know how best to generate these at compile time. (Here is the full current code):
fn _is_winner_param[size: Int, //, player: UInt8](board: BOARD[size*size]) -> Bool:
    # Check rows
    var reference: SIMD[DType.uint8, board.size]
    @parameter
    for row in range(size):
        reference = SIMD[DType.uint8, board.size](0)
        @parameter
        for col in range(size):
            reference[row * size + col] = player
        if (board & reference).reduce_bit_count() == size:
            return True

    # Check columns
    @parameter
    for col in range(size):
        reference = SIMD[DType.uint8, board.size](0)
        @parameter
        for row in range(size):
            reference[row * size + col] = player
        if (board & reference).reduce_bit_count() == size:
            return True

    # Check main diagonal
    reference = SIMD[DType.uint8, board.size](0)
    @parameter
    for i in range(size):
        reference[i * size + i] = player
    if (board & reference).reduce_bit_count() == size:
        return True

    # Check anti-diagonal
    reference = SIMD[DType.uint8, board.size](0)
    @parameter
    for i in range(size):
        reference[i * size + (size - 1 - i)] = player
    if (board & reference).reduce_bit_count() == size:
        return True
    return False
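(Editor's sketch, not part of the thread.) One way to sanity-check that the index arithmetic in the loops above really yields the eight hand-written masks for size = 3 and player = 1 is a small model in plain Python. This is Python rather than Mojo, so it says nothing about what the Mojo compiler evaluates at compile time; `win_masks` and `is_winner` are hypothetical helper names, and the bit count is only an assumed stand-in for `reduce_bit_count()`.

```python
def win_masks(size, player=1):
    """Return column, row and diagonal masks as flat tuples of length size*size."""

    def mask(cells):
        # Same flat index as the thread's code: row * size + col.
        flat = [0] * (size * size)
        for row, col in cells:
            flat[row * size + col] = player
        return tuple(flat)

    masks = []
    # Columns first, to match the order of the hand-written list.
    for col in range(size):
        masks.append(mask((row, col) for row in range(size)))
    for row in range(size):
        masks.append(mask((row, col) for col in range(size)))
    # Main diagonal, then anti-diagonal.
    masks.append(mask((i, i) for i in range(size)))
    masks.append(mask((i, size - 1 - i) for i in range(size)))
    return masks


def is_winner(board, size, player=1):
    """Model of `(board & reference).reduce_bit_count() == size` per mask."""
    for reference in win_masks(size, player):
        # Count the set bits across all lanes of the element-wise AND.
        bits = sum(bin(b & r).count("1") for b, r in zip(board, reference))
        if bits == size:
            return True
    return False
```

Calling `win_masks(3)` gives 2 * size + 2 = 8 masks, which can be compared one by one against the literal tuples listed earlier.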
Darkmatter, 6d ago
As far as I can tell, what you have now works reasonably well because the compiler constant-folds the reference vectors.
JanEric1 (OP), 6d ago
OK, so there are no other compile-time facilities that would allow me to do even more here?
Darkmatter, 6d ago
You could potentially do a bit better, but the compiler seems to mostly be figuring it out. You're rapidly approaching the point where knowing the exact CPU you execute on will be more useful than more things at compile time.