Fastest way to count trailing zeros from a SIMD[Bool,32]?

The summary says it all. I have a SIMD with 32 bools and would like to find the first appearance of a True value. Right now I'm doing the following:
TRUE_CASE = SIMD[DType.uint8,32](0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31)
FALSE_CASE = SIMD[DType.uint8,32](32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32)
SIMD[Bool,32].select(TRUE_CASE, FALSE_CASE).reduce_min()
I would like to cast the SIMD[Bool,32] into an Int32 and then count trailing zeros, but I have been unable to do the conversion.
6 Replies
Darkmatter, 3mo ago
The following function works well for 32 bit, but you may have to change some things to make wider inputs work.
fn first_true_index(input: SIMD[DType.bool, 32]) -> UInt32:
    return count_leading_zeros(bitcast[DType.uint32, 32](input))
Output assembly:
vgf2p8affineqb $0, .LCPI2_1(%rip){1to4}, %ymm0, %ymm0
vpmovmskb %ymm0, %eax
lzcntl %eax, %eax
vzeroupper
retq
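To make the bit arithmetic in this exchange concrete, here is a minimal sketch in plain Python (not Mojo) that models packing 32 boolean lanes into one integer and counting zeros from either end. It assumes lane i lands in bit i, which matches how vpmovmskb fills its result (byte 0 goes to bit 0); whether the Mojo bitcast uses the same order is an assumption here. All function names below are hypothetical helpers, not Mojo APIs.

```python
def pack_mask(lanes):
    """Pack booleans into an int, assuming lane i maps to bit i."""
    mask = 0
    for i, lane in enumerate(lanes):
        if lane:
            mask |= 1 << i
    return mask


def count_trailing_zeros(mask, width=32):
    """Number of zero bits below the lowest set bit (width if mask is 0)."""
    if mask == 0:
        return width
    return (mask & -mask).bit_length() - 1


def count_leading_zeros(mask, width=32):
    """Number of zero bits above the highest set bit (width if mask is 0)."""
    return width - mask.bit_length()


def first_true_select_min(lanes):
    """Model of the select + reduce_min approach from the question."""
    return min(i if lane else len(lanes) for i, lane in enumerate(lanes))


lanes = [False] * 32
lanes[5] = True
lanes[20] = True
mask = pack_mask(lanes)

print(count_trailing_zeros(mask))       # 5: lowest True lane
print(31 - count_leading_zeros(mask))   # 20: highest True lane
print(first_true_select_min(lanes))     # 5: same as the trailing-zero count
```

Under that lane-order assumption, the trailing-zero count gives the first True lane and agrees with the select + reduce_min version, while the leading-zero count locates the last True lane. It is worth checking which of the two matches the index you need on your target.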
asosoman (OP), 3mo ago
Thank you very much. It seems I missed the bitcast function (I was basically only looking at the cast method on SIMD).
Manuel Saelices, 3mo ago
I guess we cannot use count_leading_zeros with bools because of this:
[image attachment, no description]
Manuel Saelices, 3mo ago
I don't know if we should allow it directly without bitcast
Darkmatter, 3mo ago
A SIMD of bool is not bit-packed, to avoid the std::vector<bool> mistake, so without using a bit-pack instruction first, count_leading_zeros gives you 31 or 32 on x86 and 63 or 64 on ARM.
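One way to read the numbers in the message above, sketched in plain Python rather than Mojo: if each unpacked bool lane holds the value 0 or 1 and the leading-zero count is taken on that single value at some element width, the only possible results are width - 1 and width. The widths 32 and 64 are taken from the message; the helper name `clz_lane` is hypothetical, and why each platform would use that width is not something this sketch explains.

```python
def clz_lane(value, width):
    """Leading-zero count of a single lane value at the given bit width."""
    return width - value.bit_length()


for width in (32, 64):
    # A True lane holds 1, a False lane holds 0.
    print(width, clz_lane(1, width), clz_lane(0, width))
# prints:
# 32 31 32
# 64 63 64
```

Either way the result says nothing about which lane was True, which is why the lanes have to be packed into one integer before counting.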