Fastest way to count trailing zeros from a SIMD[Bool,32]?

The summary says it all. I have a SIMD with 32 bools and would like to find the first appearance of a True value. Right now I'm doing the following:
TRUE_CASE = SIMD[DType.uint8,32](0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31)
FALSE_CASE = SIMD[DType.uint8,32](32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32)
SIMD[Bool,32].select(TRUE_CASE, FALSE_CASE).reduce_min()
I would like to cast the SIMD[Bool,32] into an Int32 and then count trailing zeros, but I have been unable to do the conversion.
6 Replies
Darkmatter (6d ago)
The following function works well for 32 bit, but you may have to change some things to make wider inputs work.
from bit import count_leading_zeros
from memory import bitcast
fn first_true_index(input: SIMD[DType.bool, 32]) -> UInt32:
    return count_leading_zeros(bitcast[DType.uint32, 32](input))
Output assembly:
vgf2p8affineqb $0, .LCPI2_1(%rip){1to4}, %ymm0, %ymm0
vpmovmskb %ymm0, %eax
lzcntl %eax, %eax
vzeroupper
retq
asosoman (5d ago)
Thank you very much. It seems I missed the bitcast function; I was mostly looking at the cast method on SIMD.
msaelices (5d ago)
I guess we cannot use the count_leading_zeros with bools because of this:
[image attachment not preserved]
msaelices (5d ago)
I don't know if we should allow it directly, without bitcast.
Darkmatter (5d ago)
A SIMD of bool is not bit-packed, to avoid repeating the std::vector<bool> mistake. So if you count leading zeros without a bit-pack instruction first, you end up with 31 or 32 on x86 and 63 or 64 on ARM.
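Since the original question asks for the first True lane, a trailing-zero count on the packed mask may be closer to what was wanted than the leading-zero count shown earlier in the thread. Below is a minimal sketch, not a definitive implementation. It assumes the bitcast overload used in the reply packs lane 0 into bit 0 (the vpmovmskb in the assembly suggests this on x86), and that the installed Mojo release still exposes these imports under these names. first_true_index_tz is a hypothetical name, not something from the thread.

```mojo
from bit import count_trailing_zeros
from memory import bitcast


# Hypothetical helper: index of the lowest True lane in a 32-lane bool vector.
fn first_true_index_tz(input: SIMD[DType.bool, 32]) -> UInt32:
    # Pack the 32 bool lanes into a single 32-bit mask, reusing the
    # bitcast call from the reply above (assumed: lane i -> bit i).
    var mask = bitcast[DType.uint32, 32](input)
    # Under that assumption, the lowest set bit marks the first True lane.
    return count_trailing_zeros(mask)
```

For an all-False input the mask is zero, so it is worth checking what count_trailing_zeros returns for 0 on your target before relying on it as a "not found" value, the way the select/reduce_min version returns 32.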