SIMD produces weird results without print statement at the end
The code below produces weird results:
For example:
If I uncomment the last print statement, the results are suddenly correct:
Can someone please explain this behaviour? Am I doing something wrong with SIMD?
12 Replies
I experience the same behaviour on both arm64 and x86
I get similar results
i think uncommenting `#print(nelts, x)` should give something like 4,4,2
To make it work, I uncommented the last print: `print(a.simd_load[nelts](0))`
I have M1, so it gives nelts = 2. Indexes provided by `vectorize` look fine: 0 2 4 6 8
ah yeah
i get `4, 4, 4, 4` for the one i mentioned
and uncommenting the last line doesn't work for me
Hmm. Interesting. Try to change size to a multiple of 4, like 16?
using 16:
uncommenting the last print doesn't change it for me, other than the 2 undefined elements at the beginning which are still there
I see. Surprisingly, my other SIMD implementation for Euclidean distance (from the Modular blog) is working fine. Maybe I am missing something. When I have time, I will double-check the code and maybe look at the assembly
Thanks for looking
the inner nelts looks correct for me actually: `4,4,1,1`, and uncommenting does solve the issue in a certain sense
my bad
but still getting the 2 randos at the beginning:
happens with `simd_load` inside of a function decorated with `@parameter`; using the tensor after the load solves the issue, so it may have to do with the lifetime
to solve, you can do `_ = a` for each tensor (at the end of main in this case)
What do you mean by simd_load inside the function? Should it still display correct results after vectorize? But I see where you are pointing, because in all the examples provided by Modular they return from the function at this point.
`_ = b` also works. `_ = c` doesn't 🙂
Oh wait. Just saw your edit
So by using `_ = a` I force the compiler to keep my tensor visible after vectorize?
Anyway. That is quite interesting. Thank you for the help.
no problem
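For anyone landing here later, a minimal sketch of the workaround discussed above. It is not the original poster's code (that was not preserved in the thread). It assumes a Mojo release from the era of this thread, where `Tensor` exposes `simd_load` / `simd_store` and `vectorize` takes the SIMD width as its first parameter; newer releases changed these APIs. The names `dtype`, `size` and `add` are made up for illustration. The lifetime explanation in the comments is the thread's hypothesis, not a confirmed diagnosis.

```mojo
from algorithm import vectorize
from sys.info import simdwidthof
from tensor import Tensor

alias dtype = DType.float32          # hypothetical element type
alias nelts = simdwidthof[dtype]()   # SIMD width for this machine
alias size = 16                      # hypothetical size

fn main():
    var a = Tensor[dtype](size)
    var b = Tensor[dtype](size)
    var c = Tensor[dtype](size)
    for i in range(size):
        a[i] = i
        b[i] = 2 * i

    # The closure captures a, b and c. The suspicion in the thread is that
    # the compiler does not count this capture as a "use", so a tensor may
    # be destroyed before vectorize actually runs the closure.
    @parameter
    fn add[width: Int](i: Int):
        c.simd_store[width](i, a.simd_load[width](i) + b.simd_load[width](i))

    vectorize[nelts, add](size)

    print(c.simd_load[nelts](0))

    # Workaround from the thread: mention each tensor after vectorize so
    # that its last use comes after the closure has run.
    _ = a
    _ = b
    _ = c
```

If the lifetime guess is right, removing the three `_ = ...` lines would be the way to bring the garbage values back on an affected version.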