Modular•14mo ago
Naroxar

simd_load of Tensor with StaticIntTuple

How is simd_load of Tensor type supposed to work with multiple indices? This example:

test = Tensor[float_type](10)
for i in range(10):
    test[i] = i

indices = StaticIntTuple[4](1,3,5,6)
vals = test.simd_load[4](indices)

for i in range(4):
    print(vals[i], test[indices[i]])
gives
6.0 1.0
7.0 3.0
8.0 5.0
9.0 6.0
It simply seems to take the last entry of "indices" and load 4 contiguous elements starting there. From the documentation it is not clear (to me at least 🙂) what this is supposed to return.
1 Reply
Alex Kirchhoff•14mo ago
simd_load will always load a contiguous range of values from the input tensor. The StaticIntTuple overload is intended to reference different dimensions of a multidimensional tensor, not different elements of a gather. For example, if you had an NxHxWxC tensor representing an image (e.g., 1x448x448x4), you could index it with a StaticIntTuple[4](0, 100, 100, 0) and simd_load[4] to get the 4 channels at batch=0, x=100, y=100 location in the image. The documentation could probably be clearer about this.
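
To make the distinction concrete, here is a minimal plain-Python sketch of the two behaviours. It is only a model, not Mojo's implementation: it assumes the tensor is stored in row-major order, and the helpers flat_offset, contiguous_load and gather are hypothetical names made up for illustration.

```python
# Plain-Python model of the behaviour described above; this is NOT Mojo code.
# Assumption: the tensor is stored flat in row-major order.
# flat_offset, contiguous_load and gather are hypothetical helper names.

def flat_offset(shape, index):
    """Row-major offset of a multidimensional index into flat storage."""
    offset = 0
    for dim, i in zip(shape, index):
        offset = offset * dim + i
    return offset

def contiguous_load(data, shape, index, width):
    """Model of simd_load[width](index): `width` adjacent elements starting at `index`."""
    start = flat_offset(shape, index)
    return data[start:start + width]

def gather(data, indices):
    """What the question expected: one element per entry of `indices`."""
    return [data[i] for i in indices]

# NxHxWxC image tensor from the reply, filled with its own flat offsets.
shape = (1, 448, 448, 4)
data = [float(i) for i in range(1 * 448 * 448 * 4)]

# One 4D index selects one pixel; the load returns its 4 adjacent channels.
pixel = contiguous_load(data, shape, (0, 100, 100, 0), 4)
print(pixel)

# Gathering scattered elements from a 1D tensor has to be done per element.
test = [float(i) for i in range(10)]
print(gather(test, (1, 3, 5, 6)))
```

In this model the four entries of the index tuple pick a single starting position, one entry per dimension, and only the width parameter controls how many elements are read. A gather, by contrast, treats every entry as a separate position, which is why it is written here as an explicit per-element loop.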
