C#•2mo ago
substitute

new Span vs stackalloc

I'm working on a library that is able to read and write memory from a remote computer and was wondering which of these two is better practice?
public void WriteFloat(uint address, float value)
{
    Span<byte> memory = stackalloc byte[sizeof(float)];
    BinaryPrimitives.WriteSingleBigEndian(memory, value);
    SetMemory(address, memory, out _);
}
public void WriteFloat(uint address, float value)
{
    var memory = MemoryMarshal.Cast<float, byte>(new Span<float>(ref value));
    BinaryPrimitives.WriteSingleBigEndian(memory, value);
    SetMemory(address, memory, out _);
}
I feel as though these would produce extremely similar, if not exactly the same, memory layouts, since both create a Span.
28 Replies
Kouhai•2mo ago
The two are fundamentally different. In the first one, you're allocating 4 bytes on the stack (because sizeof(float) == 4), and the span basically points to that memory. In the second, you're first getting a float* to the parameter value and then reinterpreting it as a byte*. You should go with the first option; it makes it much clearer what you're doing.
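To make the difference concrete, here is a minimal sketch that runs both approaches on the same value and compares the resulting bytes. It uses MemoryMarshal.CreateSpan as a stand-in for new Span<float>(ref value), since that constructor is only public from .NET 7 onward; the variable names are illustrative and not from the thread.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

float value = 1.5f;

// Option 1: a fresh 4-byte stack buffer; `value` itself is left untouched.
Span<byte> scratch = stackalloc byte[sizeof(float)];
BinaryPrimitives.WriteSingleBigEndian(scratch, value);

// Option 2: view a float's own storage as bytes and overwrite it in place.
// `copy` plays the role of the method parameter from the thread.
float copy = value;
Span<byte> inPlace = MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref copy, 1));
BinaryPrimitives.WriteSingleBigEndian(inPlace, value);
// On a little-endian host, `copy` no longer reads back as 1.5f after this write.

Console.WriteLine(Convert.ToHexString(scratch));    // 3FC00000
Console.WriteLine(Convert.ToHexString(inPlace));    // 3FC00000
Console.WriteLine(scratch.SequenceEqual(inPlace));  // True
```

Both paths end up with the same four bytes. What differs is where those bytes live and whether the original float survives.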
substitute•2mo ago
Sure, but the memory is free to be clobbered, which is why the first feels wasteful: it's an extra stack allocation for memory that is already on the stack. In both of these, WriteSingleBigEndian produces the same memory if the host system is big-endian (writing value to itself).
Kouhai•2mo ago
Allocating memory on the stack is basically just changing the value of a register, and 4 bytes is pretty much free.
substitute•2mo ago
Sure, but a reinterpret cast is free, at least in languages like C++.
Kouhai•2mo ago
In C# it'll be pretty much free after JIT'ing as well. Still, I don't get why you'd reinterpret-cast a passed-in parameter (even in C++) instead of just allocating 4 bytes of memory on the stack. This won't impact your performance at all.
substitute•2mo ago
A passed-in parameter is already in a register (on x86_64).
Kouhai•2mo ago
Have you benchmarked it in a hot path and found that it actually affects your performance? 😅 Also, a parameter might not be in a register, depending on its size and the other passed-in parameters. In this case, yes, it'll be in a register.
substitute•2mo ago
My main concern is stack size. This call is fine, but other calls dealing with larger segments of memory, like float[256], could overflow the stack on the stackalloc.
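One common way to address the stack-size worry is to use stackalloc only below a fixed cutoff and fall back to a pooled array above it. The sketch below assumes a hypothetical 256-byte cutoff and a hypothetical Send function standing in for SetMemory; neither comes from the thread.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.Collections.Generic;

const int StackLimitBytes = 256;   // hypothetical cutoff, tune per workload
var sent = new List<byte>();       // stands in for the remote memory

// Hypothetical stand-in for the thread's SetMemory call.
void Send(ReadOnlySpan<byte> data) => sent.AddRange(data.ToArray());

void WriteFloats(ReadOnlySpan<float> values)
{
    int byteCount = values.Length * sizeof(float);
    byte[]? rented = null;

    // Small payloads live on the stack; larger ones borrow a pooled array,
    // so stack usage stays bounded no matter how large `values` is.
    Span<byte> buffer = byteCount <= StackLimitBytes
        ? stackalloc byte[StackLimitBytes]
        : (rented = ArrayPool<byte>.Shared.Rent(byteCount));
    buffer = buffer.Slice(0, byteCount);

    try
    {
        for (int i = 0; i < values.Length; i++)
            BinaryPrimitives.WriteSingleBigEndian(buffer.Slice(i * sizeof(float)), values[i]);
        Send(buffer);
    }
    finally
    {
        if (rented is not null) ArrayPool<byte>.Shared.Return(rented);
    }
}

WriteFloats(new float[] { 1.5f });   // 4 bytes, stack path
WriteFloats(new float[256]);         // 1024 bytes, pooled path
Console.WriteLine(sent.Count);       // 1028
```

The caller's span is never modified here, at the cost of one copy into the scratch buffer.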
Kouhai•2mo ago
I'm kinda confused, 256 floats would be passed into the method?
substitute•2mo ago
Not in the float-only method
public void WriteFloats(uint address, Span<float> values)
{
    foreach (ref var value in values)
    {
        BinaryPrimitives.WriteSingleBigEndian(MemoryMarshal.Cast<float, byte>(new Span<float>(ref value)), value);
    }
    SetMemory(address, MemoryMarshal.Cast<float, byte>(values), out _);
}
But other methods, like one taking a block of floats, could take in N, and N could be smaller than, equal to, or larger than 256. A single stackalloc of one float could be used if the data is moved between iterations, but that also seems like it would be slower than in-place ops.
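For reference, the in-place idea can also be written as a byte swap over the raw bits, which skips the per-element Span construction. This is only a sketch: the helper name ToBigEndianInPlace is made up for illustration, and the caller's floats are overwritten on little-endian hosts.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

// Hypothetical helper: rewrites host-order floats as big-endian in place
// and returns a byte view over the same memory (no copy, no allocation).
static Span<byte> ToBigEndianInPlace(Span<float> values)
{
    if (BitConverter.IsLittleEndian)
    {
        // Swap the byte order of each 32-bit pattern; big-endian hosts need no work.
        Span<uint> bits = MemoryMarshal.Cast<float, uint>(values);
        for (int i = 0; i < bits.Length; i++)
            bits[i] = BinaryPrimitives.ReverseEndianness(bits[i]);
    }
    return MemoryMarshal.AsBytes(values);
}

float[] data = { 1.5f, -2.0f };
Span<byte> bytes = ToBigEndianInPlace(data);
Console.WriteLine(Convert.ToHexString(bytes)); // 3FC00000C0000000
// On a little-endian host, `data` no longer holds 1.5f and -2.0f at this point.
```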
Kouhai•2mo ago
That case makes more sense for a reinterpret cast. Still, I would use some sort of memory pool and rent from it instead of writing directly to the passed-in values Span.
substitute•2mo ago
The underlying API I send the memory to takes byte[] 😢 so I'm trying to avoid any extra overhead before the eventual heap allocation and memcpy.
Kouhai•2mo ago
I mean, you can use ArrayPool to rent a byte[] :Thonkers:
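A minimal sketch of the ArrayPool suggestion follows. The thread only says the underlying API takes byte[], so the (buffer, count) shape of SendBytes is an assumption. That shape matters because Rent may return an array longer than requested, so an API that always sends the whole array would also send the unused tail.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

byte[] sent = Array.Empty<byte>();

// Hypothetical stand-in for the byte[]-based API; the count parameter is assumed.
void SendBytes(byte[] buffer, int count) => sent = buffer.AsSpan(0, count).ToArray();

void WriteFloats(ReadOnlySpan<float> values)
{
    int byteCount = values.Length * sizeof(float);

    // The rented array is at least byteCount long, possibly longer.
    byte[] rented = ArrayPool<byte>.Shared.Rent(byteCount);
    try
    {
        for (int i = 0; i < values.Length; i++)
            BinaryPrimitives.WriteSingleBigEndian(rented.AsSpan(i * sizeof(float)), values[i]);
        SendBytes(rented, byteCount);
    }
    finally
    {
        // Hand the array back so later calls can reuse it instead of allocating.
        ArrayPool<byte>.Shared.Return(rented);
    }
}

WriteFloats(new float[] { 1.5f, 2.5f });
Console.WriteLine(Convert.ToHexString(sent)); // 3FC0000040200000
```

If the real API only accepts an exact-length byte[], pooling does not help directly and an exact-size array would still be needed.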
substitute•2mo ago
That's fair. The span in this case isn't guaranteed to be clobberable, but that's only a concern if someone is passing in data that they expect to use afterwards for some reason. Size isn't constant, but I do know that in the usual case the max size is 512 MiB, and the absolute max is 1024 MiB for the remote system. No one should be writing all of the memory of the remote system; those are just the amounts it has 😅
Kouhai•2mo ago
Right 😅 I personally think benchmarking the different options would be the best way to know whether these optimizations are worth it. I'd also suggest asking people in #allow-unsafe-blocks; they're much more knowledgeable about low-level stuff.
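As a starting point for such a benchmark, here is a sketch using BenchmarkDotNet. It assumes the BenchmarkDotNet NuGet package is referenced and the project is run in Release mode; the class and method names are illustrative, and the remote SetMemory call is left out so only the encoding step is measured.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<WriteFloatBenchmarks>();

[MemoryDiagnoser] // also reports managed allocations per operation
public class WriteFloatBenchmarks
{
    private float _value = 1.5f;

    [Benchmark(Baseline = true)]
    public byte StackallocBuffer()
    {
        Span<byte> memory = stackalloc byte[sizeof(float)];
        BinaryPrimitives.WriteSingleBigEndian(memory, _value);
        return memory[0]; // returned so the work is not optimized away
    }

    [Benchmark]
    public byte ReinterpretInPlace()
    {
        float local = _value;
        Span<byte> memory = MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref local, 1));
        BinaryPrimitives.WriteSingleBigEndian(memory, _value);
        return memory[0];
    }
}
```

Any difference between the two is likely to be small next to the cost of the remote write itself, so measuring the full call path is also worthwhile.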