❔ Benchmarking object size difference
I've got a small PR here to move a few `bool` fields into an existing "status" `int`. My understanding is that this would remove 16 bytes from the object size (1 byte padded to 8 bytes, twice). However, when I create a small benchmark that just allocates a new `Ping` object, it retains the same size. Additionally, I'm not sure where the 184 bytes from the benchmark are coming from.
I'm sure I'm missing something elementary; does someone know what I'm overlooking here? https://github.com/dotnet/runtime/pull/94151 (Re-use status flag inside Ping by Vannevelj · Pull Request #94151)
your best bet, I think, is to start looking at generated IL.
if we trim down the `Ping` class to just instance state, we get....
and after changes....
now, we can go plug those into sharplab.io
then you can find someone to help read that, cause I have no idea how to 😆
I CAN do a diff over the two sets of output IL, and see that they're identical except for the two `.field` definitions
my guess would be that the optimizer is already doing effectively what you're doing, and byte-packing these fields
Do you know of any way to confirm that? I was hoping there would be some sort of tool that allows me to tell what an object's structure looks like but I haven't come across something like that yet
You can check in a debugger like windbg.
I think the deal with byte alignment is that whole OBJECTs are 32- or 64-bit-aligned, not fields
so, in the example you mentioned, if `HttpRequestMessage` previously had 33 bytes worth of state, then it would get bumped to 40 for alignment
In this case both objects are 80 bytes
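The rounding described here (33 bytes of state bumped to 40) can be sketched as a small calculation. This is only a model of the arithmetic, assuming an 8-byte alignment boundary; `align_up` is a hypothetical helper, not something the runtime exposes.

```python
def align_up(size: int, alignment: int = 8) -> int:
    """Round size up to the next multiple of alignment.

    Assumes alignment is a power of two (8 here, as a guess for a
    64-bit process); this only models the arithmetic from the thread.
    """
    return (size + alignment - 1) & ~(alignment - 1)


print(align_up(33))  # 33 bytes of state rounds up to 40
print(align_up(40))  # already on a boundary, stays 40
```

Under this model, shaving a byte or two off an object only changes its allocated size if the total crosses a boundary.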
You can look at the actual layout in memory here
oooooooooooooh
that was quick
so, clearly there are fewer instance fields
also weird that the addresses are out-of-order?
shall we arrange that?
or am I completely misunderstanding?
MT is the method table for the type
sure, I was referring to the `Field` column
is that not addresses for those field values?
is there a set of docs for this?
Address of the field should be object address + offset I think.
curious what `Field` is then
alright, so.....
well
oh, okay, the offset 98 is a static field
`int32` is 4 bytes, I think default packing is 8 bytes. It looks like the two booleans (offsets 44 and 45) are placed right after the `int32` (offset 40). Is the conclusion that `int32` + `bool` + `bool` + 2 byte padding is the same as `int32` + 4 byte padding?
that's also weird
alright, so if that's accurate, before.....
If you want to know all the details about padding and alignment you can ask someone in #allow-unsafe-blocks
As far as I recall it uses the size of the largest field to determine alignment and pads so the object aligns on a boundary of that size.
But not really my area of expertise.
yeah, I think it's what I thought...
I think you're misunderstanding, thinking that it's FIELDS that are padded
it's actually the whole object
if we look at the after....
notice in the before version, the object has no issue referring to bool fields as single-byte values
`_disposeRequested` and `_canceled` are only offset by 1 byte
the object as a whole was previously allocating 80 bytes, for 70 bytes worth of fields
removing 2 bytes doesn't drop it down to the next-lower padding boundary
you'd have to drop a whole 6 bytes off the object to get it down to the next boundary of 64 bytes
I was with you until you changed the padding to `10` and `12` respectively. Why isn't it `2` and `4` to reach the 48-byte mark?
I'm basing that on the fact that the object reports as 80 bytes in size
that would be offset 0x50
to go from 0x45 to 0x50, there must be 10 bytes of padding
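The 10 and 12 bytes of padding come from subtracting the end of the last field from the reported object size. A rough sketch of that subtraction, taking the offsets quoted in the thread at face value (hex, as windbg prints them) and assuming the `int32` at 0x40 is the last instance field once the two bools are removed; `trailing_padding` is a hypothetical helper:

```python
OBJECT_SIZE = 0x50  # 80 bytes, the size reported for both versions


def trailing_padding(last_offset: int, last_size: int,
                     object_size: int = OBJECT_SIZE) -> int:
    """Bytes left between the end of the last field and the object size."""
    return object_size - (last_offset + last_size)


# Before: the last field is a 1-byte bool at offset 0x45.
before = trailing_padding(0x45, 1)
# After (assumed): the last field is the 4-byte int32 at offset 0x40.
after = trailing_padding(0x40, 4)

print(before, after)  # 10 12
```

Both layouts end at the same reported size, so removing the two bools only grows the trailing padding.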
I could still be off here
BTW if you want to know how I got that data it was:
Then build Release mode (`dotnet build --configuration Release`)
Then:
Then in windbg hit `g` so it runs and hits the breakpoint.
Then:
Then in windbg you can just click the addresses (they're hyperlinks) to dump out the objects for those types, and then from there you can dump out the object instances (click the address) to get the actual memory layout.
Very useful for investigating this kind of thing.
I was absolutely going to ask that, thank you
seems like I wasn't too far off in my approach
I'm surprised the private fields don't get optimized away
I'm guessing that's what the console prints are for, with regard to the local variables?
Yes, that's just to ensure the optimizer doesn't throw away the instances entirely.
Also to ensure they don't get GC'd before the debugger break.
yeah
NOW I'm a little curious why BenchmarkDotNet is reporting 184 bytes for these allocations
not building in release?
BenchmarkDotNet only runs in release mode
It won't let you use debug
I don't think the memory diagnoser is byte-by-byte accurate.
The documentation says this:
So, it's supposedly 99.5% accurate.
¯\_(ツ)_/¯
https://benchmarkdotnet.org/articles/configs/diagnosers.html#:~:text=MemoryDiagnoser%20is%2099.5%25%20accurate%20about%20allocated%20memory%20when,or%20Job.ShortRun%20%28or%20any%20longer%20job%20than%20it%29.
¯\_(ツ)_/¯
so long as it's consistent
Thanks both, it's been really interesting! Will play around with windbg a bit more; I appreciate your input
Circling back here: I've written up my findings at https://vannevel.net/posts/exploring-object-layouts. The missing link was the Object Header and the Method Table -- both are 8 bytes each and make up the diff with the expected 80 bytes that we were seeing.
I actually accounted for 8 bytes at the beginning of the object
and I would've thought the method table would be on the type definition, not repeated out per-instance
oh, I gotcha
a REFERENCE to the method table
that makes more sense
I guess that's stored at the END of an object?
nevermind
Pretty sure it's stored at the beginning
Managed object internals, Part 1. The layout (Sergey Tepliakov, Developer Support): "The layout of a managed object is pretty simple: a managed object contains instance data, a pointer to a meta-data (a.k.a. method table pointer) and a bag of internal information also known as an object header."
yeah, that's why I said nevermind