Garbage Collection Questions
Hello!
I'm studying garbage collection in C#.
I understand that when a program runs, the CLR allocates memory and collects garbage by dividing objects into generations.
However, I'm unsure about the criteria for the initial allocation of a large memory space and how that memory is divided into generations.
8 Replies
Generations work as a fallback: the GC tries to collect an object, but if it still has a reference it'll move from gen0 to gen1, and then to gen2
There is also something called the large object heap (LOH)
Certain objects and certain collections go straight to the LOH
There are whole books written on the GC
The basic assumption is that most allocations are short-lived (e.g. a string that gets discarded immediately after use), while long-lived objects are likely to live for the whole duration of the app (e.g. static data that gets initialised during startup). With that assumption, it is much more profitable to focus on collecting newly created objects than to analyse everything (since the GC itself costs CPU time to run)
so when an object gets created, it is considered a generation 0 object
Gen0 collections are much more frequent, for the reason given above
then, once an object survives a Gen0 collection, it is promoted to Gen1, where it is less likely to be subject to GC passes (i.e., less pointless scanning)
if it survives a Gen1 collection as well, it gets promoted to Gen2 (which in .NET is the final one)
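To make the promotion steps above concrete, here is a minimal sketch that asks the runtime which generation an object is in before and after forced collections, and counts collections per generation while creating lots of short-lived strings. GenerationDemo, TrackPromotion and CountCollectionsDuringChurn are made-up names for illustration; GC.GetGeneration, GC.Collect, GC.CollectionCount and GC.MaxGeneration are the real APIs. The exact numbers depend on the runtime and GC settings, so treat the output as an observation, not a guarantee.

```csharp
using System;

public static class GenerationDemo
{
    // Hypothetical helper: records the generation reported for obj right
    // after allocation and after each of two forced collections.
    public static int[] TrackPromotion(object obj)
    {
        var seen = new int[3];
        seen[0] = GC.GetGeneration(obj);
        GC.Collect();                      // obj is still referenced, so it survives
        seen[1] = GC.GetGeneration(obj);
        GC.Collect();
        seen[2] = GC.GetGeneration(obj);
        GC.KeepAlive(obj);                 // keep obj reachable up to this point
        return seen;
    }

    // Hypothetical helper: creates many short-lived strings and returns how
    // many collections of gen0, gen1 and gen2 happened in the meantime.
    public static int[] CountCollectionsDuringChurn(int iterations)
    {
        var before = new int[3];
        for (int g = 0; g < 3; g++) before[g] = GC.CollectionCount(g);

        long totalLength = 0;
        for (int i = 0; i < iterations; i++)
        {
            string temp = i.ToString();    // becomes garbage right after use
            totalLength += temp.Length;
        }
        GC.KeepAlive(totalLength);

        var delta = new int[3];
        for (int g = 0; g < 3; g++) delta[g] = GC.CollectionCount(g) - before[g];
        return delta;
    }

    public static void Main()
    {
        Console.WriteLine($"Max generation: {GC.MaxGeneration}");

        int[] gens = TrackPromotion(new object());
        Console.WriteLine($"Generations observed: {string.Join(" -> ", gens)}");

        int[] counts = CountCollectionsDuringChurn(5_000_000);
        Console.WriteLine($"Collections during churn: gen0={counts[0]}, gen1={counts[1]}, gen2={counts[2]}");
    }
}
```

Note that a collection of a higher generation also collects the lower ones, so the gen0 count is never smaller than the gen1 or gen2 count.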
I'm not too sure about how it allocates segments and moves data around, but from my understanding it is dynamic: there is no fixed "gen 1 segment" or "gen 0 segment" from the beginning. Rather, when the GC requests a new segment from the OS it starts out as a gen0 segment, and as the objects it contains get promoted or collected it turns into a Gen1/2 segment
there also would be compaction to reduce fragmentation
as mentioned above there's also a thing called 'large object heap' which is where... large objects go
I believe it is an implementation detail, but currently the default cutoff for the LOH is 85,000 bytes
the LOH is there mostly because compaction can take a long time when the object is huge
Yeah, any object over 85k goes straight to the LOH; collections are more complex
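A quick way to see the cutoff in action, as a rough sketch: allocate a small and a large byte array and ask which generation each one reports. LohDemo and GenerationOfNewArray are made-up names; the assumption here is the default threshold (it can be changed through runtime configuration). LOH objects are reported as the highest generation, since the LOH is collected together with gen2.

```csharp
using System;
using System.Runtime;

public static class LohDemo
{
    // Hypothetical helper: allocates a byte array of the given length and
    // returns the generation the GC reports for it right away.
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer);
        return generation;
    }

    public static void Main()
    {
        // Small array: allocated on the normal heap, starts in gen0.
        Console.WriteLine($"byte[1000]  -> gen {GenerationOfNewArray(1_000)}");

        // 85,000 bytes of data plus the array header is past the default
        // threshold, so this lands on the LOH.
        Console.WriteLine($"byte[85000] -> gen {GenerationOfNewArray(85_000)}");

        // The LOH is not compacted by default. This asks for it to be
        // compacted once, during the next blocking full collection.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```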
Thank you for your response! I understand the information you provided, but I'm curious about how the Managed Heap is initially allocated. Is there any way to find out about this?
it is obtained from the OS with VirtualAlloc (on Windows that would be, at least)
Thank you for your answer. I should study more
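On the "is there any way to find out" part: you can't see the VirtualAlloc calls from managed code, but the runtime does expose some numbers about the heap it has set up. A minimal sketch, assuming .NET 5 or later (GC.GetGCMemoryInfo and GCMemoryInfo.GenerationInfo are real APIs; HeapInfoDemo, CommittedBytesAfterCollect and the labels array are made up for the example). The values describe the most recent collection, which is why it forces one first.

```csharp
using System;

public static class HeapInfoDemo
{
    // Hypothetical helper: forces a collection and returns how many bytes
    // the managed heap had committed at that point.
    public static long CommittedBytesAfterCollect()
    {
        GC.Collect();
        return GC.GetGCMemoryInfo().TotalCommittedBytes;
    }

    public static void Main()
    {
        Console.WriteLine($"Committed by the GC: {CommittedBytesAfterCollect():N0} bytes");

        GCMemoryInfo info = GC.GetGCMemoryInfo();
        Console.WriteLine($"Heap size:           {info.HeapSizeBytes:N0} bytes");
        Console.WriteLine($"Fragmented:          {info.FragmentedBytes:N0} bytes");
        Console.WriteLine($"Live managed memory: {GC.GetTotalMemory(false):N0} bytes");

        // Assumed ordering of the entries: gen0, gen1, gen2, LOH, pinned object heap.
        string[] labels = { "gen0", "gen1", "gen2", "LOH", "POH" };
        ReadOnlySpan<GCGenerationInfo> generations = info.GenerationInfo;
        for (int i = 0; i < generations.Length && i < labels.Length; i++)
        {
            Console.WriteLine($"{labels[i]} size after last GC: {generations[i].SizeAfterBytes:N0} bytes");
        }
    }
}
```

This only shows sizes, not how or when the memory was reserved; for that level of detail a memory profiler or WinDbg is the better tool.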
What you asked about are the internals of the CLR/GC, so only advanced books cover them. CLR via C# is the best I can think of, and no one really wants to rephrase its complex paragraphs. The Microsoft CLR team also has "The Book of the Runtime" https://github.com/dotnet/runtime/tree/main/docs/design/coreclr/botr but that's another tough book to digest. My personal experience is that it is slightly easier to understand if you know how to analyze objects using a memory profiler or WinDbg.
Thank you. I'll make sure to read it whenever I'm bored.
It was very helpful.