C# · 3d ago
Green Leaf

Why is a lambda faster than other approaches?

I've been working on an algorithm and discovered that the Sort function runs faster with a lambda comparer than with a comparer type. So I ran this test with BenchmarkDotNet, and the result really puzzles me. Why is that? https://gist.github.com/laicasaane/d51193a0a4aff7c6df6c1bb89a66bdc9
Gist: Test Sort with lambda, local function, comparer type (with Result.log)
5 Replies
HtmlCompiler · 3d ago
I know: if you change Test_LocalFunc to
data.AsSpan().Sort(static (x,y) => Compare(x,y));
static int Compare(int x, int y) => x.CompareTo(y);
you'll see timings comparable to Test_Lambda.
canton7 · 3d ago
IIRC, delegates which point to instance methods are quicker to call than delegates which point to static methods. Something to do with a static delegate invocation needing to go via a trampoline? (The lambda gets turned into an instance method, even though it's marked static: https://sharplab.io/#v2:EYLgtghglgdgNAExAagD4AEBMBGAsAKHQGYACLEgYQG8CS6zT0AWEgFQFMBnAFwH0AZCGGAIIACljcA2gF0So7hACUtejXz1N8iIoB0AQU4BlAA4QYYpbqMB7AE7cx6bADYSYgB5wSATyUkAXgA+Eg9dChswMzt2VhsxPyUAblU6AF8CVIYyFg4eARsAYwgAG3QAVgkYaTkFZSz1LXo6g2MzCytbBzEIqIgY5MyNJrJXEklKSOj2Ku5Q7wm/QJCw3um4hMHh9II0oA==)
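One way to check the point above without reading the lowered code is to inspect the delegates at runtime. This is a minimal sketch, not taken from the gist: `DelegateTargets`, `StaticCompare`, and `Describe` are hypothetical names, and how a lambda is lowered is a compiler implementation detail that may differ between compiler versions.

```csharp
using System;

public static class DelegateTargets
{
    // Hypothetical static method, used only to build a method-group delegate.
    public static int StaticCompare(int x, int y) => x.CompareTo(y);

    // Reports whether a delegate is bound to a target object
    // and whether its underlying method is static.
    public static string Describe(Delegate d)
    {
        string target = d.Target is null ? "null" : d.Target.GetType().Name;
        return $"target = {target}, method is static = {d.Method.IsStatic}";
    }

    public static void Main()
    {
        // Non-capturing lambda, explicitly marked static.
        Comparison<int> fromLambda = static (x, y) => x.CompareTo(y);

        // Delegate created directly from a static method group.
        Comparison<int> fromMethodGroup = StaticCompare;

        Console.WriteLine("lambda:       " + Describe(fromLambda));
        Console.WriteLine("method group: " + Describe(fromMethodGroup));
    }
}
```

If the lambda is lowered to an instance method on a compiler-generated class, as the SharpLab link shows, its `Target` should be non-null, while the method-group delegate should report a null target and a static method.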
mikernet · 3d ago
Making your comparer a struct instead of a class isn't helping; it's actually hurting. Your comparer benchmark currently has an unnecessary boxing operation and allocation on every iteration. Change it to a sealed class and use the typical singleton comparer pattern instead.

Besides that, delegate calls have always been faster than interface calls. The heuristic I use is that a virtual interface call has about twice the overhead of a virtual delegate call or virtual class call, so that is something to consider.

.NET 8+ complicates things, both in terms of benchmarks reflecting actual real-world performance and in figuring out exactly "why" X benches faster than Y, with all the dynamic PGO devirtualization going on as well. There's a lot of JIT magic happening these days that optimizes stuff. You can add the memory diagnoser to BDN to see allocations.
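A minimal sketch of the sealed-class singleton comparer pattern suggested above. The type name `IntAscendingComparer` and the sample data are hypothetical, since the gist's actual comparer is not shown in this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer; sealed so the JIT knows there are no subclasses.
public sealed class IntAscendingComparer : IComparer<int>
{
    // Single shared instance, so no comparer is allocated per Sort call.
    public static readonly IntAscendingComparer Instance = new();

    private IntAscendingComparer() { }

    public int Compare(int x, int y) => x.CompareTo(y);
}

public static class SingletonComparerDemo
{
    public static void Main()
    {
        int[] data = { 5, 3, 9, 1, 7 };

        // Binds to the Span<T> Sort overload that takes an IComparer<T>.
        data.AsSpan().Sort(IntAscendingComparer.Instance);

        Console.WriteLine(string.Join(", ", data));
    }
}
```

To see allocations in the benchmark itself, BenchmarkDotNet's `[MemoryDiagnoser]` attribute can be placed on the benchmark class; it adds allocation columns to the results table.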
Green Leaf (OP) · 3d ago
Well, it sounds like things will get complicated once Unity completes its transition to CoreCLR, which essentially enables using .NET 8+ in Unity.
mikernet · 2d ago
All the advice above applies regardless. It might just be harder to gauge how much the PGO optimizations actually apply in typical real-world usage of the code. I.e., in real-world code you might have more than one implementation of the comparer interface/delegate fighting for dominance, in which case the devirtualization PGO optimizations won't be as effective as they are in your benchmarks, where one implementation of the comparer clearly dominates in frequency of calls and is therefore the clear choice to prioritize for devirtualization.
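To make the "one implementation dominates" versus "several implementations compete" distinction concrete, here is a small sketch. All names (`Ascending`, `Descending`, `CountOutOfOrder`) are hypothetical, and the code only shows the shape of the two call patterns; it does not measure anything, and what the JIT actually does with each pattern depends on the runtime version and its PGO settings.

```csharp
using System;
using System.Collections.Generic;

public sealed class Ascending : IComparer<int>
{
    public int Compare(int x, int y) => x.CompareTo(y);
}

public sealed class Descending : IComparer<int>
{
    public int Compare(int x, int y) => y.CompareTo(x);
}

public static class CallSiteDemo
{
    // The single interface call site whose observed types a profile would record.
    public static int CountOutOfOrder(int[] data, IComparer<int> comparer)
    {
        int count = 0;
        for (int i = 1; i < data.Length; i++)
        {
            if (comparer.Compare(data[i - 1], data[i]) > 0) count++;
        }
        return count;
    }

    public static void Main()
    {
        int[] data = { 3, 1, 2, 5, 4 };
        IComparer<int> asc = new Ascending();
        IComparer<int> desc = new Descending();

        // Benchmark-like pattern: the call site only ever sees one implementation.
        int single = 0;
        for (int i = 0; i < 1000; i++) single += CountOutOfOrder(data, asc);

        // Closer to mixed real-world use: two implementations share the call site.
        int mixed = 0;
        for (int i = 0; i < 1000; i++) mixed += CountOutOfOrder(data, i % 2 == 0 ? asc : desc);

        Console.WriteLine($"single = {single}, mixed = {mixed}");
    }
}
```

A benchmark shaped like the first loop can make guarded devirtualization look better than it would in code shaped like the second loop, which is the caveat raised above.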