C# IEnumerable ToArray() Benchmark
I was benchmarking different kinds of IEnumerable methods out of curiosity and found something very strange: calling ToArray() on a temporary array is relatively slow.
I am aware that array.ToArray() creates a copy of array, but I would have thought that the compiler was smart enough to omit the copy if array is a temporary object. Am I missing something?
39 Replies
no, ToArray() explicitly makes a copy of the collection as an array
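A minimal sketch of what "explicitly makes a copy" means in practice, assuming a non-empty int[] as the source. The class and method names here (CopyDemo and so on) are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Linq;

public static class CopyDemo
{
    // True when ToArray() hands back a different array instance than the source.
    public static bool ToArrayReturnsNewInstance()
    {
        int[] source = { 1, 2, 3 };
        int[] copy = source.ToArray();
        return !ReferenceEquals(source, copy);
    }

    // True when writing to the copy leaves the source untouched.
    public static bool CopyIsIndependent()
    {
        int[] source = { 1, 2, 3 };
        int[] copy = source.ToArray();
        copy[0] = 42;
        return source[0] == 1;
    }

    public static void Main()
    {
        Console.WriteLine(ToArrayReturnsNewInstance());
        Console.WriteLine(CopyIsIndependent());
    }
}
```

Because callers can rely on getting an independent array, dropping the copy would change behavior that code is allowed to observe.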
So is the compiler really that bad even with Release Configuration? Is there any possibility to enable more Optimization?
what do you mean bad? you're expecting it to do something it shouldn't do
The Same code in C++ generates two identical assemblies..
it's not the same code because it's a completely different language
Just for completeness: this is the function that is called in between. As you can clearly see, a temporary object is created that is never accessed again
Ok my bad, I meant equivalent code
From the compiler's point of view, how is it supposed to know that callers of ConversionTest2 are not expecting a new copy?
It is indeed expecting a new copy! But it gets a new copy either way, since ConvertToPointEnumerable() already returns a copy
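The original snippets are not preserved in this log, so the following is only a guess at the shape of the code being discussed. The Point type, the parameter lists, and ConversionDirect are all hypothetical; only the names ConversionTest2 and ConvertToPointEnumerable come from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical point type; the real one is not shown in the thread.
public struct Point
{
    public double X, Y;
    public Point(double x, double y) { X = x; Y = y; }
}

public static class ConversionSketch
{
    // Builds a fresh Point[] but exposes it only as IEnumerable<Point>.
    // Assumes xs and ys have the same length.
    public static IEnumerable<Point> ConvertToPointEnumerable(double[] xs, double[] ys)
    {
        var points = new Point[xs.Length];
        for (int i = 0; i < xs.Length; i++)
            points[i] = new Point(xs[i], ys[i]);
        return points;
    }

    // The helper allocates the first array; ToArray() then allocates a second one.
    public static Point[] ConversionTest2(double[] xs, double[] ys)
        => ConvertToPointEnumerable(xs, ys).ToArray();

    // Hypothetical alternative: build the array once and return it directly.
    public static Point[] ConversionDirect(double[] xs, double[] ys)
    {
        var points = new Point[xs.Length];
        for (int i = 0; i < xs.Length; i++)
            points[i] = new Point(xs[i], ys[i]);
        return points;
    }

    public static void Main()
    {
        var a = ConversionTest2(new[] { 1.0, 2.0 }, new[] { 3.0, 4.0 });
        var b = ConversionDirect(new[] { 1.0, 2.0 }, new[] { 3.0, 4.0 });
        Console.WriteLine(a.Length == b.Length);
    }
}
```

Both methods return equivalent data. The difference is that the second one never asks for a copy in the first place, so nothing has to be optimized away.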
you're expecting more optimization from compiling to IL than makes sense
i think reflectronic is around so i'll wait for him to chime in
how does it know that ConvertToPointEnumerable returns a copy?
Like this is the sort of optimization the JIT might be able to make if everything gets inlined, but there isn't any guarantee that it happens
from an outside PoV you asked it to make two copies, so it shouldn't be surprising that it makes two copies.
the simple solution here is "don't make unnecessary copies by calling ToArray/List/etc"
I agree, it's not trivial, but a C++ compiler would have no problem seeing that the object is temporary and omitting the copy. On the other hand, C++ compilers are far slower, so I guess it's a reasonable tradeoff
the C++ compiler couldn't make this optimization unless the methods likewise got inlined.
Yes, but it would just inline it, since it sees the static method
you could ask #allow-unsafe-blocks , i don't know enough about the JIT internals to be that useful other than to tell you not to write inefficient code and hope the compiler fixes it 😛
Yeah... In this case it's easy, but sometimes doing things in Place is a little bit more complicated. I would have hoped that the Compiler does more heavy lifting, but I guess I have to do it myself...
So there is one thing I just found out: the compiler can omit copies in some cases. Chaining multiple ToArray() statements makes the code very slow. array.ToArray().ToArray().ToArray().ToArray() is much slower than array.ToArray() but interestingly ToImmutableArray() doesn't get slower no matter how often you chain it.
The first call does copy the data, but the chained calls won't.
because it checks if it's an immutable array first
it makes sense to omit copies for an immutable array because it can't be modified
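A small sketch of the behavior described here, assuming System.Collections.Immutable is available. ImmutableChainDemo and its method names are illustrative only. The == operator on ImmutableArray<T> compares the underlying array reference, which makes it a convenient way to check whether a chained call reused the storage.

```csharp
using System;
using System.Collections.Immutable;

public static class ImmutableChainDemo
{
    // True when a chained ToImmutableArray() call reuses the same backing array.
    public static bool ChainedCallSharesStorage()
    {
        int[] source = { 1, 2, 3 };
        ImmutableArray<int> first = source.ToImmutableArray();  // copies the mutable array
        ImmutableArray<int> second = first.ToImmutableArray();  // input is already immutable
        return first == second;
    }

    // True when the first call really copied, i.e. later writes to the source are not visible.
    public static bool FirstCallCopies()
    {
        int[] source = { 1, 2, 3 };
        ImmutableArray<int> first = source.ToImmutableArray();
        source[0] = 42;
        return first[0] == 1;
    }

    public static void Main()
    {
        Console.WriteLine(ChainedCallSharesStorage());
        Console.WriteLine(FirstCallCopies());
    }
}
```

The reuse is safe only because nobody can write to an ImmutableArray<T> afterwards. A plain T[] offers no such guarantee.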
Yes, just like it makes sense to omit a copy for a temporary object... But one is easier to check at compile time than the other
none of this is being done at compile time in either case
that's a runtime check in the code i shared
Good point, I'll see if the JIT compiler removes the check
The JIT will not remove the check. It would have to know somehow that the array isn't being held onto anywhere (aliased)
There could be impls that, for example, store the last few arrays in static variables somewhere. And this difference would be observable.
I was thinking about this. But in this case, if, say, all the ToArray calls were inlined, shouldn't it be able to see that the temporary copies never escape the scope and omit them?
No. Because it would have to know how the array type itself worked.
i was gonna say, arrays aren't just a chunk of memory like C style arrays are
well... specifically for the special case of array here, couldn't it? This isn't any arbitrary object...
there is no special case, this method takes an IEnumerable<T>
ah yeah didn't consider that
it doesn't know it's an array at all
It's not just ToArray, it's everything involved. It would need to know that there was no aliasing at all, that this was a fresh copy itself, that the copy being requested was exactly the same (including variance), etc.
The runtime would need to cheaply be able to track aliasing somehow.
It would be great if functions could specify that they return an unaliased object, then the JIT would know
well my thinking was if everything got inlined to essentially...
And so on. Those could be elided. But I guess ToArray taking an IEnumerable prevents that.
the implementation specializes for IIListProvider<T> and ICollection<T>, this hits the latter afaik which calls the collection's CopyTo method
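A simplified sketch of the ICollection<T> fast path described above. This is not the actual library source: CopyToArray is a made-up name, and the IIListProvider<T> branch is left out because that interface is internal to the runtime libraries.

```csharp
using System;
using System.Collections.Generic;

public static class ToArraySketch
{
    // Hypothetical stand-in for a ToArray-style method with an ICollection<T> fast path.
    public static T[] CopyToArray<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Runtime type check: arrays and List<T> both implement ICollection<T>.
        if (source is ICollection<T> collection)
        {
            var result = new T[collection.Count];
            collection.CopyTo(result, 0);  // bulk copy, no element-by-element enumeration
            return result;
        }

        // Fallback for arbitrary enumerables: enumerate once into a growable buffer.
        var buffer = new List<T>();
        foreach (T item in source)
            buffer.Add(item);
        return buffer.ToArray();
    }

    public static void Main()
    {
        int[] input = { 1, 2, 3 };
        int[] output = CopyToArray(input);
        Console.WriteLine(string.Join(",", output));
    }
}
```

The branch is chosen by a type check at run time, and the method only ever sees an IEnumerable<T>. Nothing in that signature says whether the argument is a temporary.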
yeah arrays obviously implement ICollection so I guess in theory something like this should be inlineable.
I dont quite understand this, can you elaborate?
so... like Jimmacle pointed out, ToArray operates on an enumerable. What if my implementation of the enumerable did something like...
that is, every time the object is enumerated, some other object gets modified.
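The snippet that belonged here is not preserved, so this is only a guess at the kind of enumerable meant. CountingSequence is a hypothetical type whose GetEnumerator bumps a counter, standing in for "some other object gets modified".

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical enumerable with an observable side effect on every enumeration.
public sealed class CountingSequence : IEnumerable<int>
{
    private readonly int[] _items = { 1, 2, 3 };

    // State that other code can observe after the sequence has been enumerated.
    public int EnumerationCount { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        EnumerationCount++;  // the side effect
        return ((IEnumerable<int>)_items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class SideEffectDemo
{
    public static void Main()
    {
        var sequence = new CountingSequence();
        int[] first = sequence.ToArray();
        int[] second = sequence.ToArray();
        Console.WriteLine(first.Length + second.Length);
        Console.WriteLine(sequence.EnumerationCount);
    }
}
```

If the runtime skipped one of the two ToArray() calls, the counter would end up with a different value. That is a visible change in program behavior, so the call cannot be dropped without proving that no such side effect exists.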
In general this would be quite hard, but aren't there cases where this would be easy? Like for some return objects? For simple functions, it's quite easy to guarantee that you return an unaliased object, and this could be inferred automatically for simple cases like my example.
For cases where you create an object using another method, you would need to know that the other method creates an unaliased object and so on. That wouldn't be cheap to track. But couldn't you manually provide a keyword at compile time just like const functions in c++?
That.... looks scary. I wish there was a way to guarantee no side effects in C# like in C++
it's not easy, as you have to know precisely how every operation works. you can't make any assumptions here. remember that you might run on any runtime, with any impl of any type that does whatever it wants.