What is IEnumerable<out T>?
Hi! I have written this code which works, but I don't understand what type of value 'var myQuery' actually is. Is it an array containing the filtered values?
// Data source.
int[] numbers = new int[] { 34, 57, 89, 12, 4, 6, 7, 123 };
// Data query. <---MY QUESTION
var myQuery = numbers.Where(number => number < 10);
// Data Execution.
foreach (var number in myQuery)
{
    Console.WriteLine(number);
}
no. its a lazily evaluated "sequence"
meaning when you do numbers.Where(...) its not actually looping over your list and calculating anything
its just remembering that "oh okay, I should only take the items that match this criteria"
ok! because I found this online, and from it I understood it as being an array created, saved in the var variable. And the variable type will change according to what is retrieved.
so, thats.. uh.. not correct.
and also using query syntax, which is a warcrime
why?
do you honestly think the query syntax is more readable than the method syntax?
it gets even worse when you add in even more linq methods
but thats a side issue, just be aware that most people greatly prefer the method syntax
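For reference, a small sketch of the two syntaxes side by side, using the array from the question. The variable names `viaQuerySyntax` and `viaMethodSyntax` are made up for illustration; both should describe the same filter.

```csharp
using System;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };

// Query syntax: reads a bit like SQL.
var viaQuerySyntax = from number in numbers
                     where number < 10
                     select number;

// Method syntax: the same filter as a method call with a lambda.
var viaMethodSyntax = numbers.Where(number => number < 10);

// Both are lazy sequences over the same array and yield the same items.
Console.WriteLine(string.Join(", ", viaQuerySyntax));
Console.WriteLine(string.Join(", ", viaMethodSyntax));
```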
ah you mean the version written in the example? yeah I wasnt looking at it very much, I just read the part about it being an array
okay. well, its not.
its a sequence of elements, but its not an array
and the important part is that its lazy
but what is the type then?
IEnumerable<int>
int[] numbers = new int[] { 34, 57, 89, 12, 4, 6, 7, 123 };
// Data query.
var myQuery = numbers.Where(number => number < 10);
// Data Execution.
foreach (var number in myQuery)
{
    Console.WriteLine(number);
}
dont care about what that is under the hood, its not important
this is what I wrote, what exactly is wrong with it?
nothing
nothing is wrong with it.
but myQuery is not an array.
ah okay
I dont really understand IEnumerable, been reading about it online for a while
its a lazy sequence
thats really all it is
but if its lazy, is there still nothing wrong with the code I wrote?
because lazy sounds bad
oh god no
its not lazy as in bad
its lazy as in "lazily evaluated"
ok thanks! not sure if you meant lazy the good way or the bad way
no its evaluated on demand
let me demonstrate
please do!
oh silly me
so, we see that it goes through all the numbers. that makes sense.
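A minimal sketch of that kind of demo, assuming a condition that also prints every number it checks. The `checks` counter and the messages are made up for illustration.

```csharp
using System;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };
int checks = 0; // counts how often the condition runs

var myQuery = numbers.Where(number =>
{
    checks++;
    Console.WriteLine($"checking {number}");
    return number < 10;
});

// Looping over the query runs the condition once per source item.
foreach (var number in myQuery)
{
    Console.WriteLine($"kept {number}");
}

Console.WriteLine($"condition ran {checks} times");
```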
but let me demonstrate WHEN it does it
Im all ears!
look at that.
so we create our query, print that its done, then we call myQuery.Max() to find the largest number
and only then are the numbers evaluated to see if they are below 10
you mean - nothing is actually executed until myQuery.Max(); is called?
yes!
thats what lazy evaluation means. it runs when needed, not before
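Something along these lines shows the timing. This is a sketch, assuming the same logging condition as before; `checksBeforeMax` is a made-up helper variable that records how many checks had run before `Max()` was called.

```csharp
using System;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };
int checks = 0;

var myQuery = numbers.Where(number =>
{
    checks++;
    Console.WriteLine($"checking {number}");
    return number < 10;
});

// Nothing has been checked yet: creating the query does no work.
int checksBeforeMax = checks;
Console.WriteLine("query created");

// Max() needs the actual values, so the condition runs only now.
int largest = myQuery.Max();
Console.WriteLine($"largest below 10: {largest}");
```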
so hence being "lazy" it wont move until it actually HAS to?
exactly
ah thank you
however, this has a side effect
which is?
what do you think the output will be here?
probably only the smallestNumber?
it evaluates the Where condition twice
and the testing lines of course
this is what we call "multiple enumeration"
and thats bad, I suppose?
could be
what if this list had... 1000000 items? and instead of checking that its below 10, we check if some database has that item
bit of a contrived example, but we obviously don't want to evaluate the condition twice if we dont need to
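A sketch of how multiple enumeration shows up, assuming a made-up `checks` counter inside the condition. Each of `Min()` and `Max()` walks the whole source again.

```csharp
using System;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };
int checks = 0;

var myQuery = numbers.Where(number =>
{
    checks++;
    return number < 10;
});

int smallestNumber = myQuery.Min(); // first pass over all 8 items
int largestNumber = myQuery.Max();  // second pass over all 8 items

Console.WriteLine($"min {smallestNumber}, max {largestNumber}");
Console.WriteLine($"condition ran {checks} times");
```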
so how do we fix that?
we can "realize" the query when we know we are not going to be modifying it further
here, I added a ToArray() call at the end of the query
this forces it to enumerate and stores the result as an array
I was just going to ask you how all this compares to an array
what does "enumerate" actually mean? I sorta understand it but I couldn't really explain it ... so I dont think I do lol
like counting things manually, one by one?
its a bit contextual, but "looping over the source" is one way to think of it
lets take Min and Max as an example
to find the smallest number in a sequence, we need to check every number in the sequence right?
yes
and the same goes for biggest, ye?
ye
so when I call both Min and Max on the query
we are actually going over the sequence twice
true
once for max, once for min
now here is the kicker
If you KNOW you only care about numbers below 10
but the list has... 1000000 items
its probably better to go over the list once, filter out only values below 10, save that, then let min and max go over that result
this results in a much smaller total amount of iterations than accessing the source list twice
ah so hence why you use the array to do the first filtering?
yes!
this means that its no longer an IEnumerable
you can see that Rider lets us know that myQuery is actually int[] now
because, for some reason, when we go from an IEnumerable to array, its not as heavy to do?
no, because that terminates the "lazy"
Where is lazy, remember?
so when we call .ToArray(), it runs on all the items and we store the result
ah ! because if we dont make it an array... it will loop the 1000000 items twice?
yes
I get it I think! but the array must also loop everything 10000 etc times ?
yes, but thats unavoidable
if you want every single item below 10 in a list of 10000000 items, you must check every item
but its bad to do that checking every time
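A sketch of the `ToArray()` variant, again assuming a made-up `checks` counter. The filtering happens once, and `Min()` and `Max()` then only look at the stored results.

```csharp
using System;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };
int checks = 0;

// ToArray() runs the filter right away and stores the matches.
int[] myQuery = numbers.Where(number =>
{
    checks++;
    return number < 10;
}).ToArray();

int checksAfterToArray = checks;

// These loop over the small result array, not over the source.
int smallestNumber = myQuery.Min();
int largestNumber = myQuery.Max();

Console.WriteLine($"min {smallestNumber}, max {largestNumber}");
Console.WriteLine($"condition ran {checks} times");
```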
also, methods like Where can be chained
look at those results. pretty interesting order if you ask me
it checks 12, but 12 is not below 10 so its discarded
then it checks 4, and thats kept
and then we check if 4 is above 0
so it evaluates the entire chain, per item, if possible
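A sketch of that per-item ordering, assuming two chained conditions (below 10, then above 0) that record what they check in a made-up `log` list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 34, 57, 89, 12, 4, 6, 7, 123 };
var log = new List<string>();

var myQuery = numbers
    .Where(n => { log.Add($"{n} below 10?"); return n < 10; })
    .Where(n => { log.Add($"{n} above 0?"); return n > 0; });

// Each item goes through the whole chain before the next item is touched.
foreach (var n in myQuery)
{
    log.Add($"result {n}");
}

foreach (var line in log)
{
    Console.WriteLine(line);
}
```

Items that fail the first condition never reach the second one, so the log only has an "above 0?" entry for 4, 6 and 7.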
so if I summarize it now... if we dont force the .ToArray, then the myQuery.Max and myQuery.Min will have to work with the complete 10000+ data to filter. But, if we use ToArray, we will store all of the interesting values in an array. And only this array will be evaluated.
I mean, the numbers stored in the array will be evaluated?
wdym?
by Min and Max? yes
because they are now being called on the array, not an IEnumerable or the original source
we do the .ToArray so that min and max can work with the <10 numbers directly
yes, at that point, "myQuery" is now an array and has no connection to the original numbers
its a new array in memory
but how is an array structurally connected to the IEnumerable?
because I mean, they both deal with storing a sequence
so they should be quite connected
lets rewind a bit
when you make an IEnumerable like myQuery, by running for example .Where on numbers
its very much still connected to the array
actually, lets demonstrate that
so we make a list with 100 in it
then we create the enumerable
after that we add 4 new numbers
then we loop over the query
and it prints 1 and 5, both numbers added AFTER the enumerable was created
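A sketch of that demo. The list starts with 100 as described; the four numbers added afterwards are assumptions, picked so that only 1 and 5 are below 10.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 100 };

// The query only remembers the list and the condition.
var myQuery = numbers.Where(number => number < 10);

// Added AFTER the query was created (values assumed for illustration).
numbers.AddRange(new[] { 1, 50, 5, 20 });

// The list is read only now, so the new items are seen.
foreach (var number in myQuery)
{
    Console.WriteLine(number);
}
```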
when you say "the enumerable" you mean the myQuery in this case?
yes
but a list is not 100% like an array, or?
true. its a wrapper around an array to let it grow as needed
arrays are fixed size, lists are not
but thats not the important bit
but in this example, it would work the same either with an array or list?
yes, except that arrays dont have an "AddRange" or "Add" method
but if we manually added them to an oversized array, yes
I just wanted to demonstrate that the datasource matters, but not the values in it
at the time of the creation of the enumerable, that is
This part in the bottom, var myQuery = ....
yup
is that where we create a enumerable "version" of the list ? we use the list to create a "lazy sequence"? but we can still modify the original list and the query will still work, even if we add or remove numbers?
yes
"is that where we create an enumerable 'version' of the list" - this isnt true, but the rest is
we just create an "IEnumerable" that uses the list as its source. the list is untouched
yes, I mean the list remains as it was originally. But its just as a source to create an IEnumerable.
yes
used as a source
okay! I need to thank you very much for taking the time, this all feels a lot clearer to me now
yw
its a pretty important concept in .NET so its worth the effort to understand
you can even make your own enumerables
this is a valid method:
it returns an enumerable that will contain 3 numbers (7,7,6). if you call First on it, it returns 7. If you call Last, 6.
you can even do some very funky stuff with this
dont loop over this enumerable
but feel free to do .Take(10) on it
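A sketch of an endless enumerable of that kind, assuming a hypothetical `CountForever` method that never stops yielding. Because evaluation is lazy, `Take(10)` stops asking after ten items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Never finishes on its own: a plain foreach over it would run forever.
IEnumerable<int> CountForever()
{
    int i = 0;
    while (true)
    {
        yield return i++;
    }
}

// Take(10) only pulls ten items, so this terminates.
int[] firstTen = CountForever().Take(10).ToArray();
Console.WriteLine(string.Join(", ", firstTen));
```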
and it would get the first 10 numbers
Also, as a slight addendum, you can do this:
This demonstrates that odd here is actually an object which contains a reference to the list xs, because you can add new items to the list after calling Where and they will still be considered by the enumerable returned by Where.
What Where actually returns is something like this:
And Where looks like
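Roughly, as a simplified sketch and not the real library source: a hypothetical `MyWhere` written as an iterator. The compiler turns it into an object that holds on to `source` and `predicate` and only loops when someone enumerates it. The contents of `xs` are assumed for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for Where: nothing runs until the result is enumerated.
IEnumerable<T> MyWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
    foreach (var item in source)
    {
        if (predicate(item))
        {
            yield return item;
        }
    }
}

var xs = new List<int> { 1, 2, 3 };
var odd = MyWhere(xs, x => x % 2 != 0);

// odd keeps a reference to xs, so an item added later is still picked up.
xs.Add(5);
Console.WriteLine(string.Join(", ", odd));
```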