How C# IEnumerable Can Kill Your Site’s Performance








When people come up to speed in C#, collection types present options.  And, sometimes, these options confuse.  Should you use C# IEnumerable, IList, or ICollection?  And what’s an IQueryable, anyway?


This type of wondering generally brings you to the official documentation for one of the types.  For example, look at the declaration of the IList<T> interface.


public interface IList<T> : ICollection<T>, IEnumerable<T>, IEnumerable


IList<T> extends IEnumerable<T>, so IEnumerable must be kind of a subset of IList.  Right? And, for the sake of performance, we should probably use only what we need.  A good rule of thumb then would be to see if you can easily use IEnumerable, and then switch to IList when you need to do something IEnumerable fails to do.  Seems reasonable.


Well, it seems reasonable until you really, truly understand both types.  But I’ll come back to that.

A Tale of Performance Woe


I think most people in the industry can relate to differences between behavior in development or test and in production.  In fact, we’ve immortalized the sentiment with a ubiquitous catchphrase.  It works on my machine!


When we think of this phrase, we usually think in terms of behavior.  Everything worked when you ran it locally but then blew up when running on someone else’s machine.  Oops.  Turned out you referenced an environment variable set on your machine but not elsewhere.


But it can also apply to performance concerns.  You code something up, run it against your local database, and all seems fine.  So you ship it to pre-prod or even production.  Only then do you discover some horrifyingly different behavior that you can’t explain with simple database scale.


After some pain and digging, you find yourself learning about something you’d never previously heard of called the N+1 problem.  Apparently, your elegant data access code isn’t quite as elegant as it first seemed.

Digging Into C# IEnumerable, Arrays, and Lists


Let’s now get to the subject of true understanding that I mentioned a moment ago.  I’ll start by explaining some collection types with which you may be more familiar.  Consider, for starters, the humble array.


Arrays date back about as far as programming, and they represent grouped values.  They have declared and fixed capacity, so if you make a 4 element array, it will always store 4 elements.  You can set those elements to different values and you can iterate over the array if you so choose, performing an operation on each element.


As I mentioned, arrays are old.  People have used them forever and, during that time, have bumped up against their limitations.  The property of having fixed length causes annoyance, and people find it convenient to perform quick operations like sorting and filtering duplicates.  So you wind up with heavier weight, more sophisticated types like List<T>, which implements the IList<T> interface.  That interface demands that its implementations support operations such as adding, removing and clearing.


Many implementations of IList will involve arrays at their core, decorating them with convenience functionality.  So you can think of them as convenient, heavyweight arrays.


But then if lists are heavyweight arrays, does that make IEnumerable<T>, with its single “GetEnumerator” method, something like a lighter weight array?  No, it turns out.  Not at all.

The Promise of IEnumerable


When using the C# IEnumerable construct, things get conceptually weird in short order.  At least they do if you’re not used to how this stuff works.


In a sense, arrays and lists are tangible.  You have a bunch of strings or integers sitting there in memory in a row, waiting for you to do things to them.  Easy to work with and easy to reason about.  When it comes to IEnumerable, however, you don’t have anything tangible.  Instead, you have a promise that you’ll have things sitting there in a row later when you really need them.  Perhaps you’ve heard the term lazy loading before (“don’t load until you have to”)?  Well, IEnumerable<T> makes possible the related concept of deferred execution, which postpones computation until someone actually enumerates the result.


Let’s make this more tangible with a simple allegory.  Let’s say that I’m a method, and my job is to return fruit.  When I return a List of fruit to you, you ask me for fruit and I hand it to you in an orderly fashion.  Here’s an apple, here’s an orange, etc.  But when I return an IEnumerable of fruit to you, I hand you something else entirely.  I hand you a note that says, “this note entitles you to some fruit — give me a call when you want to eat the first piece of fruit, and I’ll produce it at that time.”


Back in the programming world, you can think of this as a commitment or promise of sorts.  In reality, all IEnumerable promises you is an underlying strategy for delivering the next element — a state machine if you want to get technical about it.
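
To make the fruit allegory concrete, here is a minimal sketch.  The FruitStand class and its Log list are names I made up for illustration.  The point is that calling an iterator method only hands back the note, and the body runs piece by piece as the caller enumerates.

```csharp
using System;
using System.Collections.Generic;

public static class FruitStand
{
    // Hypothetical log that records when each piece of fruit is actually produced.
    public static readonly List<string> Log = new List<string>();

    // Iterator method: the compiler turns this into a state machine.
    // Nothing in the body runs until a caller starts enumerating.
    public static IEnumerable<string> GetFruit()
    {
        Log.Add("picking apple");
        yield return "apple";
        Log.Add("picking orange");
        yield return "orange";
    }

    public static void Main()
    {
        IEnumerable<string> note = GetFruit();   // only the "note" so far
        Console.WriteLine(Log.Count);            // 0: no fruit has been picked

        foreach (string fruit in note)
        {
            Console.WriteLine(fruit);            // apple
            break;                               // stop after the first piece
        }

        Console.WriteLine(Log.Count);            // 1: the orange was never picked
    }
}
```

Had GetFruit built and returned a List<string> instead, both pieces of fruit would have been picked before the caller ever saw the first one.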


Back in the World of Databases and Performance


So let’s go back now to the narrative.  You sit at your desk, implementing an MVC app over a database, and everything seems fine.  You follow the rule of thumb I mentioned earlier, using IEnumerable<T> (or IQueryable<T>) when they have sufficient functionality.  Everything goes along swimmingly as you develop and perform your testing.


When you do see some performance issues, they don’t crop up during database operations. Instead, they crop up as you cycle through records and build the pages you plan to return over the wire.  So whatever slowness you notice must come from some part of the web framework itself.  Your database queries return with lightning speed, so all is good!


Except, now you know that isn’t true.  As you step through the call stack down toward the database calls and then back up again, you understand what you’re really getting.  That call that seems to trigger a “SELECT *” and population of the results really just returns an assurance that it will make that call when the time comes.  That method returning an IEnumerable says, “yep, here you go: one note promising that we’ll make a database call later when you need the data.”
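
You can simulate this effect without a real database.  In the sketch below, FakeDatabase, QueryCount, and GetCustomerNames are all hypothetical stand-ins, and the counter plays the role of a round trip to the server.  Treat it as an illustration of the mechanism, not as what any particular ORM does under the hood.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FakeDatabase
{
    // Hypothetical counter standing in for real database round trips.
    public static int QueryCount;

    // Hypothetical data access method. Because it is an iterator, the
    // "query" runs only when, and every time, the result is enumerated.
    public static IEnumerable<string> GetCustomerNames()
    {
        QueryCount++;                            // simulated round trip
        yield return "Ada";
        yield return "Grace";
        yield return "Linus";
    }

    public static void Main()
    {
        IEnumerable<string> names = GetCustomerNames();
        Console.WriteLine(QueryCount);           // 0: nothing has run yet

        Console.WriteLine(names.Count());        // 3 (first enumeration)
        Console.WriteLine(names.First());        // Ada (second enumeration)
        Console.WriteLine(QueryCount);           // 2: the "query" ran twice

        List<string> cached = names.ToList();    // one more run, then cached
        Console.WriteLine(cached.Count);         // 3, read from memory
        Console.WriteLine(cached[0]);            // Ada, read from memory
        Console.WriteLine(QueryCount);           // 3: no further round trips
    }
}
```

The method call itself looks instantaneous, and the cost shows up later, wherever the enumeration happens.  That is why the slowness in the story seemed to come from page rendering rather than from data access.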


Keeping Performance Intact


With that knowledge in mind, you can have better antennae for the types of performance problems that seem small in dev, but come up huge in production.  Deferred execution (and lazy loading) offer a nice way to put off potential performance hits until the absolute last minute.  But, in doing that, they force a tradeoff upon you wherein they make execution harder to reason about.  When you make extensive use of these patterns, you can lose track of where the bottlenecks really lie.  So you must stay vigilant.


Also, bear in mind that not all IEnumerable implementations use deferred execution.  After all, List<T> implements IEnumerable<T> and it doesn’t defer anything.  Deferred execution is just a possibility, depending on which implementation you wind up with.  So keep your eye out for it.  And, speaking more broadly, make sure you understand exactly what IEnumerable is and what you’re signing up for when you use it.  It’s not just a lighter weight list.  This understanding will help you keep your code both performant and correct.
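
Here is a small sketch of that last point, using only standard LINQ calls (Select, ToList, Count).  The DeferredOrNot class and its Run method are names I invented for the example.  Two variables share the same static type, IEnumerable<int>, yet only one of them defers anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredOrNot
{
    // Returns the element counts seen through a materialized and a deferred sequence.
    public static (int SnapshotCount, int DeferredCount) Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Same static type, very different behavior.
        IEnumerable<int> snapshot = source.Select(n => n * 10).ToList(); // materialized now
        IEnumerable<int> deferred = source.Select(n => n * 10);          // runs on enumeration

        source.Add(4); // change the underlying data afterward

        return (snapshot.Count(), deferred.Count());
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.SnapshotCount); // 3: captured before the Add
        Console.WriteLine(result.DeferredCount); // 4: evaluated after the Add
    }
}
```

Nothing in the type signature tells you which kind you are holding, so you have to know, or find out, what sits behind the interface.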


  • Phil Vuollet

    IEnumerable in this way is an example of a leaky abstraction! Array implements IEnumerable so you might actually have an Array, but the point is that IEnumerable tells you that you have something you can Enumerate…but you are not meant to Add or Remove from it as in a List.
    When I was heavy into Linq with EF, I made a habit of calling ToArray to eager load before returning from the data reader classes. Another clear danger that came up which REALLY impacted performance was Lazy-Side-Loading…loading a related collection of entities per item in the original set…Students->Classes for example. The big aha moment was when this was happening within Automapper…grind to halt side-loading!
    You have to get behind the abstraction in order to properly use these on the producer side (implementing the method that returns IEnumerable).

  • Naftaly Weinberger

    I am an intermediate developer, and I always get beaten over the head by a senior guy, admonishing me for my copious use of ToList() rather than leaving it as an IEnumerable. He says ToList() is evil. I think it’s the opposite: not using it is bad.

    • Jim Pedid

      In memory-critical applications, converting things to a list is bad because you’re forcing the result set to end up in memory. One principal benefit of using an enumeration pattern is that, given the underlying implementation allows for it, you can enumerate over collections that cannot fit in memory. It really comes down to knowing what your implementation is when choosing to convert it to a list or not.

      • Naftaly Weinberger

        Point is that using IEnumerable just delays the fetching of info until the first time the list is enumerated, but when this happens, everything gets loaded into memory, so I don’t really see the harm in ToList(), unless you have a front-end grid which adds paging code to your SQL. On the other side, not using ToList() can sometimes be funny in behavior.

        From my limited experience, both are right: ToList() is not good to use everywhere and is not the worst either.

        • john_way

          If you dig deeper into IEnumerable and its various implementations, and work with a correct understanding thereof, you will still end up saving on both performance and overhead.

          I’m not saying ToList() is bad, but you do have to take into account that if you are processing 10000 records and you only intend to update 1000 of them, then 90% of what you loaded is pure overhead.

          IEnumerable gives you the flexibility of stopping enumeration if certain criteria are met. For example, if you need to process bank batch payments and it fails on the 6th record, you can exit before even converting the remaining 9994 records and inform the end user of said failures.

          With proper mappings, proper handling, proper use of yield, and many other factors taken into account you can greatly improve the performance and overhead of what is returned, and working with exactly what you need.

          I’ve seen developers use EF with eager loading, populating referenced objects in a complete DataSet with over 250 tables, only to update 2 records on the master table. Running that query with thousands and thousands of records took more than 2 minutes on a decent spec machine, while disabling eager loading and stepping away from ToList() executed in less than a second.

          I tend to agree with the Author in the sense that if you choose to use IEnumerable, you have to ensure that you understand exactly what it is, and what it does … otherwise you could also end up enumerating the entire set, which would have been exactly the same as calling ToList().

          As Jim has also replied, it all depends on your implementation thereof.

    • Andrew Stanton

      Using ToList() more than once on the same IEnumerable is bad. It creates multiple in memory List objects.

      Multiple enumeration of an IEnumerable is as bad if not worse. One iteration of the items may produce different results from another iteration, or the iteration of each item may be dynamically creating items, etc.

      The earlier you can turn IEnumerable into IList or List, the sooner you can avoid both of those problems. It’s not a problem to call ToList() (or ToArray()) in the general case, but Entity Framework will be forced to execute the query and load the records into memory when calling .ToList(). This isn’t bad either, as this was going to happen anyway; it’s just not as efficient as it could be.