Improving .NET Application Performance Part 14: Collections

In our previous article in this series we discussed array optimization in .NET. In this article we’ll focus on working with collections.  Of course there are the two basic types of collections: lists and dictionaries, where lists are index-based and dictionaries are key-based. Collections implement the IEnumerableICollection, or IList interfaces. If the types implement IDictionary separately or in addition to these three interfaces, they are referred to as Dictionaries.

Performance issues with Collections

In this section we’ll point out the performance-related issues associated with collections.




Boxing Issues

If you use a collection such as an ArrayList to store value types such as integer or float, every item is boxed (a reference type is created and the value copied) when it is added to the collection. If you are adding many items to the collection, the overhead can be excessive. The problem is illustrated by the following code snippet.

ArrayList al = new ArrayList();

for (int i=0; i<1000;i++)

  al.Add(i); //Implicitly boxed because Add() takes an object

int f = (int)al[0]; // The element is unboxed

To prevent this problem, consider using an array instead, or creating a custom collection class for your specific value type.

Thread Safety

Collections are generally not thread safe by default. It is safe for multiple threads to read the collection, but any modification to the collection produces undefined results for all threads that access the collection. To make a collection thread safe, do the following:

  • Create a thread safe wrapper using theSynchronized method, and access the collection exclusively through that wrapper.

// Creates and initializes a new ArrayList.

ArrayList myAr = new ArrayList();   


// add objects to the collection   

// Creates a synchronized wrapper around the ArrayList.

ArrayList mySyncdAr = ArrayList.Synchronized( myAr );


Use the lock statement in C# (or SyncLock in Visual Basic .NET) on the SyncRoot property when accessing the collection.

ArrayList myCollection = new ArrayList();

lock( myCollection.SyncRoot ) {

  // Insert your code here.


You can also implement a synchronized version of the collection by deriving from the collection and implementing a synchronized method using the SyncRoot property.

Enumeration Overhead

The .NET Framework collections provide an enumerator by overriding IEnumerable.GetEnumerator, which is less than optimal for a number of reasons:

  • TheGetEnumerator method is virtual, so the call cannot be inlined.
  • The return value is anIEnumerator interface instead of an exact type; as a result, the exact enumerator cannot be known at compile time.
  • TheMoveNext method and Current properties are again virtual and so cannot be inlined.
  • Currentrequires a return type of System.Object, rather than a more specific data type which may require boxing and unboxing, depending on the data types stored in the collection.

As a result, there are both managed heap and virtual function overhead associated with foreach on simple collection types. This can be a significant factor in performance-sensitive regions of your application. We’ll talk about enumeration overhead further into this article.

Collection Guidelines

This section summarizes guidelines that help you to use .NET Framework collection types most efficiently and to avoid common performance mistakes:

Analyze Your Requirements Before Choosing the Collection Type

Try to determine if you really need to use a collection. Arrays are usually more efficient, especially if you need to store value types. You can use the following criteria when determining which collection is appropriate:

  • Do you need to sort your collection?
  • Do you need to search your collection?
  • Do you need to access each element by index?
  • Do you need a custom collection?

Do You Need to Sort Your Collection?

If you need to sort your collection, do the following:

  • UseArrayList to bind the read-only sorted data to a data grid as a data source. This is better than using a SortedList if you only need to bind read-only data using the indexes in the ArrayList (for example, because the data needs to be displayed in a read-only data grid). The data is retrieved in an ArrayList and sorted for displaying.
  • UseSortedList for sorting data that is mostly static and needs to be updated only infrequently.
  • UseNameValueCollection for sorting strings.
  • SortedListpresorts the data while constructing the collection. This results in a comparatively expensive creation process for the sorted list, but all updates to the existing data and any small additions to the list are automatically and efficiently resorted as the changes are made. Sortedlist is suitable for mostly static data with minor updates.

Do You Need to Search Your Collection?

If you need to search your collection, do the following:

  • UseHashtable if you search your collection randomly based on a key/value pair.
  • UseStringDictionary for random searches on string data.
  • UseListDictionary for sizes less than 10.

Do You Need to Access Each Element by Index?

If you need to access each element by index, do the following:

  • UseArrayList and StringCollection for zero-based index access to the data.
  • UseHashtableSortedListListDictionary, and StringDictionary to access elements by specifying the name of the key.
  • UseNameValueCollection to access elements, either by using a zero-based index or specifying the key of the element.
  • Remember that arrays do this better than any other collection type.

Do You Need a Custom Collection?

Consider developing a custom collection to address the following scenarios:

  • Develop your own custom collection if you need to marshal by reference because all standard collections are passed by value. For example, if the collection stores objects that are relevant only on the server, you might want to marshal the collection by ref rather than by value.
  • You need to create a strongly typed collection for your own custom object to avoid the costs of upcasting or downcasting, or both. Note that if you create a strongly typed collection by inheritingCollectionBase or Hashtable, you still end up paying the price of casting, because internally, the elements are stored as objects.
  • You need a read-only collection.
  • You need to have your own custom serializing behavior for your strongly typed collection. For example, if you extendHashtable and are storing objects that implement IDeserializationCallback, you need to customize serialization to factor for the computation of hash values during the serialization process.
  • You need to reduce the cost of enumeration.

Initialize Collections to the Right Size When You Can

Initialize collections to the right size if you know exactly, or even approximately, how many items you want to store in your collection; most collection types let you specify the size with the constructor, as shown in the following example.

ArrayList ar = new ArrayList (43);

Even if the collection is able to be dynamically resized, it is more efficient to allocate the collection with the correct or approximate initial capacity (based on your tests).

Consider Enumerating Overhead

A collection supports enumeration of its elements using the foreach construct by implementing IEnumerable.

To reduce the enumeration overhead in collections, consider implementing the Enumerator pattern as follows:

  • If you implement IEnumerable.GetEnumerator also implement a non-virtual GetEnumerator method. Your class’sGetEnumerator method should call this nonvirtual method, which should return a nested public enumerator struct as shown in the following code sample.

class MyClass : IEnumerable


  // non-virtual implementation for your custom collection

  public MyEnumerator GetEnumerator() {

    return new MyEnumerator(this); // Return nested public struct


  // IEnumerator implementation

  public IEnumerator.GetEnumerator() {

    return GetEnumerator();//call the non-interface method



The foreach language construct calls your class’s nonvirtual GetEnumerator if your class explicitly provides this method. Otherwise, it calls IEnumerable.GetEnumerator if your class inherits from IEnumerable. Calling the nonvirtual method is slightly more efficient than calling the virtual method through the interface.

  • Explicitly implement the IEnumerator.Current property on the enumerator struct. The implementation of .NET collections causes the property to return aObject rather than a strongly typed object; this incurs a casting overhead. You can avoid this overhead by returning a strongly typed object or the exact value type rather than System.Object in your Current property. Because you have explicitly implemented a non-virtual GetEnumerator method (not the IEnumerable.GetEnumerator) the runtime can directly call the Enumerator.Current property instead of calling the IEnumerator.Current property, thereby obtaining the desired data directly and avoiding the casting or boxing overhead, eliminating virtual function calls, and enabling inlining.

Your implementation should be similar to the following.

// Custom property in your class

//call this property to avoid the boxing or casting overhead

Public MyValueType Current {

  MyValueType obj = new MyValueType();

  // the obj fields are populated here

  return obj;


// Explicit member implementation

Object IEnumerator.Current {

get { return Current} // Call the non-interface property to avoid casting


Implementing the Enumerator pattern involves having an extra public type (the enumerator) and several extra public methods that are really there only for infrastructure reasons. These types add to the perceived complexity of the API and must be documented, tested, versioned, and so on. As a result, you should adopt this pattern only where performance is paramount.

The following sample code illustrates the pattern.

public class  ItemTypeCollection: IEnumerable


   public struct MyEnumerator : IEnumerator


      public ItemType Current { get { } }

      object IEnumerator.Current { get { return Current; } }

      public bool MoveNext() { }


   public MyEnumerator GetEnumerator() { }

   IEnumerator IEnumerable.GetEnumerator() { }


To take advantage of JIT inlining, avoid using virtual members in your collection unless you really need extensibility. Also, limit the code in the Current property to returning the current value to enable inlining, or alternatively, use a field.

Prefer to Implement IEnumerable with Optimistic Concurrency

There are two legitimate ways to implement the IEnumerable interface. With the optimistic concurrency approach, you assume that the collection will not be modified while it is being enumerated. If it is modified, you throw an InvalidOperationException. An alternate pessimistic approach is to take a snapshot of the collection in the enumerator to isolate the enumerator from changes in the underlying collection. In most general cases, the optimistic concurrency model provides better performance.

Consider Boxing Overhead

When storing value types in a collection, you should consider the overhead involved, because the boxing overhead can be excessive depending on the size of the collection and the rate of updating or accessing the data. If you do not need the functionality provided by collections, consider using arrays to avoid the boxing overhead.

Consider for Instead of foreach

Use for instead of foreach (C#) to iterate the contents of arrays or collections in performance critical code, particularly if you do not need the protections offered by foreach.

Both foreach in C# and For Each in Visual Basic .NET use an enumerator to provide enhanced navigation through arrays and collections. As discussed earlier, typical implementations of enumerators, such as those provided by the .NET Framework, will have managed heap and virtual function overhead associated with their use.

If you can use the for statement to iterate over your collection, consider doing so in performance sensitive code to avoid that overhead.

Implement Strongly Typed Collections to Prevent Casting Overhead

Implement strongly typed collections to prevent upcasting or downcasting overhead. Do so by having its methods accept or return specific types instead of the generic object type. StringCollection and StringDictionary are examples of strongly typed collections for strings.

Be Efficient with Data in Collections

When dealing with very large numbers of objects, it becomes very important to manage the size of each object. For example, it makes little difference whether you use a short (Int16), int/Integer (Int32), or long (Int64) for a single variable, but it can make a huge difference if you have a million of them in a collection or array. Whether you are dealing with primitive types or complex user-defined objects, make sure you do not allocate more memory than you need if you will be creating a large number of these objects.

Collection Types

Let’s summarize the main issues that you should consider when using the different collection types.


The ArrayList class represents a list that dynamically resizes as new items are added to the list and its current capacity is exceeded. Consider the following recommendations when using an ArrayList:

  • UseArrayList to store custom object types and particularly when the data changes frequently and you perform frequent insert and delete operations.
  • UseTrimToSize after you achieve a desired size (and there are no further insertions expected) to trim the array list to an exact size. This also optimizes memory use. However, be aware that if your program subsequently needs to insert new elements, the insertion process is now slower because the ArrayList must now dynamically grow; trimming leaves no room for growth.
  • Store presorted data and useBinarySearch for efficient searches. Sorting and linear searches using Contains are expensive. This is essentially for one-off sorting of data, but if you need to perform frequent sorting, a SortedList might be more beneficial because it automatically re-sorts the entire collection after each insertion or update.
  • AvoidArrayList for storing strings. Use a StringCollection


Hashtable represents a collection of key/value pairs that are organized based on the hash code of the key. Consider the following recommendations when using Hashtable:

  • Hashtableis suitable for large number of records and data that may or may not change frequently. Frequently changing data has an extra overhead of computing the hash value as compared to data which does not change frequently.
  • UseHashtable for frequently queried data; for example, product catalogues where a product ID is the key.


HybridDictionary is implemented internally, using either a ListDictionary when the collection is small or a Hashtable when the collection increases in size. Consider the following recommendations:

  • UseHybridDictionary for storing data when the number of records is expected to be low most of the time, with occasional increases in size. If you are sure that the size of collection will be always high or always low, you should choose Hashtable and ListDictionary  This avoids the extra cost of the HybridDictionary, which acts as a wrapper around both these collections.
  • UseHybridDictionary for frequently queried data.
  • Do not useHybridDictionary to sort data. It is not optimized for sorting.


Use ListDictionary to store small amounts of data (fewer than 10 items).This implements the IDictionary interface using a singly-linked list implementation. For example, a factory class that implements the Factory pattern might store instantiated objects in a cache using a ListDictionary, so they can be served directly from the cache the next time a creation request is made.


This represents a sorted collection of associated string keys and string values that can be accessed either with the key or with the index. For example, you may use a NameValueCollection if you need to display subjects registered by students in a particular class because it can store the data in alphabetical order of student names.

  • UseNameValueCollection to store strings of key/value pairs in a pre-sorted order. Note that you can also have multiple entries with the same key.
  • UseNameValueCollection for frequently changing data where you need to insert and delete items regularly.
  • UseNameValueCollection when you need to cache items and for fast retrieval.


Queue represents a first-in, first-out object collection. Consider the following recommendations for using Queue:

  • UseQueue when you need to access data sequentially, based on priority. For example, an application that scans the waiting list of plane reservation requests and gives priority by allocating vacant seats to passengers at the beginning of queue.
  • UseQueue when you need to process items sequentially in a first-in, first-out manner.
  • If you need to access items based on a string identifier, use aNameValueCollection


The SortedList represents a collection of key/value pairs that are sorted by the keys and are accessible by key and by index. New items are added in sorted order and the positions of existing items are adjusted to accommodate the new items. The creation costs associated with a SortedList are relatively high, so you should use it in the following situations:

  • The collection can be used where the data is mostly static and only a few records need to be added or updated over a period of time; for example, a cache of employee information. This can be updated by adding a new key based on employee number, which is added quickly in theSortedList, whereas an ArrayList needs to run the Sorting algorithm all over again so the delta change is faster in SortedList.
  • UseSortedList for fast object retrieval using an index or key. It is well suited for circumstances where you need to retrieve a set of sorted objects, or for querying for a specific object.
  • Avoid usingSortedList for large data changes because the cost of inserting the large amount of data is high. Instead, prefer an ArrayList and sort it by calling the Sort  The ArrayList uses the QuickSort algorithm by default. The time taken by ArrayList is much less for creating and sorting than the time taken by the SortedList.
  • Avoid usingSortedList for storing strings because of the casting overhead. Use a StringCollection


This represents a simple last-in, first-out object collection. Consider the following recommendations for using a Stack:

  • UseStack in scenarios where you need to process items in a last–in, first-out manner. For example, an application that needs to monitor the 10 most recent users visiting a Web site over a period of time.
  • Specify the initial capacity if you know the size.
  • UseStack where you can discard the items after processing it.
  • UseStack where you do not need to access arbitrary items in the collection.


This represents a collection of strings and is a strongly typed ArrayList. Consider the following recommendations for using StringCollection:

  • UseStringCollection to store string data that changes frequently and needs to be retrieved in large chunks.
  • UseStringCollection for binding string data to a data grid. This avoids the cost of downcasting it to a string during retrieval.
  • Do not useStringCollection for sorting strings or to store presorted data.


This is a Hashtable with the key strongly typed as a string, rather than an object. Consider the following recommendations for using StringDictionary:

  • UseStringDictionary when the data does not change frequently because the underlying structure is a Hashtable used for storing strongly typed strings.
  • UseStringDictionary to store static strings that need to be frequently queried.
  • Always preferStringDictionary over Hashtable for storing string key/value pairs if you want to preserve the string type to ensure type safety.