When you think ASP, think...
Recent Articles
All Articles
ASP.NET Articles
ASPFAQs.com
Message Board
Related Web Technologies
User Tips!
Coding Tips
Search

Sections:
Book Reviews
Sample Chapters
Commonly Asked Message Board Questions
JavaScript Tutorials
MSDN Communities Hub
Official Docs
Security
Stump the SQL Guru!
Web Hosts
XML
Information:
Advertise
Feedback
Author an Article

ASP ASP.NET ASP FAQs Message Board Feedback
 
Print this Page!
Published: Wednesday, June 22, 2005

Specialized Collections

By Scott Mitchell


Introduction


The .NET Framework 1.x contains a variety of built-in collection types that can be found in the System.Collections namespace. These built-in collections include the commonly used ArrayList and Hashtable collections, along with the CollectionBase class which serves as a great class to extend to build strongly-typed collections. There are also a variety of other collection classes in this namespace, such as: the Stack and Queue, which are essentially ArrayLists underneath the covers, but place restrictions on the order with which elements can be accessed from the collection; there's also a BitArray class for efficiently storing a collection of Boolean values. (For a thorough discussion of data structures, including a look at the most common collection classes in the .NET Framework, be sure to read An Extensive Examination of Data Structures; there's also a version for .NET 2.0.)

In addition to the classes found directly in the System.Collections namespace, there are a few more classes that can be found in the System.Collections.Specialized namespace. These classes, as the namespace name implies, are specialized versions of the classes in the System.Collections namespace. For example, there are a couple of strongly-typed collections found here - the StringCollection and StringDictionary classes provide strongly-typed versions of the ArrayList and Hashtable classes, respectively. There's also the ListDictionary and HybridDictionary classes that can be more efficient classes than the vanilla Hashtable class.

In this article we'll look at these specialized collection classes and see what features they offer over the base collection classes. Read on to learn more!

- continued -

The Strongly-Typed StringCollection and StringDictionary


The ArrayList and Hashtable classes are two of the most used collection classes by developers. An ArrayList maintains a variable collection of elements that can be indexed ordinally, just like with an array. (That is, you can do: myArrayList[integerValue] to get a particular value at a particular ordinal index.) Underneath the covers the ArrayList is implemented using an array of type Object.

The nice thing about an ArrayList is that you don't need to worry about sizing the ArrayList. With an array, on the other hand, you need to explicitly size the array and, if your storage needs outpace the array's size, you need to resize the array. With an ArrayList this resizing happens automatically in the Add(item) method. If you are adding a new element to the ArrayList and there is insufficient space, the ArrayList automatically increases its underlying array's size, hiding the complexity from you.

Since an ArrayList is implemented using an array of type Object, and since all types in the .NET Framework are derived, directly or indirectly, from type Object, an ArrayList can hold elements of any type. Strings, integers, decimals, DataSets, custom class instances... it doesn't matter. Anything can be chucked into an ArrayList. This has its advantages and disadvantages. It's nice because you aren't limited to storing a specific type of object in the ArrayList. However, this flexibility comes at a cost in both performance and code maintainability. By storing everything as an Object, when moving items in and out of the ArrayList the runtime must determine what type of data really is going in and out. For value types - integers, decimals, Booleans, and so on - this involves an additional overhead known as boxing. (A discussion of boxing and unboxing is beyond the scope of this article; refer to An Extensive Examination of Data Structures for a much more in-depth discussion on this issue.) There's also problems with code maintainability. By not having strongly-typed objects in your ArrayList you open the door to hard to catch run-time errors if you accidentally put in an object of the wrong type. With a strongly-typed collection, however, attempting to put in an invalid type will result in a compile-time error, which means you'll catch it when building your solution vs. not catching it until actually running the code (which, invariably, you won't test the particular exception-inducing case until you're giving a live demo to your boss and boss's boss).

The StringCollection in the System.Collections.Specialized provides a strongly-typed string collection. It behaves just like the ArrayList, the only difference being instead of being able to store items of any type, you can only store items of type String. Underneath the covers the StringCollection uses an ArrayList. It then provides the same set of ArrayList methods - Add(), IndexOf(), Remove(), and so on - but limits their input and return types to strings. Therefore, if you have a case where you need to store a collection of strings, I'd encourage you to use the StringCollection rather than the type-unsafe ArrayList.

Just as the StringCollection is a string-based, type-safe version of the ArrayList, the StringDictionary is a string-based, type-safe version of the Hashtable class. A Hashtable - sometimes called an associative array - is a mapping between a key and a value. Arrays maintain this mapping using ordinal keys (0, 1, 2, ...). With a Hashtable, however, the key can be anything - a string, a number, an object reference, and so on. Like an ArrayList a Hashtable stores values of type Object meaning that you can add any type to a Hashtable. The StringDictionary provides restrictions on types that can be inserted into a Hashtable. With the StringDictionary both the keys and values must be of type String.

Since the StringDictionary and StringCollection classes are merely type-safe representations of the Hashtable and ArrayList classes, their use is never required. That is, there is never a situation when you could be able to use the StringCollection but could not use an ArrayList instead. However, if situations to use these strongly-typed variations do present themselves I would encourage you to utilize these specialized collections.

Moving Forward: Strongly-Typed Collections in .NET 2.0
The .NET Framework version 2.0 introduces the concept of generics. With generics it is very easy to create strongly-typed collections that can be customized by type at compile-time. For example, with 2.0 you can create a strongly-typed list of integers using:
List<int> myListOfIntegers = new List<int>();

Essentially you can, when writing your code, extend the List class to use the type of your choice. There's no more need to create specialized classes in order to enjoy strong typing. (For more on generics be sure to read Introducing .NET Generics.)

More Efficient Dictionaries


A dictionary is an abstract data structure that stores a set of elements referenced by some key. (The Hashtable is a concrete implementation of a dictionary.) Dictionaries typically offer, at minimum, the following functionality: the ability to add an item to the dictionary, find an item by key, and remove an item. How these functions are provided depends on the actual concrete data structure. For example, when adding an element to a Hashtable the Hashtable maintains an array that actually stores these values. The way the value being added to the Hashtable is mapped to the array is that the key is converted into an integer value within the bounds of the inner array. The bottom line of this approach is that it's very quick to find, insert, or retrieve an element from a Hashtable. (The nitty-gritty details on how a Hashtable is implemented, along with a look at performance, is beyond the scope of this article - you can read about the details in Part 2 of An Extensive Examination of Data Structures.)

Another way to implement a dictionary is to use a linked list. (A linked list is a list of items; see Part 4 of the Extensive Examination of Data Structures article series for a primer on linked lists.) You can think of a linked list as an ArrayList. To add a new item to a dictionary being implemented using a linked list, simply add the new element on at the end of the linked list. Finding an element by key in the list, however, takes more work - with a Hashtable we could quickly find the element since there was a function to map the key to an array index; with the linked list we need to search through the entire list, one element at a time, until we find the key we're after, and then we can return that associated key's element value.

Clearly there's a performance difference between a dictionary implemented as a Hashtable and a linked list, with the Hashtable boasting improved performance in adding, finding, and removing elements. So it would seem to make sense to use a Hashtable for a dictionary in all circumstances, right? Well, no, actually. While the Hashtable's performance will outshine the linked list's, it only does so with sufficiently large collections. If you are only storing, say, five elements in your collection, the overhead that comes with the Hashtable will render it less performant than the linked list implementation.

For this reason the .NET Framework provides a dictionary implemented using a linked list, the ListDictionary class in the System.Collections.Specialized namespace. The ListDictionary is the optimal implementation to use for a dictionary when there are 10 or fewer items. When there are more than 10 items, it's best to use a Hashtable. So, if you know you are going to have 10 or fewer items in your collection and you need an associative array (as opposed to an ordinal one), use the ListDictionary class; however, if you anticipate more than 10 items, use the Hashtable.

Of course many times you do not know how many total elements you'll have in your collection. Thankfully the .NET Framework includes the HybridDictionary class. The HybridDictionary uses the appropriate implementation for a dictionary depending on the collection's size. That is, while your collection has 10 or fewer items, the HybridDictionary internally uses a ListDictionary to maintain the elements. Once you go over 10 elements, however, the HybridDictionary switches over to using a Hashtable. This switch from one data structure to another is done transparently to you, so you get the best data structure for the size of the collection without having to know how many items you'll have in advance or having to go through the headache of switching between the two implementations yourself.

Enumerating the Elements in a HybridDictionary
Like with all collections in the .NET Framework, you can easily enumerate through the elements in a HybridDictionary using a foreach loop. One thing to realize, though, is that the order of these elements will change when the HybridDictionary crosses over from the ListDictionary to a Hashtable. The ListDictionary stores its elements in the order with which they were added. However, a Hashtable's ordering is based upon how the key is transmogrified into the index used to store the value in the internal array. The short of it is that a Hashtable will not necessarily enumerate through its elements in the order with which they were added.

This behavior cannot only cause confusion, but also problems. The Page class uses a HybridDictionary to store the client-side script that has been programmatically added to a page by a page developer. If you rely on the order with which you add script blocks, you may be in for a nasty re-ordering surprise if you add more than 10 script blocks. See Peter Blum's RegisterScripts product for a more thorough discussion on why this is a problem and a potential solution.

Conclusion


In this article we examined some of the lesser known collections in the .NET Framework, those buried away in the System.Collections.Specialized namespace. In particular we looked at two strongly-typed collection classes, StringCollection and StringDictionary, as well as the HybridDictionary, which maintains a dictionary using the most efficient implementation based on the number of elements in the collection.

Happy Programming!

  • By Scott Mitchell



  • ASP.NET [1.x] [2.0] | ASPMessageboard.com | ASPFAQs.com | Advertise | Feedback | Author an Article