User Tools

Site Tools


notes:csharp:collections

Collections in C#

Interface Usage
IEnumerable<T> - Iterate over a collection
- Bind data to a list control
- Use LINQ methods
ICollection<T> - Add/remove items in a collection
- Count items in a collection
- Clear a collection
IList<T> - Manage the order of items in a collection
- Get an item by its index

If you need to create a custom collection, consider the following classes as your base class:

  • CollectionBase
  • List<T>
  • ObservableCollection<T>

List

You can use methods of the List class to:

  • add items to the list
  • remove items from the list
  • access an item by its index
  • find items
  • sort the list

If you know the initial number of elements at compile-time, you can specify the capacity of the list in its constructor or you can set the Capacity property.

Appending elements to the List is efficient but inserting can be slow (since all elements after the insertion point have to be shifted to make a free slot). Searching is efficient if the BinarySearch method is used on a list that has been sorted. Otherwise, it's inefficient because each item must be individually checked.

The Sort method uses internally the quick sort algorithm.

using System.Collections;
using System.Collections.Generic;
...
// Find elements in a list.
List<string> names = new List<string>() { "Ala", "Ola", "Lena", "Alina" };
int indexFirst = names.FindIndex(n => n == "Lena");          // indexFirst == 2
int index = names.IndexOf("Lena");                           // index == 2
                                                             // IndexOf uses IEquatable<T> to compare elements
string nameFirst = names.Find(n => n.StartsWith("A"));       // name == "Ala"
string nameLast = names.FindLast(n => n.StartsWith("A"));    // name == "Alina"
bool exists = names.Exists(n => n == "Ola");                 // exists == true
List<string> names2 = names.FindAll(n => n.StartsWith("A")); // names2 == { "Ala", "Alina" }
 
// Remove elements.
names.RemoveAt(2);       // remove the 3th element; names = { "Ala", "Ola", "Alina" }
names.RemoveRange(0, 2); // remove first two elements; names = { "Alina" }
names.RemoveAll(s => s.StartsWith("A")); // remove all elements starting with 'A'
 
// Convert a list of strings to a list of integers.
List<string> numstr = new List<string>() { "2", "5", "8" };
List<int> nums = numstr.ConvertAll<int>(s => Convert.ToInt32(s)); // nums = {2,5,8}
 
// Find only even numbers.
List<int> even = nums.FindAll(n => (n % 2) == 0); // even = {2,8}
 
// Add elements to the list using the AddRange method. 
// The AddRange method accepts an object of type IEnumerable<T>.
nums.AddRange(new int[] { 10, 13, 20 });
 
// Copy a collection to another collection.
List<int> numsCopy = new List<int>(nums);
 
// Add elements of a custom type to a list.
List<Point> points = new List<Point>()
{
    new Point { X=1, Y=2, Z=3 },
    new Point { X=4, Y=5, Z=6 },
    new Point { X=7, Y=8, Z=9 }
};
...
public struct Point
{
    public int X;
    public int Y;
    public int Z;
} 
 
// Add elements of a custom type to a list using a LINQ technique that creates a collection of objects 
// from a list of strings.
List<Book> books = new List<Book>();
 
books.AddRange(
new List<string>
{
    "C++ in 10 Days",
    "DirectX for Teenagers",
    "JavaScript Bible"
}.Select(b => new Book
{
    Title = b,
    Author = "not specified"
}).ToList());
...
public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
}  

Dictionary

Dictionary is implemented as a hash table, which makes retrieving a value close to O(1). Dictionary does not allow duplicated keys.

TryGetValue vs. ContainsKey:

  • Using the TryGetValue method is a very efficient way to retrieve values in a program that frequently tries keys that are not in the dictionary.
  • The ContainsKey requires an additional operation to retrieve the value corresponding to the key. Therefore, using ContainsKey is less efficient than TryGetValue.
using System;
using System.Collections.Generic;
using System.Linq;
 
...
 
Dictionary<int, string> words = new Dictionary<int, string>
{
    { 1, "House" },
    { 2, "Dog" },
    { 3, "Tree" }
};
 
bool b1 = (words.ContainsKey(2)); // true
bool b2 = (words.ContainsKey(8)); // false
 
string val;
if (words.TryGetValue(2, out val))
    Console.WriteLine(val); // Dog
 
// Enumerate dictionary entries.
foreach (KeyValuePair<int, string> word in words)
{
    Console.WriteLine("{0}: {1}", word.Key, word.Value);
}
 
// Convert a LINQ result to a dictionary.
Dictionary<int, string> words2 =
    words.Where(w => w.Value.EndsWith("e")).ToDictionary(w => w.Key, w => w.Value);
// words2 = { {1,"House"}, {3,"Tree"} }

Queue

A queue is a first-in, first-out (FIFO) collection i.e. the elements are accessed in the same order as they were added. The queue can be used to store data temporarily i.e. an item is removed from the queue when it is retrieved.

using System.Collections.Generic;
 
...
 
Queue<string> q = new Queue<string>();
 
// Add elements to the end of the queue.
q.Enqueue("A");
q.Enqueue("B");
q.Enqueue("C");
foreach (string s in q) Console.Write(s); // ABC
 
// Remove the oldest element from the queue.
q.Dequeue();
foreach (string s in q) Console.Write(s); // BC
 
// Return the oldest element, but does not remove it from the queue.
string p = q.Peek(); // B

Stack

A stack is a last-in, first-out (LIFO) collection i.e. the last item added to the stack is the first one to be removed. Just as with a queue, items are removed when reading them.

using System.Collections.Generic;
 
...
 
// Add elements to the stack.
Stack<string> stack = new Stack<string>();
stack.Push("A");
stack.Push("B");
stack.Push("C");
foreach (string s in stack) Console.Write(s); // CBA
 
// Remove the newest element from the stack.
string e3 = stack.Pop(); // e1 == "C"
foreach (string s in stack) Console.Write(s); // BA
 
// Return the newest element, without removing it from the stack.
string e4 = stack.Peek(); // e2 == "B"

LinkedList

LinkedList is a doubly linked list. In a doubly linked list each node references the node before, the node after, and the actual element.

Inserting elements can be efficient although finding where to insert the node can slow down the insert operation (each node must be traversed).

var list = new LinkedList<string>();
list.AddFirst("A"); // A
list.AddLast("Z"); // A Z
 
list.AddAfter(list.First, "B"); // A B Z
list.AddAfter(list.First.Next, "C"); // A B C Z
list.AddBefore(list.Last, "D"); // A B C D Z
 
list.RemoveFirst(); // B C D Z
list.RemoveLast(); // B C D
 
LinkedListNode<string> node = list.Find("D");
list.Remove(node); // B C
list.AddFirst(node); // D B C
 
foreach (string s in list)
    Console.WriteLine(s); // D B C

Other collections

  • SortedList - Represents a collection of key/value pairs that are sorted by the keys and are accessible by key and by index.
  • SortedDictionary - Represents a collection of key/value pairs that are sorted on the key.
  • HashSet - Represents a collection that contains no duplicate elements and has no particular order.
  • SortedSet - Represents a collection of objects that is maintained in sorted order.
  • BitArray - Manages a compact array of bit values, which are represented as Booleans, where true indicates that the bit is on (1) and false indicates the bit is off (0).

Use the SortedList class when you populate the list all at once from sorted data and you will rarely perform insertion or deletion operations on the data. The SortedList generic class uses less memory than SortedDictionary and it also offers better performance when the list is populated all at once from sorted data. SortedList is slower than SortedDictionary when performing insertions and deletions on unsorted data.

Both HashSet and SortedSet collections provide the Contains method. This method is executed fast as both of the collections use a hash-backed lookup.

Concurrent collections

Concurrent collections are thread-safe collections introduced in .NET 4:

  • BlockingCollection<T>
  • ConcurrentBag<T>
  • ConcurrentDictionary<TKey, T>
  • ConcurrentQueue<T> (FIFO)
  • ConcurrentStack<T> (LIFO)

BlockingCollection<T>

BlockingCollection abstracts the underlying data storage mechanism to a collection that implements the IProducerConsumerCollection<T> interface. When creating a BlockingCollection object, it is possible to specify the type of a collection to use. By default, ConcurrentQueue is used.

BlockingCollection blocks removal attempts from the collection until data is available for removal. For adding data, it is possible to enforce an upper-bound on the number of elements allowed in the collection. If that limit is reached, adding an element blocks the calling thread until space is available.

Always use the BlockingCollection methods to add or remove elements rather than directly calling the methods of the underlying collection.

The CompleteAdding method signals to the BlockingCollection that no more items will be added. If there are any threads waiting for new items, they won't be blocked anymore.

ConcurrentBag<T>

ConcurrentBag keeps all items in a bag ;) It stores items in no particular order and it enables duplicates.

ConcurrentBag implements IEnumerable<T> so it is possible to iterate over it using foreach. Thread-safety is accomplished by making a snapshot of the collection before iterating over it. As a result, newly added items are not available in the snapshot collection.

More examples

Subclass your custom collection from Collection<T>

Example: Traverse and display nodes of a linked list using an enumerator:

The IEnumerable interface defines a method GetEnumerator that returns an enumerator. Thanks to that the iteration logic can be farmed off to another class. Also, several consumers can enumerate the collection at once without interfering with each other.

using System.Collections.Generic;
...
LinkedList<string> list = new LinkedList<string>();
list.AddLast("AAA");
list.AddLast("BBB");
list.AddLast("CCC");
 
// Set the enumerator to a position before the first element.
IEnumerator<string> en = list.GetEnumerator();
 
// The first call to the MoveNext method positions the enumerator at the first element.
// When the enumerator is positioned past the last element, it returns false, 
// which causes the while loop to terminate. 
while (en.MoveNext())
{
    // The Current property returns the element at the current position. 
    Console.WriteLine(en.Current); // output: AAA BBB CCC
}

Collections

Interfaces

  • MSDN: IEnumerable - the most basic interface that a collection implement
  • MSDN: ICollection<T> - defines methods to manipulate a collection
  • MSDN: IList<T> - represents a collection of objects that can be individually accessed by index
  • MSDN: ISet<T> - provides the base for the abstraction of sets
  • MSDN: IDictionary<TKey, TValue> - represents a generic collection of key/value pairs
notes/csharp/collections.txt · Last modified: 2017/07/28 by leszek