You can work on .NET projects for quite some time without thinking about equality and without running into too many problems. Eventually, however, the time comes when you have to dig into those odd problems and fix things properly.
Many of the fundamental utility types that are heavily used in .NET programs rely on a correct implementation of equality to work properly. Without one, things will seem to work, but they won’t be doing quite what you expect. (You might find this discussion relevant even if you’re working on a different technology stack, as many of these ideas transfer across even if the implementation details differ slightly.)
List
The .NET List<T> is an ordered, indexable collection of strongly typed items. Adding items to the list and removing them by index all works fine without explicitly defining equality.
But if you want to test to see if an item already exists in the list by calling .Contains(T), or if you want to remove a specific item regardless of index by calling .Remove(T), then you’ll have problems.
By default, every class inherits an implementation of equality that merely tests whether two references point to the exact same object.
Such a default implementation works well enough, at least until you start having to deal with serialization, caching, or a persistent store.
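To make that concrete, here is a minimal sketch of the List<T> behaviour described above. The Customer class and the ListEqualityDemo helper are hypothetical names invented for illustration; the point is only that Customer does not override equality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: it does NOT override Equals or GetHashCode,
// so it inherits reference equality from object.
public sealed class Customer
{
    public Customer(string name) { Name = name; }
    public string Name { get; }
}

public static class ListEqualityDemo
{
    // Looks for a second instance carrying the same data, as you might get
    // after deserialization or a round trip through a persistent store.
    public static bool ContainsEquivalentCopy()
    {
        var customers = new List<Customer> { new Customer("Ada") };
        var copy = new Customer("Ada");
        return customers.Contains(copy); // false: different object reference
    }

    // Remove(T) locates the item the same way, so nothing is removed.
    public static bool RemoveEquivalentCopy()
    {
        var customers = new List<Customer> { new Customer("Ada") };
        return customers.Remove(new Customer("Ada")); // false: no match found
    }

    // The exact same instance is found, which is why things seem to work at first.
    public static bool ContainsSameInstance()
    {
        var ada = new Customer("Ada");
        var customers = new List<Customer> { ada };
        return customers.Contains(ada); // true
    }
}
```

As long as you only ever hold one instance per customer, the default works; the trouble starts when a second, equivalent instance appears.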
HashSet
A .NET HashSet<T> is a collection of items that won’t admit duplicates. You can iterate items if you want, but don’t assume that new items will be added to the end of the sequence.
The duplicate check that is done whenever you add an item relies on the .Equals() and .GetHashCode() methods defined on object, and assumes that your class overrides them where necessary.
As discussed above, the default implementation that every class inherits is good enough for many situations, but can (and will) fail you in others.
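A small sketch of how this plays out, assuming a hypothetical Tag class that leaves equality at its default. The string case is included for contrast, since string already defines value equality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: no Equals or GetHashCode override.
public sealed class Tag
{
    public Tag(string text) { Text = text; }
    public string Text { get; }
}

public static class HashSetEqualityDemo
{
    // Two separate instances with the same text are both admitted,
    // because the set sees two unrelated object references.
    public static int CountAfterAddingTwoEquivalentTags()
    {
        var tags = new HashSet<Tag>();
        tags.Add(new Tag("dotnet"));
        tags.Add(new Tag("dotnet"));
        return tags.Count; // 2
    }

    // string implements value equality, so the duplicate is rejected.
    public static int CountAfterAddingTwoEqualStrings()
    {
        var tags = new HashSet<string>();
        tags.Add("dotnet");
        tags.Add("dotnet");
        return tags.Count; // 1
    }
}
```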
Dictionary<K,V>
Similarly, the workhorse Dictionary<K,V> class has deep requirements for a proper implementation of equality.
Just as a HashSet relies on both .Equals() and .GetHashCode(), so does the dictionary when managing object keys.
Individual values don’t get a free pass either, as the .ContainsValue(V) method relies on equality.
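The following sketch shows both halves of that: a key lookup and a value lookup that each miss. ProductCode, Price and DictionaryEqualityDemo are hypothetical names made up for this example, and neither class overrides equality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type for illustration: reference equality only.
public sealed class ProductCode
{
    public ProductCode(string value) { Value = value; }
    public string Value { get; }
}

// Hypothetical value type for illustration: reference equality only.
public sealed class Price
{
    public Price(decimal amount) { Amount = amount; }
    public decimal Amount { get; }
}

public static class DictionaryEqualityDemo
{
    private static Dictionary<ProductCode, Price> BuildPrices()
    {
        return new Dictionary<ProductCode, Price>
        {
            [new ProductCode("A-100")] = new Price(9.99m)
        };
    }

    // A fresh key with the same text is not recognised as the stored key.
    public static bool FindsEquivalentKey()
    {
        return BuildPrices().ContainsKey(new ProductCode("A-100")); // false
    }

    // ContainsValue compares values with their default equality too.
    public static bool FindsEquivalentValue()
    {
        return BuildPrices().ContainsValue(new Price(9.99m)); // false
    }
}
```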
What problems does this cause?
Without a correct and reliable implementation of equality, you might experience some very odd behaviour …
- A set that contains multiple copies of the same value
- A list where Contains(foo) returns false even though the list actually contains foo
- A dictionary where ContainsKey(key) returns false, even though the key has been used
- A cache that does nothing to improve performance because the required value is never found in the cache
None of these are much fun to encounter, particularly as they don’t involve nice clean exceptions but instead lead to your program doing the wrong thing, perhaps subtly.
Fortunately, it’s not actually hard to get equality right; you just need to think about it a little.
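As a rough illustration of what getting it right can look like, here is one common pattern: implement IEquatable<T> and override both Equals(object) and GetHashCode() so that they agree. PostalCode and EqualityFixedDemo are hypothetical names, and this is a sketch for a simple immutable class rather than a complete recipe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable type for illustration, compared by its Value.
public sealed class PostalCode : IEquatable<PostalCode>
{
    public PostalCode(string value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }

    public string Value { get; }

    // Strongly typed equality, picked up by EqualityComparer<T>.Default.
    public bool Equals(PostalCode other)
    {
        return !ReferenceEquals(other, null)
            && string.Equals(Value, other.Value, StringComparison.Ordinal);
    }

    // Untyped equality delegates to the typed version.
    public override bool Equals(object obj)
    {
        return Equals(obj as PostalCode);
    }

    // Equal instances must produce equal hash codes, so hash the same
    // data that Equals compares, using the same comparison rules.
    public override int GetHashCode()
    {
        return StringComparer.Ordinal.GetHashCode(Value);
    }
}

public static class EqualityFixedDemo
{
    public static bool ListFindsEquivalentCopy()
    {
        var codes = new List<PostalCode> { new PostalCode("6011") };
        return codes.Contains(new PostalCode("6011")); // true
    }

    public static int SetCountAfterDuplicateAdd()
    {
        var codes = new HashSet<PostalCode>();
        codes.Add(new PostalCode("6011"));
        codes.Add(new PostalCode("6011"));
        return codes.Count; // 1
    }

    public static bool DictionaryFindsEquivalentKey()
    {
        var names = new Dictionary<PostalCode, string>
        {
            [new PostalCode("6011")] = "Wellington"
        };
        return names.ContainsKey(new PostalCode("6011")); // true
    }
}
```

Keeping the class immutable matters here: if Value could change while an instance is sitting in a HashSet or being used as a dictionary key, its hash code would change and the collection would lose track of it. Note also that the == operator is a separate matter and still compares references for a class unless you overload it.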