After symmetry, another aspect of the equality contract is .GetHashCode(). When you first override .Equals(object), the C# compiler will helpfully remind you that you must also override .GetHashCode().

Strictly speaking it’s only a warning (CS0659), so the compiler won’t stop you - but treat it as though you don’t get a choice about whether to implement it or not.

The relationship between .GetHashCode() and .Equals(object) is perhaps best demonstrated with some code:

// If the two instances are equal, they should have the same hash code
if (alpha.Equals(beta))
{
    Debug.Assert(alpha.GetHashCode() == beta.GetHashCode());
}

// If they have different hash codes, they cannot be equal
if (alpha.GetHashCode() != beta.GetHashCode())
{
    Debug.Assert(!alpha.Equals(beta));
}

Note particularly that these rules don’t reverse - having the same hash code doesn’t make two things equal, and being unequal doesn’t require different hash codes.
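
To make the one-way nature of these rules concrete, here is a minimal sketch. The Word type and the WordDemo class are hypothetical, invented only for this illustration: Word compares on content but hashes on length, so two different words of the same length share a hash code without being equal.

```csharp
using System;

// Hypothetical type for illustration only: compares on content, hashes on length.
public readonly struct Word : IEquatable<Word>
{
    private readonly string _text;

    public Word(string text)
        => _text = text ?? throw new ArgumentNullException(nameof(text));

    public bool Equals(Word other)
        => string.Equals(_text, other._text, StringComparison.Ordinal);

    public override bool Equals(object obj)
        => obj is Word other && Equals(other);

    // A weak but valid hash code: equal words always have equal lengths.
    // The null check covers default(Word), where _text was never assigned.
    public override int GetHashCode()
        => _text?.Length ?? 0;
}

public static class WordDemo
{
    public static void Run()
    {
        var cat = new Word("cat");
        var dog = new Word("dog");

        Console.WriteLine(cat.GetHashCode() == dog.GetHashCode()); // True
        Console.WriteLine(cat.Equals(dog));                        // False
    }
}
```

A hash code this weak would make hash-based collections slow, but it doesn’t break the contract - equal values still always produce equal hash codes.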

A Puzzle

See if you can spot the subtle bug in the following semantic type …

public struct TimeSeriesId : IEquatable<TimeSeriesId>
{
    private readonly string _id;
    public TimeSeriesId(string id)
        => _id = IsValidId(id)
            ? id 
            : throw new ArgumentNullException(nameof(id));
    public static bool IsValidId(string id)
        => !string.IsNullOrWhiteSpace(id);
    public bool Equals(TimeSeriesId id)
        => string.Equals(_id, id._id, StringComparison.OrdinalIgnoreCase);
    public override bool Equals(object obj)
        => obj is TimeSeriesId id && Equals(id);
    public override int GetHashCode()
        => _id.GetHashCode();
}

Don’t worry, I’m not going to make you wait until my next post to find the answer.

But I do want to put in a bit of a gap so that you don’t accidentally see the answer before you’re ready.

Here’s a hint. Given the context of our current discussion, you know it has to relate to an inconsistency between Equals() and GetHashCode().

But how?

Consider this example code - what would it output to the console?

static void Main(string[] args)
{
    var alpha = new TimeSeriesId("foo");
    var beta = new TimeSeriesId("Foo");
    var gamma = new TimeSeriesId("baz");

    Console.WriteLine(alpha.Equals(beta));
    Console.WriteLine(alpha.GetHashCode() == beta.GetHashCode());

    Console.WriteLine(alpha.Equals(gamma));
    Console.WriteLine(alpha.GetHashCode() == gamma.GetHashCode());

    Console.ReadLine();
}

You might expect it to output true/true/false/false, but it doesn’t.

It actually writes true/false/false/false.

The hash codes for alpha and beta are different, even though .Equals() says they are equal. How can this be?

The implementation of Equals() is case insensitive (so “foo” and “Foo” are considered equal), but the implementation of GetHashCode() is case sensitive - it cares about letter case and gives a different value for each identifier.

Fortunately, the folks behind the .NET Framework have already solved this problem and the required fix is already present and ready for use. Here’s a conforming implementation of GetHashCode():

    public override int GetHashCode()
        => StringComparer.OrdinalIgnoreCase.GetHashCode(_id);
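
Why does this matter in practice? Hash-based collections such as HashSet<T> and Dictionary<TKey, TValue> use the hash code to narrow the search before they ever call Equals(). The sketch below assumes a hypothetical FixedTimeSeriesId (a copy of the type above with the corrected hash code) and a hypothetical LookupDemo helper. Neither name comes from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical copy of TimeSeriesId using the corrected GetHashCode().
public readonly struct FixedTimeSeriesId : IEquatable<FixedTimeSeriesId>
{
    private readonly string _id;

    public FixedTimeSeriesId(string id)
        => _id = !string.IsNullOrWhiteSpace(id)
            ? id
            : throw new ArgumentNullException(nameof(id));

    public bool Equals(FixedTimeSeriesId other)
        => string.Equals(_id, other._id, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object obj)
        => obj is FixedTimeSeriesId other && Equals(other);

    // Case-insensitive hash code, consistent with the case-insensitive Equals().
    public override int GetHashCode()
        => StringComparer.OrdinalIgnoreCase.GetHashCode(_id);
}

public static class LookupDemo
{
    // Stores "foo", then looks it up as "Foo".
    public static bool CanFindByDifferentCase()
    {
        var known = new HashSet<FixedTimeSeriesId> { new FixedTimeSeriesId("foo") };
        return known.Contains(new FixedTimeSeriesId("Foo"));
    }
}
```

With the corrected hash code the lookup succeeds. With the original case-sensitive GetHashCode(), the same lookup would very likely return false, because the set would search under the wrong hash code and never get as far as calling Equals().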

Testing

Assuming you have good unit test coverage of Equals() that covers all of the edge cases, you can catch this kind of error by exercising GetHashCode() with the same test data - consider these tests:

        [Theory]
        [InlineData("foo", "foo")]
        [InlineData("foo", "Foo")]
        [InlineData("foo", "FOO")]
        public void Equals_GivenEquivalentIds_ReturnsTrue(
            string left, string right)
            => Assert.Equal(
                new TimeSeriesId(left), 
                new TimeSeriesId(right));

        [Theory]
        [InlineData("foo", "foo")]
        [InlineData("foo", "Foo")]
        [InlineData("foo", "FOO")]
        public void GetHashCode_GivenEquivalentIds_ReturnsSameValue(
            string left, string right) 
            => Assert.Equal(
                new TimeSeriesId(left).GetHashCode(), 
                new TimeSeriesId(right).GetHashCode());
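
If you have many types to cover, the same check can be wrapped in a small reusable helper. This is only a sketch. HashContract and Holds are hypothetical names, and the helper checks just the one rule discussed here - equal values must have equal hash codes.

```csharp
using System;
using System.Collections.Generic;

public static class HashContract
{
    // True when the pair satisfies "equal values must have equal hash codes".
    // Unequal pairs pass trivially, because the contract says nothing about them.
    public static bool Holds<T>(T left, T right, IEqualityComparer<T> comparer = null)
    {
        // Fall back to the type's own Equals()/GetHashCode() when no comparer is given.
        comparer = comparer ?? EqualityComparer<T>.Default;

        return !comparer.Equals(left, right)
            || comparer.GetHashCode(left) == comparer.GetHashCode(right);
    }
}
```

In a test you might then write Assert.True(HashContract.Holds(new TimeSeriesId("foo"), new TimeSeriesId("Foo"))), which would fail against the original implementation whenever the two hash codes differ.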

Conclusion

A correct implementation of GetHashCode() can be tricky if you don’t take the time to think it through. Unfortunately, a minor mistake here can lead to subtle bugs in your production system, where it won’t simply fail but will instead do the wrong thing some of the time. And those are some of the worst bugs to track down.

About this series

Why does the implementation of Equality matter in .NET and how do you do it right?

Posts in this series

Why is Equality important in .NET?
Types of Equality
Equality has Symmetry
Equality and GetHashCode
Implementing Entity Equality
Implementing Value Equality
Types behaving badly
