Generating Hash Codes

Generating a hash code for an immutable priority queue introduces an unusual challenge: How do we generate a consistent hash code regardless of the order items are added to the queue?

Remember from recent posts that we’ve decided that two queues are equal if they return the same items in the same sequence and have the same prioritization rule.

One approach to implementation would be to iterate through the queue, calculating the hash code by combining the hash codes of our contained items in sequence. This would be very heavyweight, so we’d need to cache the value once calculated. With a little allowance to ensure we weight values at the same priority equally, the code would look like this:

private int CalculateHashCode()
{
    var result = _comparer.GetHashCode();
    var index = 0;
    IImmutablePriorityQueue<T> cursor = this;
    T item = Head;
    while (!cursor.IsEmpty)
    {
        if (_comparer(item, Head) != 0)
        {
            index++;
        }

        item = Head;
        unchecked
        {
            result += item.GetHashCode() * index;
        }

        cursor = cursor.Dequeue();
    }

    return result;
}

The problem with this approach is that hash codes are supposed to be lightweight, and this is most definitely not.

As an alternative, we could calculate the hash code up front, in our constructor. The challenge with this is that we need to ensure that our calculation gives the same result regardless of the order items are enqueued. This limits us to just a few choices of operation - addition, multiplication and exclusive-or.

To illustrate, let’s just calculate our hash codes by summing up the items in our queue. For our DualItemImmutablePriorityQueue<T> implementation, we add a new member, some code in the constructor to initialize the member, and override GetHashCode() to return the precalculated value:

// Our new member
private readonly int _hashcode;

// Add this into the constructor
_hashcode =
    Head?.GetHashCode() ?? 0
    + _rest.GetHashCode();

public override int GetHashCode() => _hashcode;

We avoid the possibility of a NullReferenceException by protecting the call to Head.GetHashCode() with the elvis operator.

The implementation for BranchingImmutablePriorityQueue<T> is very similar:

// Our new member
private readonly int _hashcode;

// Add this into the constructor
_hashcode =
    (Head?.GetHashCode() ?? 0)
    + _left.GetHashCode()
    + _right.GetHashCode();

On reflection, I’m not sure which of these two approaches is superior. Calculating the hash code up front is likely more performant, but the quality of the result might not be particularly high. Iterating through the entire queue to calculate the hash code will return a much higher quality result, but the cost of calculation may be excessive.

What do you think?

Immutable Priority Queues
Generating Hash Codes

About this series

Posts in this series

Comments

Immutable Priority QueuesGenerating Hash Codes

About this series

Posts in this series

Comments

Immutable Priority Queues
Generating Hash Codes