Recently a friend of mine noticed some code in a book that was, shall we say, considerably sub-optimal. It’s worth looking at that code to see how both its performance and its readability can be easily improved.
Here’s the original code:
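The original looked something like the following sketch (reconstructed from the description below, not the book’s exact listing): for each word, a hand-written inner loop scans `Keys` to decide whether the word has already been counted.

```csharp
using System.Collections.Generic;

public static class WordCounts
{
    // Sketch of the original approach: a manual linear search through Keys
    // before every insert or update. Names here are illustrative, not the book's.
    public static IDictionary<string, int> Original(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            var found = false;
            foreach (var key in counts.Keys)
            {
                if (key == word)
                {
                    found = true;
                    break;
                }
            }

            if (found)
            {
                counts[word] = counts[word] + 1;
            }
            else
            {
                counts[word] = 1;
            }
        }

        return counts;
    }
}
```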
Aside: I’m not going to call out the author of this code by name, nor mention the actual book it came from. Doing so would be unfair, both because the author would have no way to respond, and because I haven’t read the book myself (which means there may be other factors at play that justify the author’s choices). The point of this post is to see what we can learn.
What does this code do?
Given a sequence of words, the code counts how many times each word occurs.
For testing, I took a copy of World War Z, extracted the text, and measured each implementation against it using BenchmarkDotNet.
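For reference, the harness looks roughly like this minimal BenchmarkDotNet sketch; the file name, tokenisation, and wiring to the word-count methods are assumptions, not the actual setup used for the numbers below.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class WordCountBenchmarks
{
    private List<string> _words;

    [GlobalSetup]
    public void Setup()
    {
        // Placeholder file name; split on whitespace and lower-case everything
        // so "The" and "the" count as the same word.
        var text = File.ReadAllText("world-war-z.txt").ToLowerInvariant();
        _words = new List<string>(
            text.Split(new[] { ' ', '\t', '\r', '\n' },
                       StringSplitOptions.RemoveEmptyEntries));
    }

    [Benchmark]
    public IDictionary<string, int> Original() => WordCounts.Original(_words);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<WordCountBenchmarks>();
}
```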
Method | Mean | Error | StdDev |
---|---|---|---|
Original | 1,806.73 ms | 35.070 ms | 51.406 ms |
For what it’s worth, here are the 16 most common words in the book:
Word | Count |
---|---|
the | 6758 |
to | 3292 |
i | 2588 |
of | 2548 |
a | 2540 |
and | 2475 |
was | 2131 |
that | 1727 |
it | 1576 |
in | 1408 |
we | 1196 |
they | 1005 |
you | 998 |
were | 959 |
had | 940 |
for | 904 |
Taking 1,800 ms to do a word count of the entire novel might be fast enough for some purposes. But I believe the original code is not very efficient - we can do better.
The inner loop through Keys is entirely unnecessary - the class Dictionary<K,T> includes ContainsKey(), with an implementation that’s far more efficient than the simple linear search used above. (In fact, the entire implementation is substantially more sophisticated than a simple list of tuples.)
Using ContainsKey(), we can rewrite the code to this:
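A sketch of that rewrite, assuming the same method shape as before (added alongside the earlier sketch in the `WordCounts` class):

```csharp
// Sketch: let Dictionary's hash-based ContainsKey do the lookup.
public static IDictionary<string, int> UsingContainsKey(IEnumerable<string> words)
{
    var counts = new Dictionary<string, int>();
    foreach (var word in words)
    {
        if (counts.ContainsKey(word))
        {
            counts[word] = counts[word] + 1;
        }
        else
        {
            counts[word] = 1;
        }
    }

    return counts;
}
```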
Observe how much shorter this code is than the original.
Running the benchmark again, we can see that it’s tremendously faster:
Method | Mean | Error | StdDev |
---|---|---|---|
Original | 1,806.73 ms | 35.070 ms | 51.406 ms |
UsingContainsKey | 15.10 ms | 0.250 ms | 0.234 ms |
In fact, it’s very nearly 120 times faster.
We can do better on the performance front, however, by leveraging the way that TryGetValue() sets its out parameter to the default value if the key isn’t found. We can rewrite the code without any branching:
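A sketch of the branch-free version, again assuming the same method shape:

```csharp
// Sketch: TryGetValue leaves 'count' at default(int), i.e. zero, when the word
// is new, so a single assignment covers both the first and later occurrences.
public static IDictionary<string, int> UsingTryGetValue(IEnumerable<string> words)
{
    var counts = new Dictionary<string, int>();
    foreach (var word in words)
    {
        counts.TryGetValue(word, out var count);
        counts[word] = count + 1;
    }

    return counts;
}
```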
This is marginally faster, at the cost of being a bit clever:
Method | Mean | Error | StdDev |
---|---|---|---|
Original | 1,806.73 ms | 35.070 ms | 51.406 ms |
UsingContainsKey | 15.10 ms | 0.250 ms | 0.234 ms |
UsingTryGetValue | 13.01 ms | 0.257 ms | 0.286 ms |
When talking performance, however, it’s always important to ask: how fast is good enough? The time taken by a developer writing the code - and by their colleagues maintaining it - is also a factor.
For example, we can rewrite this method using LINQ extension methods and make it far easier to understand & maintain:
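One natural shape for the LINQ version - the exact query used for the benchmark may differ - groups the words and counts each group:

```csharp
// Sketch: requires 'using System.Linq;'
public static IDictionary<string, int> UsingLinq(IEnumerable<string> words)
    => words.GroupBy(word => word)
            .ToDictionary(group => group.Key, group => group.Count());
```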
This isn’t as fast as the methods we’ve seen, though it’s still 70 times faster than the original. Arguably it is easier to read; the trade-off may be appropriate if the performance is good enough.
Method | Mean | Error | StdDev |
---|---|---|---|
Original | 1,806.73 ms | 35.070 ms | 51.406 ms |
UsingContainsKey | 15.10 ms | 0.250 ms | 0.234 ms |
UsingTryGetValue | 13.01 ms | 0.257 ms | 0.286 ms |
UsingLINQ | 25.51 ms | 0.493 ms | 0.782 ms |
Conclusion
What lessons can we learn from this?
Always measure performance. I had a strong hunch that the code was inefficient due to the unnecessary inner loop, but it wasn’t until I had benchmarked both approaches that I knew exactly how much faster it could be.
Leverage your class library. Writing more code to do things the hard way might seem easier at first, but it’s unlikely to pay off in the long run. It’s highly likely that your platform’s class library already has more efficient and more reliable solutions than what you might write yourself.
Be wary of blindly copying code. When you copy code - whether it’s out of a book, from a blog like this one, or from StackOverflow - it’s essential to understand what the code is doing. Critical thinking is a vital part of every developer’s toolkit.
Addendum
A friend suggested also benchmarking ConcurrentDictionary<K,T>:
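A sketch of that variant, using AddOrUpdate so each word is either inserted with a count of 1 or incremented if already present (the actual benchmarked code may differ in detail):

```csharp
// Sketch: requires 'using System.Collections.Concurrent;'
public static IDictionary<string, int> UsingConcurrentDictionary(IEnumerable<string> words)
{
    var counts = new ConcurrentDictionary<string, int>();
    foreach (var word in words)
    {
        // Inserts 1 for a new word; otherwise applies the update lambda.
        // The lambda captures nothing, so the compiler can cache a single delegate.
        counts.AddOrUpdate(word, 1, (_, count) => count + 1);
    }

    return counts;
}
```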
The lambda expression will be allocated only once, so this approach is relatively lightweight on heap allocation. The performance is very nearly as good as a regular dictionary, showing that the cost of uncontested locks is negligible.
Method | Mean | Error | StdDev |
---|---|---|---|
Original | 1,806.73 ms | 35.070 ms | 51.406 ms |
UsingContainsKey | 15.10 ms | 0.250 ms | 0.234 ms |
UsingTryGetValue | 13.01 ms | 0.257 ms | 0.286 ms |
UsingLINQ | 25.51 ms | 0.493 ms | 0.782 ms |
UsingConcurrentDictionary | 21.18 ms | 0.386 ms | 0.361 ms |