As anyone who has seen my presentation Becoming a Better Developer will know, the primitive obsession anti-pattern describes a practice that encourages the proliferation of bugs. The best way to counter this problem is to introduce semantic types.
The FileInfo class captures information about a file. If the file exists, we can access information about that file, such as its location and size. But we're not required to use FileInfo only when the file already exists - we can construct a new FileInfo to represent a file we are going to create, or even a file we're looking to find.
Similarly, the .NET framework's date and time type captures information about a particular instant in time. The designers of the .NET framework didn't take the approach of Excel, and just use a double to represent a timestamp - perhaps they realised the flaws in that approach.
Why use Semantic Types?
The key idea of a semantic type is to isolate one kind of value, explicitly declaring that it is a distinct thing in its own right rather than just another string or number.
Why would you do this?
Consider this example from my presentation:
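The sketch below is a stand-in with assumed names (`ReportWriter` and `SaveTo` are hypothetical, chosen only for illustration), showing the kind of method in question:

```csharp
using System.IO;

// Hypothetical example: ReportWriter and SaveTo are assumed names.
public class ReportWriter
{
    private readonly string _content;

    public ReportWriter(string content)
    {
        _content = content;
    }

    // The string parameter says nothing about what kind of value is expected.
    public void SaveTo(string destination)
    {
        File.WriteAllText(destination, _content);
    }
}
```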
By using a parameter of type string, this method becomes somewhat ill-defined. The obvious problem is that you can pass any kind of string - a username, phone number, email address or ISBN. More subtly, what kind of destination is expected? Sure, it might support a file path - but what happens if you supply something else?
Being explicit about the required type makes it clear that the method is expecting a file path:
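A minimal sketch of the idea, again using the hypothetical `ReportWriter` and `SaveTo` names (assumptions for illustration only), with `FileInfo` as the parameter type:

```csharp
using System.IO;

// Hypothetical example: ReportWriter and SaveTo are assumed names.
public class ReportWriter
{
    private readonly string _content;

    public ReportWriter(string content)
    {
        _content = content;
    }

    // The signature now documents itself: the destination is a file.
    public void SaveTo(FileInfo destination)
    {
        File.WriteAllText(destination.FullName, _content);
    }
}
```

A caller holding only a username or an ISBN now has to write `new FileInfo(...)` around it, which makes the mistake far more visible.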
It also allows for an overload to add explicit URL support:
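One way such an overload might look, as a sketch only: the names are hypothetical, and the `Uri` overload here handles just `file:` URLs, leaving any real upload logic out.

```csharp
using System;
using System.IO;

// Hypothetical example: ReportWriter and SaveTo are assumed names.
public class ReportWriter
{
    private readonly string _content;

    public ReportWriter(string content)
    {
        _content = content;
    }

    // File destinations are handled directly.
    public void SaveTo(FileInfo destination)
    {
        File.WriteAllText(destination.FullName, _content);
    }

    // URL destinations get their own overload, chosen by the compiler.
    public void SaveTo(Uri destination)
    {
        if (destination.IsFile)
        {
            SaveTo(new FileInfo(destination.LocalPath));
            return;
        }

        // A real implementation would upload the content here.
        throw new NotSupportedException("Only file URLs are handled in this sketch.");
    }
}
```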
Defining your own
Writing your own semantic types has never been especially difficult, but the new language features in C# 6 make it particularly easy.
Consider a software system where every organisation has a unique identifying code, known as the Organisation Code, that is used throughout the system. So, the “Toyota Motor Corporation” has the code “TOYOTA”.
A very simple class to represent this code might look like this:
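A minimal sketch of such a class follows; the property name `Code` is an assumption made for illustration.

```csharp
public sealed class OrganisationCode
{
    // Getter-only auto property (C# 6): assignable only in the constructor.
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }
}
```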
Note that this is an immutable class using the new syntax for a readonly auto property.
While this is a good start, it’s not sufficient. To interact correctly with other classes in the
.NET framework such as
our semantic class needs to implement
Similarly, for it to work with
it needs to implement
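As a sketch of what this can involve, assuming the classes in question include hash-based collections such as `HashSet<T>` and `Dictionary<TKey, TValue>`: those rely on `Equals()` and `GetHashCode()`, and will use `IEquatable<T>` when it is available. The choice of ordinal, case-sensitive comparison is also an assumption.

```csharp
using System;

public sealed class OrganisationCode : IEquatable<OrganisationCode>
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }

    // Strongly typed equality: two codes are equal when their text matches exactly.
    public bool Equals(OrganisationCode other)
        => !ReferenceEquals(other, null)
           && string.Equals(Code, other.Code, StringComparison.Ordinal);

    // Untyped equality delegates to the typed version.
    public override bool Equals(object obj) => Equals(obj as OrganisationCode);

    // Must agree with Equals: equal codes always produce equal hash codes.
    public override int GetHashCode()
        => Code == null ? 0 : StringComparer.Ordinal.GetHashCode(Code);
}
```

With value-based equality in place, `new OrganisationCode("TOYOTA")` can be found in a `HashSet<OrganisationCode>` that was populated with a different instance wrapping the same text.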
What else do we need?
Protecting our system from invalid data is important. Defense in depth is a worthwhile approach, one that can make our system resilient in the face of attack, even when tainted data makes it past our surface layers.
Let’s add a method we can use to check the validity of a code, and then use that in our constructor to ensure that we never wrap an invalid code.
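A sketch of how this might look. The validation rule itself (one to ten upper case ASCII letters or digits) and the method name `IsValid` are assumptions for illustration; a real system would substitute its own rule.

```csharp
using System;
using System.Linq;

public sealed class OrganisationCode
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        // Refuse to wrap anything that fails validation.
        if (!IsValid(code))
        {
            throw new ArgumentException("Not a valid organisation code.", nameof(code));
        }

        Code = code;
    }

    // Assumed rule: 1 to 10 characters, upper case ASCII letters or digits only.
    public static bool IsValid(string code)
        => !string.IsNullOrEmpty(code)
           && code.Length <= 10
           && code.All(c => (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'));
}
```

Because the check lives in the constructor, any `OrganisationCode` instance that exists is known to hold a valid code.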
Here we have a well-functioning .NET class, one that meets our requirements. We can now use the class as a parameter and rely on the compiler to ensure that we don't pass the wrong thing:
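A sketch of the idea, with a pared-down `OrganisationCode` so the example stands alone. `OrganisationDirectory`, `FindName` and `Example` are hypothetical names, assumed for illustration.

```csharp
using System.Collections.Generic;

// Minimal stand-in for the semantic type, so the sketch is self-contained.
public sealed class OrganisationCode
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }
}

// Hypothetical consumer of the semantic type.
public class OrganisationDirectory
{
    private readonly Dictionary<string, string> _namesByCode =
        new Dictionary<string, string>
        {
            ["TOYOTA"] = "Toyota Motor Corporation"
        };

    // Only an OrganisationCode is accepted; a bare string will not compile.
    public string FindName(OrganisationCode code)
    {
        string name;
        return _namesByCode.TryGetValue(code.Code, out name) ? name : null;
    }
}

public static class Example
{
    public static string Run()
    {
        var directory = new OrganisationDirectory();

        // directory.FindName("TOYOTA");  // rejected by the compiler: a string is not an OrganisationCode

        return directory.FindName(new OrganisationCode("TOYOTA"));
    }
}
```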
However, that line including new looks a bit cumbersome. Fortunately, we can do better by introducing support for a couple of typecasts.
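A sketch of what the two conversions could be, assuming one in each direction between `string` and `OrganisationCode`. The class is pared down here so the example stands alone.

```csharp
public sealed class OrganisationCode
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }

    // string -> OrganisationCode, only when the caller asks for it with a cast.
    public static explicit operator OrganisationCode(string code)
        => new OrganisationCode(code);

    // OrganisationCode -> string, again only with a cast; null stays null.
    public static explicit operator string(OrganisationCode code)
        => code?.Code;
}
```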
With these in place, converting a string into an
OrganisationCode looks like this:
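A sketch of the cast in use. The conversion operators are repeated in a pared-down class so the example stands alone, and `ConversionExample` is a hypothetical name.

```csharp
public sealed class OrganisationCode
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }

    public static explicit operator OrganisationCode(string code)
        => new OrganisationCode(code);

    public static explicit operator string(OrganisationCode code)
        => code?.Code;
}

public static class ConversionExample
{
    public static OrganisationCode FromString()
    {
        // The cast replaces the more verbose new OrganisationCode("TOYOTA").
        var code = (OrganisationCode)"TOYOTA";
        return code;
    }

    // The reverse direction needs a cast as well.
    public static string BackToString(OrganisationCode code) => (string)code;
}
```

The cast is still visible at the call site, so the conversion remains a deliberate act rather than something the compiler does silently.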
Questions and Answers
Isn’t this a lot of work?
While it's a non-trivial amount of code to write, I'd suggest that the clarity semantic classes introduce through documentation, coupled with the bugs eliminated by the enforcement of strong typing, works out to a net positive.
What about implicit conversions?
If you modify the explicit conversions and make them implicit, you get rid of the typecasts - and you also defeat the enforcement of strong types, reopening the door to easy mistakes where the wrong value is passed as an argument.
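To make the hazard concrete, here is a sketch with an implicit conversion. `Lookup`, `Describe` and `Mistake` are hypothetical names, and the class is pared down for illustration.

```csharp
public sealed class OrganisationCode
{
    public string Code { get; }

    public OrganisationCode(string code)
    {
        Code = code;
    }

    // Implicit: the compiler will apply this conversion without being asked.
    public static implicit operator OrganisationCode(string code)
        => new OrganisationCode(code);
}

public static class Lookup
{
    public static string Describe(OrganisationCode code) => "Organisation " + code.Code;

    public static string Mistake()
    {
        string emailAddress = "sales@example.com";

        // Compiles without complaint: the email address is silently wrapped.
        return Describe(emailAddress);
    }
}
```

With the conversion declared `explicit` instead, the call in `Mistake` would fail to compile, which is exactly the protection a semantic type is meant to provide.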
Wouldn’t a struct perform better?
The case can be made that using a struct for small semantic types instead of a class would be a good idea because it reduces the number of heap allocations, and therefore the amount of work required of the garbage collector. In most cases, I suspect the actual performance impact would be nigh on immeasurable, with any differences swamped by other effects. The key, as always with performance issues, is to rely on objective performance measures, not subjective handwavium. If you think a struct would perform better, measure the results and find out for sure.
Updated 28/9 - fixed a bug in the implementation of GetHashCode()