I’ve been doing a lot of maintenance work lately, making minor changes to existing systems written in Visual Basic and C++. The experience has prompted me to think about the day to day decisions we make as developers.
Introduction
Software development is a complex job, requiring many decisions to be made.
Unfortunately, many of these decisions are made on the fly without any care or consideration, resulting in subsystems that are needlessly difficult to use and understand.
In this article, I want to focus on one area - designing a classes external interface - and show how a few minutes informed thought can dramatically improve the external interface of a simple class.
Motivation
The strategic levels of development get a lot of attention. Design patterns, architectural structures and related topics are the subject of many books, white-papers and conferences. These areas are deserving of detailed attention. Without careful strategic thought, most projects are doomed to eventual failure.
Yet, there just aren’t that many strategic decisions to make on any one project. Most of the decisions made on a day to day basis are simple ones. Activities such as method and variable naming, coding styles and refactoring of existing code are essential, but hardly strategic.
These day to day, or tactical, decisions are much less heralded. This is likely because they are inherently less sexy. After all, it’s hard to sell a book or organise a conference about naming conventions or coding styles, let alone run a class on how to name an index variable.
Class Design
The external interface of a class is critical. A well designed interface enables developers to get their jobs done quickly and reliably. Conversely, a poor interface is likely to slow development and be a source of both bugs and stress.
Careful attention to the external interface of a class can make your own life simpler, as well as that of your colleagues. This impact doesn’t just occur today, on the current project - but echoes wherever your class is (re)used in the future. It may be your own job you make easier in six months time when you come back to address a lurking bug.
Classes with well designed external interfaces are easier to understand, easier to use and easier to re-use. A overly complex external interface often deters new developers from making the investment of effort to understand and use the class. The (often minimal) extra design effort required to make a class simpler is almost always repaid, often within the same project or iteration and sometimes the same or next day.
Our example will be a simple collection class - a list of States within a state machine, each having a unique name. The same principles outlined here could be equally applied to other situations.
A fairly classic design for the class reads something like this:
Note that we’ve concentrated on only two of the methods of the class, and have ignored issues such as indexed access vs iterators and other issues.
A fairly simple interface, clear and easy to read. What problems could there be with just two methods from a simple class?
Problems
There are a few problems for us to explore, including two of them that relate to what the expected behaviour should be if a passed Name matches (or does not match) a state already in the Collection.
Problem 1
If IndexOf()
is passed a Name that matches an item in the collection, the index of that Item should be returned. However, what should happen if the name matches no existing State?
-
Throw an Exception. Implies the client must not enquire about a state that isn’t present.
-
Return a sentinel value (say, 0 or -1) to indicate not found. Implies client must be prepared to handle a missing State; also requires any developer reading the client code to know or recognise the sentinel value in use.
-
Add a new item with the passed name and return it’s index. Implies that what the client code wants, the client code gets, even at the risk of unexpected behaviour.
Problem 2
If Add()
is called with a name that does not match any state already present in the list, a new State should be created and a reference returned.
What should happen if there is already a state with that name in the list?
-
Raise an exception. Client code should never try to violate our invariant that names are unique.
-
Return a new item with the passed name. We expect client code to not violate our invariant by checking first. Instead of enforcing our invariant rigidly, rely on client code not to do anything that would cause problems. If the client does break our invariant, make no promises about our behaviour.
-
Return the existing Item. Maintains our invariant, but the semantics no longer match the natural english meaning of
Add()
. -
Return null to indicate that the Add failed. Implies that client code must be prepared to handle a null result from
Add()
; also requires that any developers are aware of the special case behaviour.
A First Step
An approach to address the questions of expected behaviour raised by (1) and (2) would be to make the policy part of the interface. This gives client code full control over the edge case behaviour of each method.
One way to do this:
One immediate consequence of these changes is that the external interface of the class is now substantially more complicated. Instead of having to understand just two simple functions, each with one parameter, a developer needs to learn about two enumerated types and two moderately complex functions, each with two parameters (one of which can dramatically alter the behaviour of the method).
Yeuch!
A Second Step
Some simple advice can guide us in improving the class interface a little:
If you have a method where a boolean (or enumerated) parameter is used to control the behaviour of the method, it’s likely that you actually have multiple methods that should be separated.
I thought that this advice came from Steve McConnell’s Code Complete, but when reading a copy recently, I couldn’t find a reference. Yet, I remember clearly reading this in some book or another years ago. A friend of mine said that it sounded like the kind of thing that Martin Fowler wrote in Refactoring, but I don’t find my copy of that either. Oh well, it’s still good advice.
Applying this to our current situation gives us this:
Is this any better? We no longer have any enumerations to worry about, nor any methods with two parameters - just seven (7) functions that each take one parameter. Since each method can be understood in isolation (without having to understand all of the other methods as well), this interface is probably better than what we had.
Just.
Fortunately, it puts us in a better place to improve things substantially.
A Critical Eye
With some careful inspection and thinking, there are a number of simple, yet useful, observations we can make about this interface.
Duplicated Functionality
Note that IndexOfWithCreation() and AddOrReturn() actually do the same thing, differing only in their return value: They both allow someone to obtain an item with a given name regardless of whether the item already existed. Further, the return value from IndexOfWithCreation() only has use to obtain the item through an indexed accessor.
We can replace both of them with a simple method:
Why “Request”? It stems from the idea that we don’t care whether the state preexisted or was created on the fly - we are simply requesting a state with a specified name, and the function provides that for us.
Similarly, AddNew()
and AddPossibleNew()
are methods that do the same thing, this time differing only in the way that they report failure. Raising an exception is popularly considered to be the best way to report failure, especially if that failure is unusual or avoidable.
We can replace both of them with a single function:
Violation of Invariant
Allowing external code to unknowingly violate our class invariant is a good way to introduce extremely subtle bugs. It’s also a good way to blow up in unexpectedly severe ways. Either way, a violated invariant is bad news.
Consequently, we should discard AddDuplicate()
without further consideration.
Sentinel Values
Sentinel values are often regarded as a code smell. In most cases, the sentinel can be eliminated by returning more than one value from the function.
For IndexOfPossible()
, the two return values are the index of the item, and an indication of wether the item was found. One straightforward way to separate these two pieces of indication is:
The method AddPossibleNew()
also returned a sentinel value. Even though we’ve discarded it as unnecessary, we can still consider how it could be handled. The two return values are a new state, if any, and an indication about whether Add()
failed because the name was already in use.
In a similar vein to the above reformulation we could suggest:
A further insight can gained by thinking about the possible uses of the routine.
If the function result is true, the calling code has a state with which to operate.
If the return result is false, the calling code may take a different code branch (in which case Contains()
may be a better choice), or create the item (in which case Request()
would be better).
In either case, we have no need for the method, confirming our earlier decision to remove it.
A Final Class
Bringing these different observations together, our class ends up looking like this (I took the liberty of renaming IndexOfExisting() as well):
How does this compare to our starting point? Four functions instead of two, only one of which requires multiple parameters. Each method is simple, independent, and easily understood.
The reliance on indexing into the collection has been reduced as many uses of the IndexOf()
function are subsumed by Contains(). In fact, with an appropriate iterator, IndexOf() may become further redundant.
Lessons
With just a little thought and effort, a simple and extremely typical class interface was refactored into something better.
The challenge is to apply this thinking to the classes we work on every day.
When starting a new class from scratch, think about the way that clients are going to use the class and try to make their life easier from the start. I have found that Test First Development can be a real help here, as it forces you to consider the external interface before writing any of the internals. Of course, YMMV.
Existing classes shouldn’t be ignored. When you add a new method to the external interface of a class, consider the impact that has on methods already present. Does the new method make an existing method less useful or redundant? How should existing clients make use of the new method?
Further Reading
Comparatively few resources give good advice on the tactics of good programming. Four of the best are:
- Refactoring by Martin Fowler.
- Code Complete by Steve McConnell.
- Smalltalk Best Practice Patterns by Kent Beck.
- The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and David Thomas.
I believe that these four books should be on every programmers bookshelf. Read them.
Comments
blog comments powered by Disqus