“How do I write good code?” was a question put to me by one of my direct reports a little over a year ago, in one of their one-to-ones. I was able to give a brief answer, but the question has been ticking over in my mind. A few days ago I launched this blog, and so I thought it would be a good opportunity to put some of those digested thoughts into written form.
There isn’t a simple answer to this question, but it is still answerable. To get closer to an answer, we need to take into account a number of factors. We can do this by first seeking clarification: what do we mean by good? We can assess code on several fronts: ease of maintenance, speed, bugs and error-handling, and comprehensiveness. Let’s look at these in turn.
Ease of maintenance
Sometimes code is written once, to be thrown away. More often - the vast majority of the time, even - code is preserved to be run again repeatedly.
As business needs change, such code will need to be modified. Consequently, it’s usually the case that more time is spent reading code than actively writing or changing it. It is therefore better to have code that is easy to understand than code which uses quirky or obscure behaviour. Such tricks are sometimes necessary, though: for example, they might be faster (see the section on Speed below).
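As a small illustration of that trade-off, here is a sketch in Python (chosen purely for the example; the function names are made up). Both functions return the same result, but the first leans on the quirk that a boolean can be used as a tuple index, while the second spells the decision out.

```python
def plural_obscure(count):
    # Works because True == 1 and False == 0, so the comparison indexes the tuple.
    return ("items", "item")[count == 1]


def plural_clear(count):
    # Same behaviour, but the intent is visible at a glance.
    if count == 1:
        return "item"
    return "items"
```

Someone meeting the first version for the first time has to stop and work out why it works; the second can be read straight through.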
Speed
Code doesn’t always have to run quickly. Is this a one-off process or will it be repeated? Will there be a human waiting for the code to complete, or will the code run unattended? Imagine a scenario in which a monthly mailshot is sent out and the script that sends it takes 60 minutes to run. If it needs to be launched manually, and runs locally on someone’s machine, then that person can’t shut their machine down whilst the script is running. But if it completes in 60 seconds then this is less of an imposition. Better still would be for such a regular task to run on dedicated infrastructure, though this is perhaps a separate factor from that of speed.
Returning to the issue of ‘dead time’: if a script takes 5 minutes then that’s a useful amount of time: one can make a coffee, go to the loo, etc. But 30 minutes is an awkward amount of time: it’s too short to work on anything else, because that other work would have to be interrupted in order to resume the task that necessitated the script in the first place (the mailshot, in this example).
Let’s take another example. Imagine a website - perhaps it’s an online retail website. Numerous studies indicate that visitors will abandon a website if it is slow to respond. So there’s a financial and reputational incentive for the organisation to have performant software.
Bugs
Except perhaps in subversive contexts, it is rare to intentionally add bugs to code. It’s generally agreed that, once identified, bugs should ideally be fixed; if a bug is not immediately fixable, it should be recorded in a tracker, somewhere central and ideally linked to the code repository, so that future users of the code can find the list of defects easily.
Whether a bug is high or low priority depends on its severity and/or impact, and how often it comes up.
A bug with minimal fallout that doesn’t happen very often (e.g. “This script does not work on any leap day”) might easily be postponed if it is non-trivial to fix. But a bug such as “The payment processor will abort the entire batch if any of the recipients have an apostrophe (') or hyphen (-) in their name”¹ is much more critical and would likely need to be prioritised.
Error handling and reporting
This is closely related to the Bugs section but it’s worthwhile discussing separately.
Let’s take the previous example of hyphenated names. It could be that the whole batch fails with a cryptic error “Batch failed”. Or it could be that the whole batch fails with a more useful error “Record 437 failed due to unrecognised character in name Malcolm Wynn-Jones”². Hopefully it’s clear why this kind of error is preferable.
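To make the difference concrete, here is a minimal sketch in Python of a check that produces the more useful kind of message. The function name and the rule it enforces (letters and spaces only) are assumptions invented for the example, not a description of any real payment processor.

```python
def check_name(record_number, name):
    """Raise a descriptive error if the name contains an unsupported character."""
    for character in name:
        # Hypothetical rule: only letters and spaces are accepted.
        if not (character.isalpha() or character == " "):
            raise ValueError(
                f"Record {record_number} failed due to unrecognised "
                f"character {character!r} in name {name}"
            )


try:
    check_name(437, "Malcolm Wynn-Jones")
except ValueError as error:
    message = str(error)
    print(message)
```

The message names the record, the offending character and the full name, so whoever sees it knows where to look without opening the script.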
Another kind of failure mode would be for the other records in the batch to succeed and for only the problematic record to go into a queue for human attention and manual processing. This is, perhaps, the preferable result, so long as the fact that there are problematic records is somehow made known, rather than them falling silently into the void.
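A rough sketch of that behaviour in Python might look like the following. Everything here is hypothetical: `fake_pay` stands in for whatever really makes the payment, and the "queue" is just a list.

```python
def process_batch(names, pay):
    """Pay every name that can be paid; collect the rest for a human to review."""
    paid = []
    needs_attention = []
    for record_number, name in enumerate(names, start=1):
        try:
            pay(name)
        except ValueError as error:
            # Keep the record number and reason so nothing fails silently.
            needs_attention.append((record_number, name, str(error)))
        else:
            paid.append(record_number)
    return paid, needs_attention


def fake_pay(name):
    # Stand-in for a real payment call; rejects the characters from the example.
    if "'" in name or "-" in name:
        raise ValueError(f"unrecognised character in name {name}")


paid, needs_attention = process_batch(
    ["Ann Lee", "Malcolm Wynn-Jones", "Sam Okoro"], fake_pay
)
print(f"{len(paid)} paid, {len(needs_attention)} queued for manual processing")
for record_number, name, reason in needs_attention:
    print(f"Record {record_number}: {reason}")
```

The summary at the end is the important part: the problematic record is set aside, but its existence is announced rather than hidden.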
A truly awful outcome, which would leave a lot of manual cleanup, would be if a hyphenated name aborted the batch processor after the records up to the problematic one had been correctly processed, and the error reported was the less useful “Batch failed”. Why? Because the natural response (from someone who doesn’t know this script’s cursed behaviour) would be to run the entire batch again, because they have no indication that some of the records did, in fact, succeed. This would lead to a load of double payments. Problems such as this can be mitigated with staff training, sure, but people will always leave and join, so it’s better to have error messages be as useful as possible. Plus, in the event of a rare error, the specifics of a given script’s foibles are easily forgotten, so having the error message be descriptive is a good thing.
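If a script really must stop at the first bad record, the error can at least say what had already happened. The Python below is only a sketch under that assumption; `BatchAborted`, `run_batch` and `fake_pay` are names invented for the example.

```python
class BatchAborted(Exception):
    """Raised when a batch stops part-way; records what had already succeeded."""

    def __init__(self, failed_record, completed, reason):
        self.failed_record = failed_record
        self.completed = list(completed)
        super().__init__(
            f"Batch aborted at record {failed_record}: {reason}. "
            f"Records already processed: {self.completed}. "
            "Do not re-run the whole batch."
        )


def run_batch(names, pay):
    completed = []
    for record_number, name in enumerate(names, start=1):
        try:
            pay(name)
        except ValueError as error:
            # Stop here, but report which records were already paid.
            raise BatchAborted(record_number, completed, str(error)) from error
        completed.append(record_number)
    return completed


def fake_pay(name):
    # Stand-in for a real payment call; rejects the characters from the example.
    if "'" in name or "-" in name:
        raise ValueError(f"unrecognised character in name {name}")


try:
    run_batch(["Ann Lee", "Sam Okoro", "Malcolm Wynn-Jones"], fake_pay)
except BatchAborted as aborted:
    report = str(aborted)
    already_done = aborted.completed
    print(report)
```

Even someone who has never seen the script before can tell from this message that the first two records went through and that re-running everything would pay them twice.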
Comprehensiveness
Imagine a process with 5 steps. Does this script do all 5 steps, or does it do the first 3 and still leave the final 2 to be done by hand? It might be the case that the script is new and hasn’t been finished.
But the downside of a script not being complete is that it’s much more likely for a step to be omitted or carried out wrongly. This risk diminishes (well, bugs notwithstanding) if the script covers all steps.
In the absence of a script automating all of the required steps, a useful safety net is to have a document with a well-defined procedure, including a checklist. Designing a checklist is itself an art; NASA have even conducted research into the human factors of what makes a good (or in their words “normal”) checklist.
Closing notes
A thousand or so words is more than I was expecting to write! Writing perfect code is almost impossible to achieve. Writing good code? Sure, that’s doable. But first let’s decide on what we mean by “Good”.