Two Hard Things

I have been known to utter this quote from Phil Karlton from time to time and would like to spend some time talking about why I think it is as important as it is amusing.

There are only two hard things in Computer Science: cache invalidation and naming things.

\- Phil Karlton

In this article I will concentrate on the second part - naming things in computer science, or more explicitly naming things in code.

Naming things well, be they classes, functions, variables or even folders full of code, can make the difference between being able to understand what the hell is going on in the code you write or not. It may be the human who sits next to you, someone on the other side of the world or even future you who wants to quickly "get" the purpose of your code without having to have some magical assumed knowledge.

Assumed knowledge is the bane of all corporations, big or small. No one knows what the future will hold. Who will be maintaining this code in the future? A business cannot and should not stand for the code it owns (they paid you to write it, didn't they?) being incomprehensible to a suitably skilled engineer.

That sounds like a statement many would find hard to disagree with, so why is it that there remains a lot of code out there that seems almost deliberately obtuse? Partly it's because naming things is hard, which is also why naming things well is important, but mostly it's because code is written by humans. Humans are fallible, and applying good naming and structure to the code you write is a learned skill over and above the skill of actually making a program work efficiently. It is also an extremely valuable one.

I speak from experience, which I'm sure every software engineer has had at least once in their career, of going back to some code you wrote x years ago and being mystified about the true intent of the code. After being bitten once too many times by this I resolved to take extra care in how the code reads, as if each page of code were a page of a book (ok I might be stretching it a bit here but you get the point).

As an aside, I will never forget the time I managed to create a good chunk of a database schema that dealt with organisations with the spelling "orginisation" everywhere (at least I was consistent in my ignorance). It got past code review; no one noticed. It went into production. To my sinking horror and shame I noticed it 2 months later. By this time everyone was nervous about changing such a critical set of tables, foreign keys, indexes etc., all with my typo in them. To this day that schema exists and the guys that maintain it now have just "got used to" the innovative spelling.

Here is another lesson I have learned: it should be simple to refactor code to change a name, but in reality it rarely feels important enough to risk. To be fair, the risk lies mainly at the boundaries (think REST API contracts and database schemata), but renaming only the code in between can lead to an even worse situation where everything in the middle now has a different name to the boundaries. The pragmatic choice can come down to just accepting the original "names" a lot of the time. You may not think a name important as you are initially writing it, but you should always bear in mind that code evolves.
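To make the boundary problem concrete, here is a minimal sketch in Python of what fixing the name only in the middle might look like. The column names, the `Organisation` class and both helper functions are hypothetical, invented for illustration rather than taken from the real schema.

```python
from dataclasses import dataclass

# Hypothetical legacy column names, typo included. These are the names
# the database (the boundary) still uses and that nobody dares to change.
LEGACY_ID_COLUMN = "orginisation_id"
LEGACY_NAME_COLUMN = "orginisation_name"


@dataclass(frozen=True)
class Organisation:
    """The correctly spelled name used everywhere inside the code."""

    id: int
    name: str


def organisation_from_row(row: dict) -> Organisation:
    """Translate a row using the legacy column names into the internal type."""
    return Organisation(id=row[LEGACY_ID_COLUMN], name=row[LEGACY_NAME_COLUMN])


def organisation_to_row(organisation: Organisation) -> dict:
    """Translate back to the legacy column names before writing."""
    return {
        LEGACY_ID_COLUMN: organisation.id,
        LEGACY_NAME_COLUMN: organisation.name,
    }
```

The typo is now confined to two constants, but every reader has to learn that "organisation" in the code means "orginisation" in the database. That extra translation step is exactly the mismatch described above, which is why living with the original name is often the pragmatic choice.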

You cannot tell what will happen in the future. Most code you write now will in some way become embellished, added to or reused. The number of times you can write something once and treat it like a black box forever is really small.

If we say that the names you are using now will probably remain into an unknown future, then getting the names right initially is important. I can't really tell you what the names should be because I don't work in the same context as you. All I can really say is stop and think about what "things" are called (I am looking at you, Mr Foo Bar). As soon as you start attributing importance to the concept, it actually becomes second nature.
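As a small illustration of the Foo Bar problem, here is the same piece of logic written twice in Python. The invoicing context, the function names and the sample data are all assumptions made up for this sketch; your own context will suggest different names.

```python
from datetime import date


# Placeholder names: the code works, but the reader has to guess the intent.
def foo(bar, baz):
    return [b for b in bar if b[1] < baz]


# The same logic with names taken from a hypothetical invoicing context.
def overdue_invoices(invoices, today):
    """Return the (invoice_number, due_date) pairs whose due date has passed."""
    return [
        (invoice_number, due_date)
        for invoice_number, due_date in invoices
        if due_date < today
    ]


# Example usage with made-up data.
invoices = [("INV-1", date(2020, 10, 1)), ("INV-2", date(2020, 12, 1))]
print(overdue_invoices(invoices, today=date(2020, 11, 2)))
```

Both functions return the same result. Only the second one tells you what that result means without having to read the body.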

So, pay attention to what you call things. It matters!

Richard Andrews - 02/11/2020