Readability In Programming

So, my fellow programmers, let us have a look at the most important aspect of programming. Namely, how to get a higher salary.

No, sorry, jokes aside :)

Let us instead have a look at readability, one of the single most important aspects of building a large application.

First, let me share some of my own philosophies on the subject. Bear with me on this. Imagine reading a good novel. The author ensures that the language is fluid and consistent. The sections are separated according to their content. The author does not just jump to a new scene in the middle of a sentence. You do not have to backtrack through the story to understand what is happening, and the author does not use excessive words to convey the content. Less is more.

Many of the same principles can be transferred to programming. In many ways, reading a good program should be like reading a good novel. Your code should be easy to read and easy to understand. You shouldn't need to read other parts of the code to understand a specific section. Things should be well-separated, and related content should be kept together. And, as in fiction writing, less is more. I have always been amazed by how simple and clean even very complex code can look if well written.

Readability is important because if, for example, you write a full video editor in nine months, you will have a relatively high code output, and there is no way you can remember off the top of your head what all this code does. You will have to use functionality you wrote months ago, and you will have to add expansions and changes to your classes. An alarming number of bugs and mistakes are made simply because the programmer does not fully understand how the code works. This obviously also has a lot to do with design, but that will have to be for another post.

Readability can be divided into several areas. The most important of these are:

Naming
Formatting
Structure
Comments

Apart from these well-known ones, I will add:

Keeping it short and consistent
Keeping information to the left
Extensive use of wrappers
Organised headers (not applicable to languages like C# or Java)

Covering all of these areas in depth would be too extensive for this post, so I will only go over them briefly here. I might cover them in detail in later posts.

Naming, formatting, structure and comments

Let us start with a classic example. How many can tell me what this code does?

// inc/dec vel
v += (a-((C*p*v*v*A)/2))*dt;

None? Really? I can tell you that ChatGPT got it right the first time.

It is of course a calculation of velocity. Doh.

float dragForce = DragCoefficient * DragArea * AIR_DENSITY / 2;
dragForce = dragForce * (_velocity * _velocity);

float resultingAcceleration = acceleration - dragForce;
_velocity += (resultingAcceleration * deltaTime);

Apart from choosing more meaningful names and applying consistent casing conventions to them, I added spacing to make the code more readable, and a blank line to separate the force calculation from the acceleration calculation. From the naming, you can also see that DragCoefficient and DragArea are class-wide data members or properties (first letter is capital). AIR_DENSITY is a constant (all upper case), acceleration and deltaTime are most likely function parameters (first letter is lower case), and _velocity is the internal data representation of the property Velocity (starts with an underscore). The devil is in the detail.
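To see those conventions in context, here is a minimal sketch of how the snippet might sit inside a class. The class name RSVehicle, the method name Update, and every numeric value are my own assumptions for illustration, and as in the snippet above the mass is simply folded into the drag term.

```csharp
using System;

public class RSVehicle
{
    private const float AIR_DENSITY = 1.225f;            // kg/m^3, air at sea level

    private float _velocity;

    public float DragCoefficient { get; set; } = 0.3f;   // assumed value
    public float DragArea { get; set; } = 2.2f;          // assumed value, m^2
    public float Velocity => _velocity;

    // Hypothetical method: advances the velocity by one time step.
    public void Update(float acceleration, float deltaTime)
    {
        float dragForce = DragCoefficient * DragArea * AIR_DENSITY / 2;
        dragForce = dragForce * (_velocity * _velocity);

        float resultingAcceleration = acceleration - dragForce;
        _velocity += (resultingAcceleration * deltaTime);
    }
}

public static class Program
{
    public static void Main()
    {
        var vehicle = new RSVehicle();

        // Starting from rest there is no drag yet, so the first step
        // adds exactly acceleration * deltaTime to the velocity.
        vehicle.Update(acceleration: 2.0f, deltaTime: 0.5f);
        Console.WriteLine(vehicle.Velocity);
    }
}
```

Every name in Update tells you where the value lives without you having to scroll up to a declaration.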

Back in the day, some assembly geeks would say, “Yeah, but your code takes up five lines and declares local variables. The first code is much more efficient”. For such a statement we have a technical term, and it is called “Los crappo del Torro”. Any modern optimising compiler will typically eliminate the extra locals, so there should be no measurable performance loss. Furthermore, splitting the code up makes it much easier to debug, as you can step through the calculation and check that everything is going according to plan. So never ever start to “pre-optimise” your code. The only three things you should keep in mind are readability, readability, and readability.

Oh yes, and about the comment.

The comment was a holdover from the old code, where velocity was simply calculated by adding or subtracting a value. The drag calculations were added later, but the comment was never changed because there were - well - no compiler errors. This pretty much also sums up why a lot of comments are useless garbage. A comment should never explain what a line of code does simply because the code is unreadable. In the best case, it is bad programming. In the worst case, it is directly misleading. Only ever add comments if they add valuable information that is needed to understand the code and that will not easily become outdated. Apart from that, good code is self-explanatory.
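As a rough illustration of a comment that does earn its place, consider the sketch below. RSPhysics, ClampDeltaTime and MAX_DELTA_TIME are hypothetical names I made up for the example. The comment explains why the code exists, which the code itself cannot say, and it stays true however the implementation changes.

```csharp
using System;

public static class RSPhysics
{
    private const float MAX_DELTA_TIME = 0.1f;   // hypothetical limit, in seconds

    public static float ClampDeltaTime(float deltaTime)
    {
        // A paused debugger or a minimised window can produce one huge
        // time step, which would make the simulation explode.
        return Math.Min(deltaTime, MAX_DELTA_TIME);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(RSPhysics.ClampDeltaTime(0.016f));   // a normal frame passes through unchanged
        Console.WriteLine(RSPhysics.ClampDeltaTime(5.0f));     // a huge step is limited to MAX_DELTA_TIME
    }
}
```

Note that there is no comment saying "clamp delta time". The method name already does that job.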

Keeping it short and consistent

A good rule of thumb is to keep names as short as possible. A good descriptive name is the shortest and most distinct name that accurately describes its function in the given context. No more, no less. This does NOT mean that it is okay to use 'rdy' for 'ready' or similar. Avoid abbreviations and obscure acronyms; names should be easily readable. Use acronyms only if the reader would be expected to know them, like JSON or UTC. Never make up your own, like TURD for "The Units Relative Distance", even if some programmers find that kind of thing absolutely hilarious.

Actually, there is a single exception to the rule about made-up abbreviations. I always put a two-letter uppercase prefix on all classes, as it makes it much easier to distinguish between classes and data members.

Engine Engine; // Some weird C# declaration
Engine.Load(); // Static or what?

RSEngine Engine; // A nice RockStar declaration
Engine.Start(); // Starts an engine instance
RSEngine.Create(); // Static creation of a new engine

As for method names, the same rules pretty much apply. Unlike class names, which are often nouns or noun phrases, method names are almost always verbs or verb phrases. Most importantly, method names should be consistent across classes. For example, always use 'Create' when returning a newly created instance. Consistency is key.

string fileName; // a filename without a path
string path; // a path without a filename
string filePath; // a complete path and filename

Again, use your own preferred and meaningful names. You would be surprised, though, how often even seasoned programmers mix up basic stuff like this.
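As a small sketch of keeping those three names apart in practice, the example below builds a filePath from a path and a fileName and then splits it again using the standard System.IO.Path helpers. The folder and file names are made up for illustration.

```csharp
using System;
using System.IO;

public static class Program
{
    public static void Main()
    {
        string path = Path.Combine("projects", "editor");   // a path without a filename
        string fileName = "clip.mp4";                        // a filename without a path
        string filePath = Path.Combine(path, fileName);      // a complete path and filename

        // Splitting filePath gives back exactly the two parts it was built from.
        Console.WriteLine(Path.GetFileName(filePath) == fileName);    // True
        Console.WriteLine(Path.GetDirectoryName(filePath) == path);   // True
    }
}
```

When the three names always mean the same three things, a call like Load(path) immediately looks wrong if the method expects a filePath.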

One thing to add here. If your names start to get long (and they probably will), it might be an indication that you need to refactor your code. Let us say you are building an application to keep track of employees.

public List<RSEmployee> EmployeeList;

EmployeeList.Add(employee);
EmployeeList.Remove(employee);

The above lines are perfectly fine, and hold a list of the employees. But what if you want to maintain more lists? Following the approach from above, the solution would be:

public List<RSEmployee> HiredEmployeeList;
public List<RSEmployee> FiredEmployeeList;
public List<RSEmployee> InternEmployeeList;

Now it gets a bit more messy. The names are not that great (more on that in the next section), and they are starting to get a bit long. Instead, a good solution would be to refactor the code and introduce an employee manager. That would simplify the naming again, plus you can add a lot of highly readable functionality to that class.

public class RSEmployeeManager
{
    private List<RSEmployee> _hiredList;
    private List<RSEmployee> _firedList;
    private List<RSEmployee> _internList;

    // Methods (bodies omitted)
    public void Hire(RSEmployee employee);
    public void PromoteIntern(RSEmployee intern);
    public void Fire(RSEmployee employee);
}

If your names start to get long, it is often a hint that something could be improved.
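For the curious, the manager could be fleshed out along these lines. This is only a sketch under my own assumptions: the RSEmployee class, the AddIntern method and the three Count properties are hypothetical additions, and the rules for what Fire and PromoteIntern do to the lists are guesses at sensible behaviour, not a finished design.

```csharp
using System;
using System.Collections.Generic;

public class RSEmployee
{
    public string Name { get; }

    public RSEmployee(string name) { Name = name; }
}

public class RSEmployeeManager
{
    private readonly List<RSEmployee> _hiredList = new List<RSEmployee>();
    private readonly List<RSEmployee> _firedList = new List<RSEmployee>();
    private readonly List<RSEmployee> _internList = new List<RSEmployee>();

    public int HiredCount => _hiredList.Count;
    public int FiredCount => _firedList.Count;
    public int InternCount => _internList.Count;

    public void Hire(RSEmployee employee)
    {
        _hiredList.Add(employee);
    }

    // Hypothetical method, added so the sketch has a way to register interns.
    public void AddIntern(RSEmployee intern)
    {
        _internList.Add(intern);
    }

    public void PromoteIntern(RSEmployee intern)
    {
        // Assumption: a promoted intern moves to the hired list.
        if (_internList.Remove(intern))
            _hiredList.Add(intern);
    }

    public void Fire(RSEmployee employee)
    {
        // Assumption: only someone on the hired list can be fired.
        if (_hiredList.Remove(employee))
            _firedList.Add(employee);
    }
}

public static class Program
{
    public static void Main()
    {
        var manager = new RSEmployeeManager();
        var ada = new RSEmployee("Ada");
        var linus = new RSEmployee("Linus");

        manager.Hire(ada);
        manager.AddIntern(linus);
        manager.PromoteIntern(linus);
        manager.Fire(ada);

        Console.WriteLine(manager.HiredCount);   // 1
        Console.WriteLine(manager.FiredCount);   // 1
        Console.WriteLine(manager.InternCount);  // 0
    }
}
```

The calling code now reads almost like plain English, and none of the names needed the word Employee or List in them.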

Keeping information to the left

Now, look at this beauty.

TzSpecificLocalTimeToSystemTime(&tzInfo, &stLocalTime, &stUTC);

The above line is some old Windows XP code. I don't know if they removed it in Windows 11. If they did not, it explains a lot. Nahh. I'm kidding. Of course they removed it … hopefully. Apart from the long name, the horrible Hungarian notation, and the far too generic parameter names (Info), the function name has another serious flaw: it is hard to understand what it does. The key to the functionality is buried around two thirds into the name, and consists of the two letters “To”.

When we humans read a word, we start from the left, and as soon as our brain settles on what the word is, we skip the rest. We really do. The same thing goes for an entire line of code. This means that the further to the right the important information is placed, the greater the likelihood that we miss it. In a more modern language, a readable version would be:

utc = slt.ConvertToUTC(timeZone);
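To make that one-liner concrete, here is a minimal sketch of what slt could be in C#. RSLocalTime is a hypothetical type of my own, wrapping the standard TimeZoneInfo.ConvertTimeToUtc call, and the fixed UTC+2 zone is created by hand purely so the example does not depend on the time zones installed on your machine.

```csharp
using System;

public readonly struct RSLocalTime
{
    private readonly DateTime _value;

    public RSLocalTime(int year, int month, int day, int hour, int minute)
    {
        // Kind is Unspecified, so the value is interpreted in whatever
        // time zone is passed to ConvertToUTC.
        _value = new DateTime(year, month, day, hour, minute, 0, DateTimeKind.Unspecified);
    }

    public DateTime ConvertToUTC(TimeZoneInfo timeZone)
    {
        return TimeZoneInfo.ConvertTimeToUtc(_value, timeZone);
    }
}

public static class Program
{
    public static void Main()
    {
        // A made-up zone, two hours ahead of UTC, with no daylight saving rules.
        TimeZoneInfo timeZone = TimeZoneInfo.CreateCustomTimeZone(
            "UTC+2", TimeSpan.FromHours(2), "UTC+2", "UTC+2");

        var slt = new RSLocalTime(2024, 1, 15, 14, 0);
        DateTime utc = slt.ConvertToUTC(timeZone);

        Console.WriteLine(utc.Hour);   // 12, two hours earlier than the local 14:00
    }
}
```

The important word, Convert, is now the first thing you read after the dot.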

Now, I am aware that back in the day when Mr. Gates did all the programming himself, they did not have fancy classes, type checking and stuff, so to some extent they were excused. Still, there was a lot they could have done.

SLT_ConvertToUTC(&slt, &utc, &timeZone);

As plain vanilla C doesn't have namespaces etc., a common way to work around this was to prefix names with "namespace_". In the above case, all functions operating on a specific local time could start with "SLT_".

If we look at a more modern example like the employee lists, the problem is that Hired and Fired are very close to each other … in so many ways. This means that the longer the name gets, and the further to the right the difference is, the harder it gets to read. Imagine if the names had been:

public List<RSEmployee> EmployeeListHired;
public List<RSEmployee> EmployeeListFired;
public List<RSEmployee> EmployeeListIntern;

Now it is even harder to tell Hired and Fired apart, and especially with modern auto completion, it is very easy to select the wrong one. Any reader still awake will notice, though, that dot notation largely fixes the problem (same with the underscore prefix in vanilla C).

EmployeeManager.HiredList;
EmployeeManager.FiredList;

That makes it much easier to read, to a large extent, but if I ever were to make an employee application (God forbid), I would think long and hard about replacing one of the names.

EmployeeManager.HiredList;
EmployeeManager.LaidOffList;

Extensive use of wrappers

When I explain about wrappers and their usefulness, I often see programmers rolling their eyes, thinking, 'Come on, old man, don't waste our valuable time.' However, the truth is, wrappers are probably the single most powerful and underrated tool in the toolbox.

A wrapper's sole job is to call another function, to 'wrap' it. The obvious benefit for readability and consistency in naming is that you can rename anything to fit your preferred style. I've seen modules written in German - and I kid you not - included in an otherwise English application. Imagine the novel analogy: two writers co-authoring a book, one of them English, the other German. That would be some serious art nouveau shit. So, when I say 'extensive use,' I really mean it. If you're writing a large application, expect to wrap almost everything you didn't write yourself, including the standard libraries that come with your platform. At this point, some of you may start to think 'overkill,' but trust me, it's not.

The end result should be that when you look at a page of your core code, each class, method, and data member should be defined by you. Only then can you ensure 100% consistency in naming and style, and without consistent and high readability, developing a large and complex application becomes very challenging.
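As a minimal sketch of what that looks like, here is a thin wrapper around the standard System.IO.File class. RSFile and its method names (LoadText, SaveText) are hypothetical names chosen to match my own style, and the demo file name is made up. Each method does nothing but forward the call.

```csharp
using System;
using System.IO;

// A thin wrapper: every method only forwards to System.IO.File.
public static class RSFile
{
    public static bool Exists(string filePath) => File.Exists(filePath);

    public static string LoadText(string filePath) => File.ReadAllText(filePath);

    public static void SaveText(string filePath, string text) => File.WriteAllText(filePath, text);

    public static void Delete(string filePath) => File.Delete(filePath);
}

public static class Program
{
    public static void Main()
    {
        string filePath = Path.Combine(Path.GetTempPath(), "rs_wrapper_demo.txt");

        RSFile.SaveText(filePath, "Hello wrappers");
        Console.WriteLine(RSFile.LoadText(filePath));   // Hello wrappers

        RSFile.Delete(filePath);
        Console.WriteLine(RSFile.Exists(filePath));     // False
    }
}
```

The core code now talks only to RSFile, in the same Load and Save vocabulary as everything else in the application, and the platform call lives in exactly one place.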

Luckily, wrappers offer many more benefits, just as important as readability. I'll cover these in a separate post. But the bottom line is, if you're not using wrappers, you should start now.

Organised headers (obviously not C# or Java)

So, this point is a bit more controversial, because I actually like headers. While I like C# a lot and think it has added some great features to the language (that's a big admission, coming from an old stomper like me), I really miss my headers.

The job of a good header was to expose the exact interface of a class. Nothing more, nothing less. With good, consistent naming, the header is all you need to understand what a class does. There's no need to dip into source territory unless you need to change something. Back in the day, when we printed code on paper (yes, actual paper), I used to print out the headers for reference. It was great until some douche decided to add a petrillion comments and run it through some obscure tool to generate documentation, which unfortunately ruined the main point of having clean headers. So, thank you douche!

I am aware that moving away from headers has its benefits like improved compile times and reduced duplication. However, the problem most programmers often encounter with headers, namely circular references, is not unsolvable. A circular reference occurs when ClassA includes ClassB and vice versa. This causes the compiler to have a bad day and is a very common mistake. The solution isn't forward declarations or other workarounds. It's to fix the underlying issue of the classes being interdependent. In this respect, consider circular reference problems as a hint that it's time for refactoring.

Bottom line: headers are not just files to drag along. They're extremely important, offering a complete overview of a class at a glance. Use them wisely, young apprentice, and may the force be with you.

Takeaway

Good, consistent naming is crucial to building large applications, and your code should be as easy to read and understand as a good novel. The functional core code should avoid direct third-party or platform calls, and the entire code should be named and formatted by you exclusively. Only that way can you ensure the highest possible level of readability.

/Lars
