Coding Standards

Coding standards are a significant part of most commercial programming environments, and for good reason. Some organizations simply adopt the published standards of an existing major player, such as Google, but many customize a set of standards to meet their own needs. The following discussion covers the basic issues that a coding standard needs to address. There are links to actual coding standards for specific languages at the end.

The Basics

A useful coding standard should have something to say about at least the following:

  1. Naming conventions
  2. Code formatting (including white space)
  3. Programming techniques
  4. Documentation requirements

Each of these sections will contain many requirements to which you will be expected to conform. This can initially seem like an onerous burden, but the benefits to the team from everyone writing their code to the same standards cannot be overstated.

Good tools can automate significant portions of a coding standard. For example, the Eclipse Java IDE will automatically format the layout of your code according to a set of rules that can be customized to meet most requirements. Eclipse can also generate skeleton comments that meet the generally recognized format for automatically generating Java documentation.

Naming Conventions

These cover everything in your code that will need a name: the packages, classes, interfaces, methods, variables and so on.

It used to be common practice to make the names of variables say something about what they contain and what they might be used for. This is known as Hungarian Notation. An example would be to name a variable that holds the name typed in by the user something like 'sinUserName'. Here, the 's' tells you it's a string, the 'in' tells you it was input, and the 'UserName' part tells you it's their user name. Most coders these days would probably just call that variable 'userName' because the IDE will be able to tell you that it's a string.
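The difference is easiest to see side by side. The following is only a small sketch: the class and method names are invented for illustration, and only the variable name 'sinUserName' comes from the discussion above.

```java
final class NamingExample {

    // Hungarian style: the prefixes repeat what the declared types already say.
    static String greetHungarian(String sinUserName) {
        String sGreeting = "Hello, " + sinUserName.trim();
        return sGreeting;
    }

    // Plain descriptive names: the declaration (and the IDE) supply the type.
    static String greet(String userName) {
        String greeting = "Hello, " + userName.trim();
        return greeting;
    }

    public static void main(String[] args) {
        System.out.println(greet(" Ada "));
    }
}
```

Both methods behave identically; the only thing that changes is how much the reader has to decode before reaching the meaningful part of each name.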

There are a number of reasons Hungarian Notation has generally fallen out of favour with programmers, as discussed at the end of the Wiki article linked to above. But it can be a bit more complicated than this discussion implies. There is more than one dialect of Hungarian Notation, and Joel Spolsky makes a good case for a form of Hungarian Notation in his blog here...

Although most coding standards specify a naming convention other than Hungarian (of any sort), one place you probably will see this notation is in the code automatically generated by GUI-building utilities. For example, if you add a radio button using WindowBuilder in Eclipse, it will be given an initial default name like rdbtOne.

Code Formatting

This part of a standard specifies how lines of code are to be indented relative to other lines, where brackets should be placed, how many blank lines are to be used under certain circumstances, and so forth.

Exactly how code is to be laid out can be a matter of some contention. It might not seem important, but you might be surprised how much discussion can be generated about the simple differences shown in the following example:

Example One (a)

public void method(int arg0) {
    if (arg0 > 0) {
        // more code here
    }
    else {
        // rest of the code here
    }
}

Example One (b)

public void method(int arg0)
{
    if (arg0 > 0)
    {
        // more code here
    }
    else
    {
        // rest of the code here
    }
}

There is no difference in meaning between the two examples; they just use white space differently. It probably doesn't matter which of the two formats is chosen, so long as every programmer sticks with it once it has been agreed.

Some aspects of this topic do have very general agreement. For example, pretty much everyone involved with a C-family language would agree that of the following two options (a) is greatly preferred over (b):

Example Two (a)

if (a > 0) {
    doSomething(a);
}

Example Two (b)

if (a > 0)
    doSomething(a);

Again, there is no difference in meaning between the two cases. However, the (a) example is considered to be easier to maintain than the (b) example. The Google style guide is very clear on this matter. You can find a lot of entertaining discussion of exactly this example here (I told you this stuff generated a lot of heat).
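One way to see the maintenance problem is to imagine a second statement being added to each form of Example Two. The sketch below is illustrative only: the class and method names are made up, and a counter stands in for the call to doSomething.

```java
final class BraceExample {

    // Form (b) after a hasty edit: only the first statement belongs to the if.
    static int withoutBraces(int a) {
        int calls = 0;
        if (a > 0)
            calls++;   // guarded by the if
            calls++;   // indented as though guarded, but always executed
        return calls;
    }

    // Form (a) after the same edit: both statements are inside the block.
    static int withBraces(int a) {
        int calls = 0;
        if (a > 0) {
            calls++;
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(-1) + " versus " + withBraces(-1));
    }
}
```

The compiler accepts both methods, so nothing flags the mistake in the first one. With braces always present, the edit that caused it cannot silently change the meaning of the code.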

Luckily, this is one of the aspects of a coding standard that the editor software can largely enforce for you.

Programming Techniques

Most languages have little gotchas that can trap the unwary programmer into a silly mistake. A good coding standards document should have sound advice on the techniques to avoid. For example, although there is a feature called finalizers in Java, most coding standards (including the Google one) say never to use them because they are fraught with problems.
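A standard that bans a technique is more helpful when it also points to a replacement. For finalizers, one commonly suggested alternative is to implement AutoCloseable and release the resource with a try-with-resources statement. The sketch below assumes that approach; TempResource and CleanupExample are hypothetical names, not part of any standard or library.

```java
// Hypothetical resource, used only for illustration.
final class TempResource implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Called automatically when the try-with-resources block exits,
    // whether normally or because of an exception.
    @Override
    public void close() {
        open = false;
    }
}

final class CleanupExample {

    // Returns true if the resource had been released by the end of the try block.
    static boolean releasedAfterUse() {
        TempResource resource = new TempResource();
        try (TempResource inUse = resource) {
            if (!inUse.isOpen()) {
                throw new IllegalStateException("resource should be open while in use");
            }
        }
        return !resource.isOpen();
    }

    public static void main(String[] args) {
        System.out.println(releasedAfterUse());
    }
}
```

Unlike a finalizer, which runs at some unspecified time decided by the garbage collector (if at all), close() here runs at a point the programmer can see in the source.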

Doug Crockford's excellent book JavaScript: The Good Parts is really a detailed description of good programming practice for the JavaScript language. Joshua Bloch's book Effective Java does the same thing for Java, although editions before the third do not cover the features introduced in Java 8 (version 1.8), such as lambda expressions.

You might reasonably conclude that the better the language, the shorter this section of a coding standards document should be. It is indeed the shortest section in the Google Java Style Guide, with only four elements. However, the CERT Secure Coding Standards are almost entirely devoted to programming techniques. As the name of the standard indicates, these techniques are aimed at helping the developer create secure, bug-free programs.

Documentation Requirements

One advantage of Java is that there are automated tools that will take specially formatted comments in the source code and generate HTML-formatted API documentation from them. This is the Javadoc facility. Its main benefit is that much of the documentation can be written at the same time as the code itself. This also potentially makes it easier to keep API documentation up to date: any time there are significant changes to the code base, the comments should also be updated which, in turn, means that the API documentation will be updated the next time the Javadoc command is run. Similar tools exist for a variety of other languages.
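A specially formatted comment of this kind starts with /** and uses block tags such as @param, @return and @throws. The following is a minimal sketch; the Temperature class and its method are invented for the example. The class is left package-private to keep the sketch in one file, whereas Javadoc documents only public and protected members by default, so in a real project it would normally be public.

```java
final class Temperature {

    /**
     * Converts a temperature from degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius the temperature in degrees Celsius
     * @return the equivalent temperature in degrees Fahrenheit
     * @throws IllegalArgumentException if {@code celsius} is below absolute zero
     */
    static double toFahrenheit(double celsius) {
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + celsius);
        }
        return celsius * 9.0 / 5.0 + 32.0;
    }

    public static void main(String[] args) {
        System.out.println(toFahrenheit(100.0));
    }
}
```

The first sentence of the comment becomes the summary line in the generated HTML, and each tag becomes an entry in the parameter, return value or exception section for the method.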

Commenting code actually serves two purposes. The purpose implied above is for users, or clients, of the code base. They will need to know what facilities are provided by the package, how to invoke those facilities, and what kinds of things can go wrong. This information is the "public" or "published" API of the package.

The second purpose of commenting code is for the developers of the code. If you are the only person working on a project, you might think there is no need for such comments. However, the chances are that you will forget important aspects of your coding choices over time. So you should always comment your code with a view to explaining the salient aspects to a stranger. It is often said that the code tells you the "how" of the application, but the comments should explain the "why" of the specific techniques employed.
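The distinction between "how" and "why" can be shown with two comments on the same line of code. This is only a sketch: the RetryPolicy class, its 30 second cap and the reasoning in the comment are all made up for the illustration.

```java
final class RetryPolicy {

    // Hypothetical upper limit on the wait between two attempts.
    static final long MAX_DELAY_MILLIS = 30_000;

    static long nextDelayMillis(long currentDelayMillis) {
        // A "how" comment would merely restate the code:
        //     multiply the delay by two, up to the maximum
        //
        // A "why" comment records what the code cannot say for itself:
        //     Double the delay after each failure so that a struggling server
        //     is not flooded with retries, and cap it so that no caller is
        //     left waiting longer than MAX_DELAY_MILLIS for a single attempt.
        return Math.min(currentDelayMillis * 2, MAX_DELAY_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(nextDelayMillis(1_000));
    }
}
```

A stranger reading the first comment learns nothing new. The second tells them what would go wrong if they "simplified" the line.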

Finally, in a good IDE such as Visual Studio or Eclipse, the API comments on distant pieces of code such as a method definition will show up as a tool tip when you hover the mouse over an invocation of that method. That is just so useful in a project with a lot of code! You can find an example of Javadoc comments and how to add them automatically in Eclipse here...

As an example, you can find the complete documentation for the standard public Java API here...

You can find detailed technical instructions on how to write Javadoc comments here...

And some advice from a professional programmer to students on how to write comments... teachers of programming, please take note!

Students are generally required to comment their code. I typically insist on good API-level comments and sufficient in-line comments to make the obscure clear. This is usually an assessment requirement, so there isn't really a choice. However, there is a fair amount of discussion about how useful commenting is in the "real" world. The following series of articles in Visual Studio Magazine contains a good example of the nature of the debate.

Standards for Various Languages

The following pages provide links to some detailed discussions for particular languages: