Writing Good Code
Why this page
When tutoring students for their theses or projects, I often find many problems with the code they write. I'm not referring to bugs; bugs happen, as flu happens, although there are some things you can do to make them less likely. The problem discussed here is Bad code™, i.e., code that nobody can read or understand, not even its author. So I've decided to write this page with some advice on how to write Good code™. Please note that nothing on this page should be taken as a strict rule, as it expresses the point of view of its author(s); still, you need a good reason not to follow a reasonable piece of advice, so at least think about it. Contributions are welcome.
--Bernardo
General
I Think Therefore I Program
And the inverse is true: you cannot program without thinking. So, before rushing to the keyboard, take a piece of paper and try to lay out your ideas. A clear idea of the structure of the data and of the algorithm you want to apply is important for writing a properly working program. Think about how you're going to use the data and shape the data structures accordingly; think about how to break your algorithm into pieces, and what functions (methods) and parameters you need.
Comments
A big problem is the use of comments. You can encounter projects with hundreds of lines of code and not a single line of comment, so that it is hard to guess even the general purpose of the program, or projects with ultra-verbose comments, like this:
k = k + 1; // increment k
Yeah, thanks, I thought it was extracting the square root of k.
Comments are important, and their correct use improves the quality of the code, even when there are no comments. Really! The source of a program is not only a way to obtain a binary that your computer can run; it is a way to express ideas. Those ideas should be clear to the compiler, so that it compiles and you have your nice executable, but they should also be clear to anyone who reads your code, including your fellow students, your supervisor, and you yourself, even after a couple of months.
A tangled piece of code is not a way to clearly convey your ideas, and a comment added to a tangled piece of code doesn't make it much clearer. Besides, if you can't understand a piece of code, does a comment raise your confidence that the code really works? Tangled code is more likely to be buggy and more difficult to modify, so please write your algorithm in such a way that it can be understood just by reading the instructions, not the comments. There are good places to write obfuscated code, and a thesis is not one of them.
Just an example (taken from real code! names have been changed to protect the innocent):
#define NUM_IT 3

for (j = 0; j < NUM_IT; ++j) {
    if (j == 0) {
        // First iteration
        // Do something
        ...
    } else if (j == 1) {
        // Second iteration
        // Do something else
        ...
    } else if (j == 2) {
        // Third iteration
        // Do things
        ...
    }
}
This is a rather confusing way to write a simple sequence:
// Do something
...
// Do something else
...
// Do things
...
So, what to comment? Good places for comments are functions (say what the function is for, what the parameters are, and describe the expected results and side effects), global or important variables, the beginning of a file, and a few others. Also, when you make a tricky decision, document it!
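As a rough illustration of the advice above, here is a minimal sketch of a commented C function. The function name mean_of_samples, its parameters, and the choice to return 0.0 for empty input are all made up for this example; they only show where the comments go.

```c
#include <assert.h>
#include <stddef.h>

/*
 * mean_of_samples - arithmetic mean of an array of measurements
 * (hypothetical helper, for illustration only)
 *
 * samples: array of at least `count` values; it is not modified
 * count:   number of values to average
 *
 * Returns the mean of the first `count` values, or 0.0 when count is 0.
 * Side effects: none.
 */
double mean_of_samples(const double *samples, size_t count)
{
    double sum = 0.0;
    size_t i;

    /* Tricky decision, documented: an empty input yields 0.0 instead of
     * the NaN that 0.0 / 0 would give, so callers need no special case. */
    if (count == 0)
        return 0.0;

    assert(samples != NULL);

    for (i = 0; i < count; ++i)
        sum += samples[i];

    return sum / (double)count;
}
```

Note that the loop itself carries no comments: it is plain enough to read on its own. The comments go to the interface and to the one decision a reader could not guess.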
Indentation and spaces
In most languages, spaces and indentation are not part of the syntax, and they are mostly ignored by the compiler (Python is a significant exception), yet indentation and spaces are very useful to format your code and make it more readable. There are many ways to use them, but the first rule is 'consistency'. Choose the style you like the most, and stick to it; in particular, choose whether you want to use spaces or tabs for indentation, and be consistent. Otherwise, when someone else opens your project in a different editor with a different idea of tab width, your nice (you made it nice, didn't you?) indentation will be screwed up.
C and C++ code
The Linux kernel coding style is interesting reading, particularly chapters 3 (Placing Braces and Spaces), 4 (Naming), 6 (Functions), 8 (Commenting), and 12 (Macros).
For header (include) files, avoid problems from multiple or recursive inclusion. Wrap their content in an #ifndef ... #endif construct (inside the header file!), like this:
#ifndef MY_PROJECT_MY_HEADER_H
#define MY_PROJECT_MY_HEADER_H

/* Real content of the file */
...

#endif
The name of the macro should be something that is unlikely to clash with other macros; e.g., prepend the name of your project.
C (and C++) have a fair number of operators, with strictly defined precedences. Even if you know all the precedence rules by heart, don't assume others do; so, please use parentheses when writing complex expressions.
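A classic precedence trap, sketched below, is that == binds more tightly than &. The flag FLAG_READY and the function names are hypothetical, chosen only for this illustration. The first function spells out how the compiler reads the unparenthesized expression; the second is what the programmer most likely meant.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_READY 0x04u /* hypothetical status bit, for illustration */

/*
 * How the compiler reads `flags & FLAG_READY == 0`:
 * the comparison is done first, giving flags & 0, which is always 0.
 */
bool not_ready_as_parsed(unsigned flags)
{
    return (flags & (FLAG_READY == 0)) != 0;
}

/* What was almost certainly intended: test the bit, then compare. */
bool not_ready_intended(unsigned flags)
{
    return (flags & FLAG_READY) == 0;
}

/* Small self-check showing that the two readings disagree. */
void check_precedence_demo(void)
{
    assert(not_ready_intended(0x00u));   /* bit clear: not ready */
    assert(!not_ready_as_parsed(0x00u)); /* always false, whatever the flags */
    assert(!not_ready_intended(0x04u));  /* bit set: ready */
}
```

With the parentheses written out, a reader does not need to remember the precedence table to see which of the two meanings is in play.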
Matlab
Matlab is a rather slow interpreted language, but it shines, as its name implies, at matrix manipulation. So try to avoid loops, and operate on whole arrays at once. Many vectorized functions are built in (sum, mean, and max, for instance), so look in the help when you need simple operations like sums, means, or maxima.