The art of coding : good practises

Overview

This document lists some common rules and tricks that may save a few hours of painful debugging to developers. It is still work in progress.

Coding standards

The GNU project released its own rules, available here. Even though some recommendations would be difficult to enforce or would not correspond to the context of the Ceylan developments, these guidelines are mostly relevant. All developers should read them once! We found these sections especially interesting:

We were less convinced by some other recommendations, about CamelCase, formatting for Emacs, etc.

Specific topics raised by these guidelines

Change Logs

Apart from the GNU point of view about Change Logs, there is a need for tools that automate their generation from the versioning system being used. [More infos, in French]. As we are using CVS, cvs2cl seems to be the way to go.

Code formatting

The indent or astyle tools could be used as well, to ensure formatting consistency.


[Back to the table of contents]



Random advices

Programs that use malloc ought to #include <stdlib.h>. On 32-bit systems you can get away with not including it, but on 64-bit systems it will assume malloc() returns a 32-bit integer and butcher the 64-bit pointers it returns.

How to stop cleanly a thread? Make it regularly check a boolean stopDemanded.

Use systematically and regularly Valgrind.

Valgrind is a GPL'd system for debugging and profiling x86-Linux programs. With the tools that come with Valgrind, you can automatically detect many memory management and threading bugs, avoiding hours of frustrating bug-hunting, making your programs more stable. You can also perform detailed profiling to help speed up your programs.

The throw() qualifier in function signature should never be omitted, and fully specified, to avoid the unpleasant Aborted at runtime.

Always remember that the const qualifier comes before the throw qualifier, for example: void myMethod() const throw( SomeException)

Never throw an exception from a destructor:

The first line of an implementation file X.cc should begin with #include "X.h": this allows to check directly that X.h does not depend on any includes that could be defined in X.cc, and therefore that X.h is really autonomous.

Why using functors? A Functor allows us to wrap up a function inside an object. When a convenient function (ex: a STL one) takes a predicate (a function returning a boolean value when it is given a particular instance type), but the evaluation of the function requires dynamic informations, passing only a classical function won't do, since it cannot be adapted to fit with the neccasary dynamic behaviour. Instead, if one uses a functor, this object's constructor can be given those run-time informations so that it build a particular evaluation of the predicate. Afterwards, in the convenient function, instead of specified a simple predicate, one can use the functor, whose operator () has been redefined: this way, it behaves exactly the same way as the predicate, with the advantage of being dynamically defined.

An example demonstrating the usefulness of functors is when one needs a predicate which returns whether the specified object is equal or not to a particular one. The former object, which is not known at compile time, cannot be expressed thanks to a pure predicate. Instead, the functor's constructor can be given this object, so that its () operator will return true if and only if its argument is equal to that particular constructor-provided object.

When adding or changing a source file, please use four-space tabulations, and format code and comments so that they fit in a 100-character wide line.

The only language to be used in source code is english.

We recommend a text editor such as nedit.

In a conditional expression, for example in if ( aVariable == aValue), maybe reversing the two members (if ( aValue == aVariable)) could be safer, since if the equality operator (==) is by mistake replaced by the assignement operator (=), the compiler will complain. When possible, avoiding lvalues on the left part could be a good trick.

In C++, instead of

int * myPointer = static_cast(::malloc( sizeof(int) * numInt ) )) ;
prefer
int * myPointer = new int[numInt] ;
which is much simpler and safer.

Dealing with multi-dimensional arrays is often tricky. Using malloc makes it nasty. If you are using C++, you should really use new. That would catch many errors at compile time.

A general rule of thumb is that there should be no malloc nor free in a C++ program, except when using a C library, including of course the system libraries.


[Back to the table of contents]



Some common mistakes

C++

Linking

When linking statically, the order of the object files and libraries is critical. A static library satisfies references only for the files before it on the linker command line. This is normal behaviour, and it means that you sometimes have to have some libraries repeated in case there is a circular dependency, like -lfoo -lbar -lfoo.

The reason for this is that the linker (ld) reads the linked files in the order of the command line, and when linking statically, references are linked at that point. When linking dynamically, the symbols are only checked to see that all references are satisfied.

Testing logical values

Wanting to test a flag against a binary mask (most of the time, a number with only one binary digit set to one)? First idea would be to write the following (correct) form: if ( flags & UseVideo ).

What I do not like in that case is the (implicit) conversion from a number (flags & UseVideo, which may be equal to any non-null value) to a logical value (true or false). I think that in that case C and C++ are too low-level (weakly typed), which results in misleading (unwanted) conversions, such as taking a pointer as a boolean value with no warning at all (with numerous constructors with default values, who never was bitten by this?).

A form I would prefer is thus: if ( ( flags & UseVideo )!= 0 ). Each parenthesis is needed, since a mistake with operator priority can occur easily: the following form if ( flags & UseVideo!= 0 ) is incorrect, since it is interpreted as if ( flags & ( UseVideo!= 0 ) ). Beware!

STL

See C/C++ Reference for informations about the STL API. Please keep in mind the STL is complex, wicked and evil.

When iterating in a list, the first elements are at the front (and the last are at the back).

For example:

 vector<int> v ;
 for( int i = 0; i < 5; i++ )
   v.push_back( i ) ;

 cout << "The first element is " << v.front()
	  << " and the last element is " << v.back() << endl ;

This code produces the following output:

The first element is 0 and the last element is 4
.

Never modify a container (ex: a list) while iterating on it. Even if it sounds obvious, it can happen when calling in an iteration an object which triggers a list modification (ex: iterating on a list of event sources to unsubscribe a listener which in turn remove them from the listener list, etc.).

Even such simple code is incorrect:

void EventListener::forgetSource( EventSource & source ) throw( EventException )
{

	for ( it = _sources.begin(); it!= _sources.end() ; it ++ )
		if ( (*it) == & source )
			_sources.erase( it ) ;
}
One should use _sources.remove( & source ) ; instead, or remove_if, since erase invalidates the iterator.

In general, the structure of the list (number and order of elements) should not change while using an iterator on it, only the elements may be modified. However, on many cases, we need to remove elements while iterating on a list.

For example, in a Surface/Widget framework (see OSDLSurface.cc and OSDLWidget.cc), the surface is an event source, and the widgets are event listeners (to simplify, let's forget for a while that widgets are surfaces too, so that they can themselves contain widget recursively). If the surface is to take ownership of its widgets, then upon deallocation, the surface will have to delete its widgets, which are its listeners. However, in the underlying source/listener framework, the listeners should normally unsubscribe from the source, since often the source is often unable to know when to detach them.

In the Surface/Widget framework, the source is to trigger the separation. The following simple solution would be nevertheless be wrong:

Surface::~Surface() throw()
{

	// Just unsubscribe this listener widget from this source surface:
	for ( list::iterator it = _listeners.begin(); it!= _listeners.end(); it++ )
		(*it)->unsubscribeFrom( *this ) ;
}

Why? Because the Widget::unsubscribeFrom method causes the surface to be called so that the corresponding widget is removed. Therefore the listener list changes while iterating on it, which leads directly to a nasty segmentation fault with very little information, even in the debugger.

So, how to overcome this difficulty? Use as frequently as possible the correct following form instead, with no iterator:

Surface::~Surface() throw()
{

	Ceylan::EventListener * widget ;

	while (! _listeners.empty() )
	{
		widget = _listeners.back() ;
		widget->unsubscribeFrom( *this ) ;
		delete widget ;
	}

}

However, on some cases one does not want to destroy the whole list, but wants to iterate in it until some specific elements to remove are found. The solution can be to use multiple passes onto the list, and/or to use cautiously remove.

No iterator (no STL?), no cry!

Python

The mutable default parameter is, in my opinion, one of the worst pitfall [See detailed explanation].


[Back to the table of contents]




Please react!

If you have information more detailed or more recent than those presented in this document, if you noticed errors, neglects or points insufficiently discussed, or if you would like to contribute and help us, even a little bit, drop us a line!




[Top]

Last update: Monday, January 3, 2011