Introduction

Somewhere over the rainbow...

Here are some tips for writing portable code. Some of them may apply to your project and some might not; think before you choose, and if you're going to make any "stupid" decisions, make sure that you make them deliberately and with open eyes. It might be ok for your project. And of course, if you're never ever, cross your heart and hope to die, going to port your code to any platform other than the current one (you lucky dog), then just disregard everything in here, go home early and enjoy a cool beer on the balcony; in short, do something better than sit in the office all day.

Conform to the standard

If you're primarily a Visual Studio developer, you are probably using a lot of the Microsoft extensions to the language without knowing it. This is one of the most cumbersome sources of errors when you try to port to a different platform, since a lot of code might need to be changed because of it. The best thing here is to decide at an early stage that you want to be standards compliant, and to write more conformant code by disabling the Microsoft extensions by default (the -Za compiler option for msvc). The problem is that the windows headers do not compile cleanly with the extensions turned off. We can however make extensions off the default mode and only enable them for the actual translation units that need to include the windows headers. This should really be a small subset anyway, since we are striving to make the whole codebase portable. This has the added benefit of an extra check against spurious use of the expensive windows.h header: it will simply be a compiler error. Great! Now nobody can accidentally add an include of windows.h, you have to protect all use of the types from e.g. DirectX with pimpls, and none of these types can leak out into the global interface. This will also decrease your complexity and your compile times, since not all files will pull in the expensive headers.
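In practice this might look something like the following sketch (the file names are made up; -Za turns the Microsoft extensions off and -Ze, the default, turns them back on):

# Default for the whole codebase: extensions off.
cl -Za -c Foobar.cpp

# Only the few translation units that must include the
# windows headers are compiled with extensions enabled.
cl -Ze -c FoobarWin32.cpp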

 

--- .h file ---

class Foobar
{
public:
     Foobar();
     ~Foobar();
     void calculate();

private:
     class Pimpl;
     Pimpl& m;
     // No visible members, sizeof(Foobar) is approx
     // sizeof(void*)
};

--- now the .cpp ---

// The expensive header stays in the .cpp, invisible
// to everyone who includes Foobar's header.
#include <windows.h>

class Foobar::Pimpl
{
public:
     // expensive type pulled in from
     // e.g. windows.h
     HANDLE handle;
};

Foobar::Foobar()
     : m( *new Pimpl() )
{
     m.handle = 0;
}

Foobar::~Foobar()
{
     delete &m;
}

void Foobar::calculate()
{
     // do expensive stuff with m.handle.
}
          
Listing 1: Example of the pimpl idiom.

Compile on a different compiler

Chances are that if you want to be portable you need to give up on your current compiler at some point, unless you find that gcc can be used for all your platforms with adequate performance and code output. Some of us are not that lucky though, and find ourselves required to use different compilers on different platforms. Compilers like MSVC, gcc and Metrowerks CodeWarrior are common today, at least in the games industry. All of them have quirks that we learn over the years, but most of the quirks differ from compiler to compiler. So how do you find the least common denominator? The answer lies in the document named ISO/IEC 14882. It's the international standard for the C++ programming language, the blueprint that most of the compiler vendors are trying to follow. When in doubt, this is the source of your answers.

Now the standard itself takes a little training to read. It's not something that you just read back to back (though some people amaze you sometimes; they're great at dredging up five scattered references that make you go "doh, now I see it"). Even though it's a really old book, The Annotated C++ Standard is surprisingly useful and highly recommended. If nothing else it teaches you how to read the real standard. It's one of the books there should be at least one copy of in your organization if you're using more than one C++ compiler. Once you've ploughed your way through that one you are ready to order your own copy of the actual standard. This is a document your company should have if it's programming in C++; not having it is a little bit like using a really complicated machine daily without the manual. The really inexpensive option is to buy an electronic copy, which is also practical since you can search in it. The ANSI/ISO standard can be bought electronically here. There is also the original document from ISO here, but as you notice it's much more expensive. I have yet to figure out what the difference is. For those who just want to see what the standard document looks like, you can check out the old public review document.

So now you're armed with the tools you need to really understand why the compiler sometimes refuses to compile your code. One of the things I used to do when in doubt is to create a really short test case that I can compile with most of the compilers, and then also try it with the Comeau compiler that is available online. It's a really good compiler when it comes to standards compliance. If your code passes compilation with this compiler, it's a safe bet to assume that your code is standards compliant and move on.
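A classic probe from that era is the scope of the for-loop variable. This is a made-up snippet for illustration: with the Microsoft extensions enabled, old msvc happily accepted it, while a conforming compiler (Comeau included) rejects the second loop.

int main()
{
     for( int i = 0; i < 10; ++i ) {}

     // Conforming compilers reject this: i went out of scope
     // at the end of the first loop. Old msvc let it slide.
     for( i = 0; i < 10; ++i ) {}

     return 0;
}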

Compile at the highest warning level

Compiler writers are usually more in tune with the language, and with the process of transforming source code into assembler code, than us mere mortals. When they deem something important enough to emit a warning message for, it's generally a really good idea to review it and make sure that you really understand it before reviewing it again. Then you may dismiss it. Now there is another way, and that is to simply stick your head in the sand and pretend that there are no warnings, by telling the compiler that you want to compile with warnings disabled. This is like trying to land a Boeing 747 at San Francisco airport on a foggy day with your instruments turned off. In both cases a crash is a highly probable outcome.


Now if warnings are so good, why would we not want as many as we can get? Good question. Well, that's because... hmm... it's tedious to fix your warning-ridden code? Not that good of an argument, perhaps. Compiling at the highest warning level certainly takes some juggling; you will come across a lot of warnings that are easily dismissed as sloppy coding that doesn't really impact anything -- that is, you could have left the code as it was and it would probably not have impacted anyone. But every now and then you encounter warnings that really are eye openers and point directly at a latent bug (or even an active bug). I would argue that the time it takes me to fix all the annoying warnings plus the actual bug-level warnings is shorter than the time it takes to find the bugs without the compiler's help.
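A made-up example of the kind of bug the highest warning level flushes out. At msvc's /W4 (or gcc's -Wall) the assignment inside the condition draws a warning; at lower warning levels it compiles silently:

#include <cstdio>

void report( int errorCode )
{
     if( errorCode = 0 )   // meant ==, assigns instead
     {
          std::printf( "all is well\n" );
     }
     else
     {
          // This branch runs every time, reporting "error 0",
          // and the warning is the only early clue.
          std::printf( "error %d\n", errorCode );
     }
}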

Run through lint

One of the biggest tragedies in compiler evolution was the decision to take lint out of the compiler in order to make the compiler simpler and faster. Running the code through a lint session will often reveal a lot of problems with the health of the code and might even expose outright bugs. If you're on a traditional UNIX system with plain old C, you're in luck; lint is probably available for your source right out of the box. If you're on a more common configuration, like Visual Studio for Windows or Xbox, writing C++, you're in for a rougher ride. There are tools for C++ in those environments as well, but running them is not as easy as just adding another step to the compilation rule in the makefile.

That said, it is an extremely good idea to lint early and lint often. Running lint on an older codebase can come as a shock; it's not uncommon for several thousand warnings to be generated, and tracking down each one will take time. The thing is that it will most likely be worth your while, and you will find bugs in your code that you could otherwise have spent significant time hunting down. The argument that this takes too long and isn't worth it is kind of weak if you don't accept buggy code. Now if you do, and can get away with shipping a substandard product, sure, go ahead and toss lint in the bit bucket. I for one would have a hard time swallowing that, since I do have professional pride as a programmer and want to deliver bug-free code. Facing reality though, I know that I will write bugs and I will need all the help I can get; if a tool can find the bugs for me, all the better.
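An invented example of the kind of thing lint-style tools call out even when the compiler stays quiet. The precedence below is perfectly legal C++, so many compilers of the day accepted it without a peep:

bool isReady( unsigned flags )
{
     const unsigned READY_MASK = 0x04;

     // == binds tighter than &, so this evaluates as
     // flags & (READY_MASK == 0x04), i.e. flags & 1,
     // which tests the wrong bit entirely.
     return flags & READY_MASK == 0x04;
}

The intended expression is of course ( flags & READY_MASK ) == 0x04.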

Have one point of platform selection

So how do you detect in the code that you're on a certain platform? There are several options; the one that I've employed in the past with success is a single custom define for each platform, passed on the command line. Say I'm compiling tools on windows; I'd make sure that -DAURORA_TOOL is passed on the compiler command line. Here I make sure that all my macros are prefixed by a simple name, in this case AURORA_, and that the platform is described. Since I've got the convention of not making any tools on platforms other than Win32, I'd just pass TOOL in. The define AURORA_WIN32 denotes the runtime platform for actual game prototypes on Win32. The tool and the win32 platform differ in what kind of code gets compiled in, calling conventions and other options like RTTI, exception handling etc. Refrain from relying on any other defines on the compilation line, since it will just get really cluttered and hard to follow.

Why even put this on the command line? Why not use the predefined compiler macros to find out what platform you're on? The argument against that is that it's not future proof against swapping out compilers on the same platform, and it creates a dependency on the specific compiler/define combination that can spread across the source base like a virus and be very hard to revert. The other reason to put it on the command line, and not in a file somewhere where you can switch on the predefined macros, is that depending on the include chains you might miss the define in one header, typically because you forgot to include the config file:

 
// This is a typical platform dependent header file.
// Danger danger, we forgot to include
// #include "PlatformDefines.h"
// where all the switches are defined, e.g. AURORA_WIN32
#ifdef AURORA_WIN32
     // Do some windows specific code.
#else
     // Just ignore the code segment here since we don't use
     // this piece on the other platforms
#endif
          
Listing 2: Foobar.h. Shows how you can wind up with a catastrophic header that emits different object code depending on who includes it and what defines they've pulled in from other headers.

Say you've done the thing in listing 2, and then wind up including PlatformDefines.h through an include chain before you include Foobar.h in one translation unit, while in another translation unit Foobar.h is included without PlatformDefines.h being pulled in. Now you're up that brown creek with no paddle; this will be a very hard thing to find.
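To recap the setup, each build configuration passes exactly one platform define on the command line and nothing else. These build lines are hypothetical (and the linux target is invented purely for the sake of the example):

cl  -DAURORA_TOOL  ... Foobar.cpp     (the windows tool build)
cl  -DAURORA_WIN32 ... Foobar.cpp     (the win32 game build)
gcc -DAURORA_LINUX ... Foobar.cpp     (some future platform's build)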

Make the unsupported case an error

Have you ever run across the following construct?

 
#ifdef _WIN32
     // Do some windows specific code.
#else
     // Just ignore the code segment here since we don't use
     // this piece on the different platform
#endif
          

Compare that with the following construct:
 
#if defined _WIN32
     // Do some windows specific code.
#elif defined _SOLARIS
     // Do the Solaris equivalent
#else
     #error "Undefined platform"
#endif
          

So the second construct is certainly more cumbersome, and people might balk at it at first. Hey, that's a lot of effort for something fairly trivial! Well, it's platform dependent code. I'd argue that it's good that this is cumbersome to write and that it looks ugly; that creates a little threshold so that you don't go nuts and insert platform dependent code all over the codebase. Why would that be a bad thing? For one thing, there is the whole merge issue when working in a source controlled environment. A merge issue in actual sequential code is pretty easy to spot. A merge issue within multiple #ifdef blocks can potentially be hell to find and will manifest itself at the worst possible time (can you spell 5 minutes before a demo for the VP?).

Notice the clause at the end, the #error one? It's a good idea to put this in if you anticipate a new platform ever being added in the future. Heck, put it in even if you don't anticipate it; since change is one of the few constants, it's a safe bet that a new platform will be added eventually. Having the #error statement will make it a breeze to add a new platform and quickly be up and running on it: simply try to compile all the source on the new platform and then fix all the #error statements.

Use platform forward headers

So how do you design a class that's supposed to encapsulate platform specific behaviour? Well, let's take a look at some very simple cases, say for example measuring time and sleeping. That's pretty easy in Win32: you use the Sleep() function and the QueryPerformanceCounter/QueryPerformanceFrequency functions. But you want to avoid exposing them to the outer world, so we encapsulate them inside some functions that hide the signatures and the necessary types.

 
#include <cstddef>    // size_t
#include <stdint.h>   // uint64_t

namespace aurora {

void sleep( size_t microsecondsToSleep );
uint64_t microsecondsSinceStart();

}
          
Listing 3: The signatures for the sleep function and the time function.
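A Win32 implementation of those two signatures might look roughly like the sketch below. This is my illustration rather than the sample library's actual code; error handling is omitted and the multiplication will overflow for very long uptimes:

// Win32/Time.cpp -- hypothetical implementation file.
#include <windows.h>
#include "Time/Time.h"

namespace aurora {

void sleep( size_t microsecondsToSleep )
{
     // Sleep() works in milliseconds; round up so we never
     // sleep shorter than asked for.
     ::Sleep( static_cast<DWORD>( (microsecondsToSleep + 999) / 1000 ) );
}

uint64_t microsecondsSinceStart()
{
     LARGE_INTEGER frequency;
     LARGE_INTEGER counter;
     ::QueryPerformanceFrequency( &frequency );
     ::QueryPerformanceCounter( &counter );

     // The counter runs from an arbitrary epoch, so remember
     // the first value we see and measure relative to it.
     static const LONGLONG start = counter.QuadPart;
     return ( counter.QuadPart - start ) * 1000000ull / frequency.QuadPart;
}

}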

I might put this in a header called Time/Time.h in a library called Time, which for now is very simple: just this single header and probably just one little translation unit. You can study an example implementation here. The Time/Time.h header does not, however, contain any code itself, just a couple of #ifdef statements that include the appropriate platform header depending on the platform defines discussed earlier. The different headers all define the same interface, which is ensured by the unit tests, which are the same for all the platforms (so if someone changes the interface for just one platform, the tests will no longer pass on the others).
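The forwarding header itself might then look something like this sketch (the directory names and the linux define are assumptions on my part, following the AURORA_ convention from earlier):

// Time/Time.h -- no code, just platform forwarding.
#if defined AURORA_WIN32 || defined AURORA_TOOL
     #include "Time/Win32/Time.h"
#elif defined AURORA_LINUX
     #include "Time/Linux/Time.h"
#else
     #error "Undefined platform"
#endif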

Fig 1: Directory layout in the sample Time library.

In closing

We've covered some simple steps you can go through to give your codebase a higher chance of being portable than if you just write for one environment with one compiler. Unfortunately, some of the things here, like the #ifdef regime and the running of lint, take discipline, and it's easy to slip; most of the gain is lost if you do, which makes the regime a little bit hard to follow. But boy, once you manage it, the gains and the ease with which you can add another platform to your codebase make it all worth it. What we have not gone into are the things you can do to minimize duplicate code in the platform dependent files. Techniques like policies/mixins can be useful, and defining interfaces to these classes (if you're not worried about performance) can also be employed. But I'm going to save that for another day, another topic and another article. Until then, happy coding.
