Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
-- Brian W. Kernighan

Introduction

C++ has a lot of nasty corners that smell quite a bit, as well as some remarkable things that can be used in fairly surprising ways. One of the least understood corners of C++ is name lookup. Name lookup is something very basic, but the basics seem to be the things that people skim over when learning C++ or just using it. Things like L-values and R-values are part of the basics, and they are sadly little understood. And oh, by the way, if you really want to learn the basics, a tip is to pick up a book on plain old C. I've found that the ones written by compiler writers are usually the best. Partly because C was a language engineered to make it easy to write a compiler for, but also because seeing it from the compiler's point of view usually deepens your understanding of the syntax and of the transforms that are reasonable for the compiler to make. A highly recommended one is Expert C Programming by Peter van der Linden. Besides being a good programming book in its own right, it is also one of the funniest programming books out there.

But I digress. We were about to talk about name lookup. Normally we learn that names are looked up in the enclosing namespace first and that, if nothing is found there, the compiler searches the parent namespaces in turn; if the name isn't found even in the root namespace, an error is generated. This is a vast simplification, but it's also the generalization you'll find in the more basic C++ books. For most intents and purposes it is perfectly fine for most programmers, as long as you're not doing something insane. But understanding more about name lookup can also lead to cleaner and more understandable code, fewer namespaces (!) and just less typing. Which is good, right?
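
To pin the textbook rule down, here is a minimal sketch (all names made up for illustration):

 
int value = 1;                 // ::value

namespace outer
{
	int value = 2;             // outer::value

	namespace inner
	{
		int lookup()
		{
			// the unqualified name is searched for in the innermost scope
			// first and the search stops at the first hit, so this finds
			// outer::value and ::value is never even considered
			return value;
		}
	}
}
   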

Name lookup

Ha! Name lookups schmookups, you say. How hard can it be? And what's the use? Well. For example, I like to name function arguments after what they are used for, and I try to keep the public variables in my classes and structs non-prefixed. Which basically translates to the following code:

 
struct Array
{
	float* data;
	int count;
	
	Array( float* data, int count )
		: data(data)
		, count(count)
	{
	}
};
   

Looking closer at the member initializer list, we see that I initialize the "count" variable with "count". Hm. There is a slight ambiguity since there is an implicit this pointer, so do I initialize the this->count variable with itself or with the count argument to the constructor? It turns out that the names inside the parentheses of a member initializer are looked up in the constructor's scope first, so the arguments win and the above code does the correct thing despite looking kind of funny. Charles Nicholson first showed me this and I went "noooooo!", but after my horror abated I kind of like it. It's a neat trick that lets me worry about naming the variables properly once, instead of making up names for the same thing multiple times. Or, as I did before, postfixing the parameters with an underscore, a habit I picked up during my time at Cyberloop in Uppsala, Sweden, sifting through the code written by Magnus there (see the sketch after the next code block). What did stick was const parameters, forced down my throat by Noel and Charles (they loved them):

 
int foobar( float a );

// Is not a new prototype, the const modifier for the parameter does not make it
// to the function signature, but prevents anyone inside the function to 
// modify it.
int foobar( const float a )
{
	return (int)(a*a);
}
   
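
For comparison, the underscore-postfix style I mentioned would look something like this (purely illustrative, same Array as before):

 
struct Array
{
	float* data;
	int count;
	
	// the parameters get invented second names so there is nothing to
	// shadow, which works but means naming the same thing twice
	Array( float* data_, int count_ )
		: data(data_)
		, count(count_)
	{
	}
};
   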

ADL

Argument Dependent Lookup, or Koenig lookup, basically means that for an unqualified function call the compiler, in addition to the ordinary search, also looks through the namespaces of the function's arguments to find a match. Things become a little bit hairy if you mix it with regular overloading, and even more so if you mix in templates and partial specialization (if your head is not spinning at this point, you're a very brave person). That said, it can be used for good, but only if you search your feelings (I had to throw that in there :). The goal here is to make the code more compact and less cluttered. Say for example that you're writing a modern math library, one centered around float quads like this:

 
struct vec4
{
	float e[4];
};
   

A classical approach that people learn from standard textbooks is that if you want to implement an operation on this type, you do it through message passing to the object. In C++ that translates into a member function, since a member function call is a message passed to the object. All that magic talk aside, the code for a dot product would look something like this:

 
struct vec4
{
	float e[4];
	
	float dot3( const vec4 b ) const
	{
		return e[0] * b.e[0] + e[1] * b.e[1] + e[2] * b.e[2];
	}
};
   

There are several things that stand out in this implementation. We're going to sidestep the whole issue of vectors, points and 3D vs. 4D, since that is a looooong discussion. Suffice to say that we want to store all our vectors uniformly and potentially "waste" 4 bytes on every 3D vector. We've implemented a simple dot product between two vectors as a member function of the vector type. The usage of the code becomes a little bit fishy though:

 
vec4 a,b;
float d = a.dot3(b);
   

It doesn't look very "math"-like, does it? Ok, it's not too bad looking I guess, but the compiler is still going to groan a little here due to aliasing of the this pointer. In short, the asm code that comes out of this is not the best. Also consider what happens if we want to implement the cross product, which is not commutative, so the order is suddenly important. Looking at the following code:

 
vec4 a,b,c;
c = a.cross(b);
   

It might be construed as doing a x b here, but it can also be seen as sending a message with b and an operator (the cross product) and applying it to a, in which case the order suddenly isn't explicit. In fact, calling a member function can be seen as sending everything from the right to the function name and then sending that to the instance, which really is reading right to left. In the western world we're very used to reading left to right, but that's not a given, especially if you think of object orientation in terms of messages and operators. Ok, this might be a little bit far fetched (even I'm a little bit spaced out now). It really becomes more interesting if you extend this whole line of thinking to matrices and vectors, and consider column vs. row vectors. Say we now have a corresponding matrix class looking like:

 
struct mat4
{
	float e[16];
	
	vec4 mul( const vec4 b ) const;
};
   

Now, if we have column vectors we would write a matrix multiplication between the matrix M and the vector B as M * B, in which case, reading left to right, the member function mul in the matrix class looks ok. But if we're using row vectors, it would be more natural to write B * M, in which case we would probably want the mul member function in the vector instead. And now we're entering madness. The member function paradigm really doesn't work for us here, so maybe we should just ditch the whole C++ thing and instead define free functions in the enclosing scope? Something like the following would probably do:

 
struct vec4
{
	float e[4];
};


struct mat4
{
	float e[16];
};

vec4 mul( const mat4& a, const vec4 b );
float dot( const vec4 a, const vec4 b );

   

It also kind of makes the whole order problem go away, since in plain C we're very much used to reading the arguments left to right (which also happens to be how the standard C calling convention lays them out in memory, assuming the stack grows towards lower addresses. Here is a nice thought: what would the implications be of the stack growing upwards? It's actually a good thing). The notation in the code becomes much nicer now, as it reads more like mathematical notation, with functions and operators applied to arguments.

The problem we have with the free functions above is that we are going to wind up with a lot of very short names, like "add", "mul", "dot" etc. in the global namespace. Namespaces were introduced to solve name clashes, and they are a real godsend if you're dealing with third party libraries as well as a large local codebase. Agreeing that namespaces solve a problem, we might want to use one for our little math stuff here, but then we run into a little problem when we try to use the library. Instead of code looking like the function nice1, we get bad2:

 
float nice1( const vec4 a )
{
	return dot(a,a);
}


float bad2( const math::vec4 a )
{
	return math::dot(a,a);
}
   

The single scoped dot might not seem like a lot, but expand this out to more complex mathematical operations and you will suddenly wish you had named your namespace just "m". And you still have to type the silly two colons. Or do you? Surprisingly, the following compiles as well:

 
float nice2( const math::vec4 a )
{
	return dot(a,a);
}
   

Ok, that's slightly better. We still have to spell out the full name of the data type, but we are spared the hassle when we want to use the functions. It's actually pretty nice. Of course the compiler finds the dot symbol through ADL, as you might have guessed. This lets us write the math library completely inside the namespace "math", for example, and then use it with minimal references to the namespace. You might also be tempted to start introducing "using namespace math;" all over the place. Do not. Really, do not. Using directives are a crutch for lazy programmers out to negate any benefit coming from the namespaces, since they just pull in everything from the particular namespace and then you have to deal with the resulting ambiguities. The code will look out of place, suddenly you don't know which of the three vec4 implementations you are supposed to use, and the compiler is stymied and core dumps (ok, it might be nice and print an error message before the core dump, but I think the core dump is appropriate). So the final implementation of our (incomplete) math library would look something like this:

 
namespace math 
{
	struct vec4
	{
		float e[4];
	};


	struct mat4
	{
		float e[16];
	};

	vec4 mul( const mat4& a, const vec4 b );
	float dot( const vec4 a, const vec4 b );
}
   
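
And using it is as painless as promised. A small usage sketch (the client function is made up, and it assumes mul and dot are implemented somewhere):

 
float transform_and_measure( const math::mat4& m, const math::vec4 v )
{
	math::vec4 t = mul( m, v );   // math::mul found through ADL
	return dot( t, t );           // math::dot found through ADL
}
   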

Further optimizations

The observant reader will have noticed that I mixed call by reference and call by value in the above code. Modern processors often have a native type for four single precision floats, a quad float held in a SIMD register. These can often be passed in registers instead of on the stack, which makes it a bad idea to force the compiler to pass a pointer to them and then load through it from memory (since we can't take a pointer to a register). The MSVC compiler does some optimizations here for inlined functions to minimize the traffic if the variables are already in registers, but for non-inlined functions nothing can be done. It becomes even worse on today's PPC consoles, where a load from an address we just stored to is an instant penalty (e.g. the PPU on the CELL -- the dreaded load-hit-store, or LHS). A simple optimization here would be to allow the compiler to hold the vectors in registers instead:

 
namespace math 
{
	typedef vector float vec4; // the compiler's native SIMD type ("vector float" is the AltiVec/VMX extension syntax)

	struct mat4
	{
		vec4 e[4];
	};

	vec4 mul( const mat4& a, const vec4 b );
	float dot( const vec4 a, const vec4 b );
}
   

This will allow the compiler, for example on CELL, to hold things in registers longer, even across function calls, which will greatly increase performance since the memory traffic to the stack is reduced. Another important optimization is to change the signature of dot to return not a float but a vec4 (typically with the scalar result splatted across all lanes). On most processors, bouncing between different register sets (like the SIMD and scalar float ones) can only be done by first storing the contents out to memory and then reading them back in again (and you remembered the LHS here, right?). One of the very nice things about the SPU is that it does not have this problem, as it only has vector registers, and applying scalar operations to them can be done with ease. So the declarations would look more like this after that small change:

 
namespace math 
{
	typedef vector float vec4;

	struct mat4
	{
		vec4 e[4];
	};

	vec4 mul( const mat4& a, const vec4 b );
	vec4 dot( const vec4 a, const vec4 b );
}
   
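
The payoff shows up in the client code. A small sketch, written inside the math namespace and assuming the declarations above are visible (scale is a hypothetical component-wise multiply added just for the example, onto is assumed to be unit length, and dot is assumed to splat its scalar result into every lane):

 
namespace math
{
	// hypothetical helper, only for this sketch: component-wise a * b
	vec4 scale( const vec4 a, const vec4 b );

	vec4 project( const vec4 a, const vec4 onto )
	{
		// the vec4 coming back from dot() feeds straight into the next
		// vector operation without ever visiting the scalar float
		// registers, so there is no store/load round trip over the stack
		return scale( onto, dot( a, onto ) );
	}
}
   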

Don't use it

So we've actually used some pretty advanced C++ here, and it will seem pretty strange for me to say this, but try to stay away from the more advanced things in your code. Use them where they make sense and no further. One of the most abused features of C++, one that I try to stay further and further away from, is operator overloading. It's one of those syntactic sugar things that is very seductive but can land you in a whole world of pain. Consider, for example, math classes like our vector. It would be extremely tempting to implement operators like plus, minus, multiplication, conversion to float and indexing. Of those I would only implement plus and minus, and that reluctantly. They are the only ones that won't get you into too much trouble, and even so, an add and a sub function will do the same thing and you won't lose much readability while maintaining coherency with the rest of the functions. Why will the other operators get you into trouble?

Multiplication

Many have tried and all have failed (some without realizing it). One of the first rules of operator overloading is not to do anything unexpected, so don't implement addition in the division operator, for example. As for the multiplication operator: depending on your background, what does multiplication mean for two 4D vectors? If you're a Cg person you might say component-wise multiplication, if you're a math person you might venture "inner product?". Some might even say cross product. None of these is the correct answer; there is none. And that's the problem: since there are several views on this, don't use it. If the user is instead presented with functions called mul, dot3, dot4 and cross, they have a much better chance of figuring out what the heck the code is doing (I'm still not in love with the mul function, but it does make sense when you think of it in terms of asm instructions -- there is a vector multiply that does component-wise multiplication). Anyhow, stay on the straight and narrow and don't overload the multiplication operator.
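
To make that concrete, the explicitly named alternatives could be declared alongside the rest. The signatures here are my guess, in the style of the earlier math namespace, assuming the vec4 declaration from before is in scope and that results come back splatted in a vec4 as discussed above:

 
namespace math
{
	vec4 mul  ( const vec4 a, const vec4 b );  // component-wise multiply, like the asm instruction
	vec4 dot3 ( const vec4 a, const vec4 b );  // 3-component inner product, splatted
	vec4 dot4 ( const vec4 a, const vec4 b );  // 4-component inner product, splatted
	vec4 cross( const vec4 a, const vec4 b );  // 3D cross product
}
   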

Conversions

Automatic conversions are usually a very bad idea. Actually, I can't think of a single instance where they are "good". As for automatic conversions in math code, shudder. We had this case where our code didn't do the right thing, and reading the source several times didn't turn up anything, nor did rewriting the equations with pen and paper. Really desperate, we single-stepped through the entire code block several times until we finally noticed that one line produced very weird results that should not really have been possible. The line read something like:

 
float a;
vector4 b,c;
c = b / a;
   

We simply wanted to scale our vector (admittedly somewhat naively and slowly). However, what really happened was:

 
float a;
vector4 b,c;
vector4 d = vector4(a,a,a,a);
c = b / d;
   

There was a conversion defined that took a single float and replicated it into all slots of a vector4 (a single-value constructor). And the division operator was defined as taking two vector4s and doing a cross product. Needless to say, the result that came out of that single line was not even close to what we were expecting. But it looked innocent, right?
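
For what it's worth, this particular landmine is easy to defuse: marking single-argument constructors explicit turns the silent conversion into a compile error. A minimal sketch (this vector4 is hypothetical and not the math::vec4 from above; the operator/ is only declared here to mirror the story):

 
struct vector4
{
	float e[4];

	// explicit kills the implicit float -> vector4 conversion
	explicit vector4( float a ) { e[0] = e[1] = e[2] = e[3] = a; }

	vector4( float x, float y, float z, float w )
	{
		e[0] = x; e[1] = y; e[2] = z; e[3] = w;
	}
};

vector4 operator/( const vector4& a, const vector4& b ); // whatever questionable thing it did before

vector4 scale_by( const vector4& b, float a )
{
	// return b / a;                // no longer compiles: a float does not
	                                // silently become a vector4 anymore
	return b / vector4(a,a,a,a);    // the replication has to be spelled out
}
   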

Index operators

It might be really tempting to be able to access a vector4's components through the regular index operator. And since this snazzy thing called C++ has operator overloading, it might be tempting to take it out for a spin. Stop. Remember when we talked about the cost of moving values from one register set to another? Basically you need to bounce the value off the stack and back again, incurring massive penalties. Now, is it really a good idea to put this performance hog in the hands of every programmer out there to do with as they will? I don't think so. Instead, make two functions:

 
vec4 insert( float value, int index );
float extract( vec4 value, int index );
   

The above functions can be used to insert single scalars into vectors and extract them again. Coupled with load and store functions for entire vectors, this should provide ample functionality for the low level programmer. Usually, most math should already be in pure vector form and not suffer from these little conversions. The functions also carry a little bit of social engineering if you comment them and say that this is slow. Using an index operator is so ingrained in programmers that they really don't think about it in terms of performance. And even if they do, they might think that they have already paid the price to get the value into registers and that it's all very local. I see it on the same level as people casting from float to int all over the place: it's so natural and cheap in C on the x86 platform that we forget it's really horrible on today's consoles.
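
As a picture of why that comment should say "slow", here is a sketch of what extract() boils down to, assuming an AltiVec-style target where math::vec4 is the native vector float from the earlier typedef:

 
inline float extract( math::vec4 value, int index )
{
	// crossing from the vector register file to the scalar one means a
	// round trip over the stack: a vector store followed by a scalar
	// load, which is exactly the load-hit-store penalty discussed earlier
	union { math::vec4 v; float f[4]; } tmp;
	tmp.v = value;
	return tmp.f[index];
}
   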

In closing

That was a very small rant about writing C++ without member functions or classes, while still using features that wouldn't be possible in plain old C. ADL can be a very powerful tool to help you arrange your code better and still get away with typing fewer characters, avoiding the fate of the COBOL fingers. I also could not resist a few jibes at some of the less stellar things I've seen in code :) And I'm not saying I haven't written my fair share of bad code over the years; some of the mistakes here have been discovered the painful way, through experience. My coding style has certainly changed a lot over the years. I recently took a look at one of the vector classes I wrote a long time ago in school. It's actually kind of funny, since it outlines and exemplifies most of the things I've recommended against, as well as some just plain crazy things in terms of math, encapsulation and library dependencies. Simply put, mostly madness :) Here it is, have a laugh on me:

 
//----------------------------------------------------------------------------
// CVector3
//----------------------------------------------------------------------------
class CVector3
{
#ifndef NDEBUG
  friend std::ostream &operator<<( std::ostream &os, const CVector3 &v );
#endif // NDEBUG
  
public:
  CVector3();                                                   /// normal init
  CVector3( float a );                                          /// init all members to a
  CVector3( float a, float b, float c );                        /// init members
  CVector3( float *p );
  ~CVector3();                                                  /// empty destructor

  CVector3 &operator*=(const float &r);                         /// multiply all members with r and assign
  CVector3 &operator/=(const float &r);                         /// divide all members with r and assign
  CVector3 operator*(const float &r) const;                     /// multiply all members with r
  CVector3 operator/(const float &r) const;                     /// divide all members with r
  CVector3 operator*(const CVector3 &r) const;                  /// cross product
  CVector3 &operator+=(const CVector3 &r);                      /// add two vectors
  CVector3 &operator-=(const CVector3 &r);                      /// subtract two vectors
  CVector3 operator+(const CVector3 &r) const;                  /// constant versions of the two above
  CVector3 operator-(const CVector3 &r) const;                  /// they do not return references to this
  CVector3 operator-();                                         /// unary negation, ie reverse the vector

  bool operator==( const CVector3 &r ) const;                   /// memberwise check for equality

  float norm() const ;                                          /// returns the norm, ie lenght of the vector
  float dot(const CVector3 &r) const;                           /// returns the dot product
  void normalize();                                             /// makes the vector a unity vector

  CVector3 rotate( const CVector3 &vAxis, float fAngle ) const; /// rotate the vector around any vector
  float &operator[](int i);                                     /// returns the members, x=0, y=1, z=2 for iterator-like access

public:
  float x,y,z;                                              
};
   

Funnily enough, we did put together a lot of projects with this vector, some fairly large and complex. Which just shows that with enough persistence you don't really need "good" code, or even bug-free code. But it sure is nice to have something simpler that lets you go home instead of hunting for that magic automatic-conversion-to-vector-followed-by-cross-product bug ... Until next time, cheers!
