Working with C++ templates can be dangerous for your coding style. Generic containers and domain-ignorant algorithms are a godsend, and as a game programmer with concerns for performance, any opportunity to push serious work at compile-time is a win. However, I've recently found out how thinking too much in templates can make you regret.

Image

 

Apart from offering an extra level of abstraction, templates have some nice "language technical" uses. Complicated loop unrolling and evaluation of recurrences (pic) are the common ones, with matrix multiplication or dot product unrolling and Fibonacci calculation being the (rather lame) canonical examples. A technique that is very often useful, but for some reason doesn't get mentioned so often perhaps due to its simplicity, is the ability to inject conditional code (independent of the loop variable) inside a tight performance-critical inner loop. Without templates, one would have to use macros, copy-pasting of code or function pointers, a decision which is a tradeoff between clumsiness and suffering a performance hit. With templates, you can avoid both.

Image

For example, in my fractal painter program I do millions of iterations on points on the complex plane per frame. Instead of having almost duplicate IterateMandelbrot, IterateJulia, etc., I let the compiler do the copy-pasting by writing Iterate<T> and letting the compiler instantiate Iterate<Mandelbrot> and Iterate<Julia>, where Mandelbrot and Julia are either functions that the compiler will inline or function objects with some common semantics. Now in fact, sometimes I can work with even something simpler if I only need very basic flow control, e.g. imagine SoftwarePolygonRasterize<bool writeZ, bool writeColor>();.

As a typical algorithmic abstraction example, let's look at the following. In my raytracer I have a loop that traverses a kD-tree, testing intersections between a ray and a set of triangles. In some cases I need to find the nearest intersecting triangle along the ray's path, but in some cases only checking the existence of any intersecting triangle suffices. With templates I don't have to write two different traversal functions and I don't have to put an 'if' inside the time-critical loop. Instead I do something like this:

template<typename LeafProcess>
void TraverseKdTree(const Ray &r, LeafProcess &leafProc)
{
   for(each Leaf l of the kdtree along the path of r)
   {
      bool hit = leafProc.Process(r, l);
      if (hit)
      return;
   }
}

class NearestIntersection
{
   Triangle *nearest;
   float nearestHitDistance;

   bool Process(const Ray &r, const Leaf &l)
   {
      nearest = 0;
      hitDistance = floatMax;
      for(each Triangle t in the leaf)
      if (r intersects t and hitDistance < nearestHitDistance)
         nearest = &t, nearestHitDistance = hitDistance;

      return nearest != 0;
   }
};

struct AnyIntersection
{
   bool hit;

   bool Process(const Ray &r, const Leaf &l)
   {
      hit = false;
      for(each Triangle t in the leaf)
         if (r intersects t)
            hit = true, return true;
      return false;
   }
};

Image

Well that's of course all too basic, but still thought it deserves a mention. I can even pull any additionally required data from inside the traversal routine by adding new struct members, without having to disrupt the functionality of TraverseKdTree.

So what is the dangerous stuff then? Well, once you get too comfortable with templates, mistakes start happening. There are two issues that I have noticed.

1) The biggest practical problem with templates is the exploding compilation time. You end up having large proportions of your code lying in templatized inline header files. It is very easy to start overgeneralizing your system, i.e. writing something like

/** @param DiffEqn The y'=f(x,y) to solve. Must have signature Real Func(Real x, Real y);
      @param CorrCtrl The unit that controls the number of correction steps to take. Must have signature bool Func(Real x0, Real y0, Real x1, Real y1); Returns true if a correction step needs to be taken, false otherwise. */
template<typename Real, typename DiffEqn, typename CorrCtrl>
Real PredictorCorrectorDiffEqnSolve(Real x0, Real y0, Real step, Real xTarget, DiffEqn &eqn, CorrCtrl &corr);

when if what you actually ever needed really was:

typedef float Real;
typedef Real (*DiffEqn)(Real x, Real y);
Real PredictorCorrectorDiffEqnSolve(Real x0, Real y0, Real step, Real xTarget, DiffEqn eqn, int numCorrections);

The build times start telling you if you're getting too template-happy. At some point I was reasoning "I might find it useful to use some other type here too => I'll make it a template", whereas I now think "... => I'll separate the type as a typedef so I can refactor it later if needed". Having Quaternion<double> lying around just in case is fancy, but if you never need to co-use it with Quaternion<float>, then what's the point? It's a lot faster to have that typedef for making the switch if necessary. So when writing that template, first ask yourself why not do a typedef?

2) Sometimes you can find yourself imprisoned in the "compilation-time" sector. You started with a design that seemed all clean, flexible and robust because of "solid" use of templates. But then you notice that this template flexibility is actually what is limiting you. For example, it can be that you observe the templatizations "leaking" to classes and functions that you didn't first intend. That is like "To make my GoodnessOfFit -function work with RandNumGen<LCG> and RandNumGen<MT>, it seems I must also make GoodnessOfFit a template". When this happens, it is usually the result of having avoided creating that common base virtual interface class when it just was inevitable.
Another way to hit this misdesign is to change a constant variable to a template parameter, when what you actually would have needed was a fully dynamic variable. I've recently been writing a solver for the Tetravex puzzle, which involves generating lots of permutations. To avoid having to dynamically allocate memory for any temporary permutations, I made the permutation size a template parameter, Permutation<int Size>, so they all fit nicely onto the stack. The solver multiplies permutations over 30M times per second, so it looked like a good choice performance-wise. But by making that choice (and not just choosing a vector<int> for example), I have limited myself to a compile-time determined board size! Wait, I know what you are thinking - a constant is a constant, be it statically written or determined by instantiating a template? Well, yes, but the core point is that now I cannot as easily go back to make the change, since I have a lot of code based on the decision I made. That decision is etched on every single angle bracket < > that exist for that particular class. So when you're changing a constant to a template parameter to extend some functionality, be sure that you're not actually just imposing more restrictions for later on.

That's about all the observations I have. Umm, what does the title image of my favorite actress got to do with any of this? Well, I thought it would be nice to put such a pretty picture in to color the topic. Actually that is an image from my slide puzzle solver, 2007-01-28.Win32.DX9.NPuzzle.bin.zip, that I wrote some time ago. The way it relates to templates is that the A* searching algorithm that is being used there is strictly separated from the problem itself, so I should be able to utilize the algorithm elsewhere by just writing a new problem statement. Whether it was just another dumb template design, I suppose I'll see when I get such a new problem.

Last Updated (Friday, 07 December 2007 21:15)