Don’t Misunderstand
My feelings about templates are often misunderstood – particularly by those who have known me long enough to remember the days when I swore that C++ had no place in video games. That was when we had to fit the whole game into 2 megs of RAM, and we were doing things like squeezing the entire stack onto a 2 k scratch pad. Those days were a long time ago. Things have changed, and I have no desire to go back.
These days, C++ is the language of choice, and with that comes templates. Templates are powerful tools, and it would be foolish to dismiss them. The opinions expressed here are not intended to discourage their use, but more to temper the exuberance with which many developers have embraced them.
The Good Stuff
Let me begin with some good reasons to consider templates as part of your software architecture.
The golden rule of programming is, “Don’t write the same code twice.” I probably don’t need to justify that statement, but much to my amazement, I have run across some disagreement on that point. The theory is simple. Every time you repeat the same piece of code, you introduce a new chance of an error. If you instead re-use a single function, any potential bugs in that function can be fixed in one place.
Templates allow us to carry this concept to the next level. Simple functions can only be re-used if they are operating on the same type of object. Templates allow us to abstract the algorithm so that it applies to a wide range of different types. By re-using a template rather than reproducing the algorithm for each new type, we reduce the number of different places where bugs can occur. If a bug is discovered, the template can be corrected, and we can be assured that the fix will be applied everywhere. In contrast, the cut & paste method will leave you poring over your code looking for the same bug repeated elsewhere with no assurance that you’ll ever find them all.
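To make that concrete, here is a minimal sketch of one algorithm written once and re-used across unrelated types. The names LargestOf and LargestOfExamples are made up for this illustration; the only assumption is that the type supports operator<.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration: one algorithm, written once,
// usable with any type that supports operator<.
template <typename T>
const T& LargestOf(const T& a, const T& b, const T& c)
{
    const T* best = &a;
    if (*best < b) best = &b;
    if (*best < c) best = &c;
    return *best;
}

// The same template serves ints, floats and strings. A fix to the
// comparison logic above would reach all three call sites at once.
inline void LargestOfExamples()
{
    assert(LargestOf(3, 9, 4) == 9);
    assert(LargestOf(1.5f, 0.5f, 1.0f) == 1.5f);
    assert(LargestOf(std::string("map"), std::string("tree"),
                     std::string("set")) == "tree");
}
```

Without the template, each of those three uses would need its own hand-copied function, and each copy would be another place for the same bug to hide.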
Unbridled Enthusiasm
When programmers encounter their first template, their reaction is often one of distrust and skepticism. But as they grow more comfortable with templates, such fears tend to melt away. Once the fear is gone, the promise that templates offer becomes very alluring, and it is easy to be taken in. Naturally, it is those who are most comfortable with templates who tend to use them the most, and it goes without saying that the authors behind your typical STL implementation are among the most enthusiastic fans.
As with all things in life, moderation is the key to a pattern of reliable success, and such enthusiasm can carry with it a price. What price am I thinking of?
Debugging STHELL
I was hired onto one project primarily to do performance analysis and improvement, but when it came to debugging nasty crashes, I was available for that as well. This team had fully embraced the STL paradigm, and no small number of their performance issues resulted from that, but what I want to point out here has more to do with the extreme difficulty of debugging issues that manifest themselves within STL.
In one case, I was called over to help analyze a crash, and I was so impressed by what I saw on the callstack that I copied it down. Here is the top line from the callstack - completely unedited:
default.exe!_STL::_Rb_tree<TypedAssetId<
FXShader::ExpressionEvaluator>,
_STL::pair<TypedAssetId<FXShader::ExpressionEvaluator> const ,
FXShader::ExpressionEvaluator *(__cdecl(FXShader::Effect&)>,
_STL::_Select1st<_STL::pair<TypedAssetId<FXShader::
ExpressionEvaluator> const ,FXShader::ExpressionEvaluator *
(__cdecl*)(FXShader::Effect &)> >,
_STL::less<TypedAssetId<FXShader::ExpressionEvaluator> >,
_STL::allocator<_STL::pair<
TypedAssetId<FXShader::ExpressionEvaluator> const,
FXShader::ExpressionEvaluator * (__cdecl*)(FXShader::Effect &)> > >::
insert_unique(_STL::_Rb_tree_iterator<_STL::pair<
TypedAssetId<FXShader::ExpressionEvaluator> const ,
FXShader::ExpressionEvaluator * (__cdecl*)(FXShader::Effect &)>,
_STL::_Nonconst_traits<_STL::pair<
TypedAssetId<FXShader::ExpressionEvaluator> const ,
FXShader::ExpressionEvaluator * (__cdecl*)(FXShader::Effect &)> > >
* __position=0xffffffff,
const _STL::pair<TypedAssetId<FXShader::ExpressionEvaluator> const ,
FXShader::ExpressionEvaluator * (__cdecl*)(FXShader::Effect &)>
&__v={...})
line 476 + 0x8 bytes C++
That’s a single function call, and the crash occurred somewhere within that function. What on earth is that function? A casual glance reveals that it clearly involves an STL Red-Black tree, and it has something to do with TypedAssetId and FXShader. But that function signature is so out of control… this must be some really extreme case where someone is building up some serious templates, right? Let’s take a look at that same line of code in the original C++:
possibleUnitsMap[ unit.m_id ] = &unit;
That’s it. This innocent-looking line of C++ translates into that incredible rat’s nest of STL – or as I like to call it, STHELL. This is no exaggeration. The example above was pasted directly out of the debugger – completely unedited.
Most programmers would scarcely dare to delve into that rat’s nest any deeper, but when you have a game that crashes, you have few options. Somebody is going to have to study the situation, determine what has caused this crash, and use that information to formulate a solution.
We did find the cause, and we got the problem fixed, but the effort that was required was greatly increased by the complexity of the template implementation for what amounts to a simple map insertion.
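For readers wondering how a plain subscript assignment ends up inside insert_unique, here is a rough sketch of what m[key] = value amounts to when spelled out with the public map interface. AssignLikeSubscript is a hypothetical name, and real STL implementations differ in detail, so treat this as an approximation rather than the code that was actually running.

```cpp
#include <map>
#include <string>

// Roughly what `m[key] = value;` does, spelled out with the public map API.
// A sketch only: real implementations differ in detail.
template <typename Map>
void AssignLikeSubscript(Map& m,
                         const typename Map::key_type& key,
                         const typename Map::mapped_type& value)
{
    // 1. Walk the red-black tree to the first element not less than key.
    typename Map::iterator pos = m.lower_bound(key);

    // 2. If the key is absent, insert a default-constructed value, passing
    //    pos as a hint. This hinted insert appears to correspond to the
    //    insert_unique(__position, __v) frame in the callstack above.
    if (pos == m.end() || m.key_comp()(key, pos->first))
    {
        pos = m.insert(pos, typename Map::value_type(
                                key, typename Map::mapped_type()));
    }

    // 3. Assign through the reference that operator[] would have returned.
    pos->second = value;
}
```

Every one of those steps is itself a template parameterized on the key, value, comparator and allocator types, which is why one short source line can turn into a signature that fills the screen.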
Speaking of Maps
Here’s another example that just came up for me last week. I was trying to figure out why the geometry was not appearing on my object in a timely fashion. As I traced the problem, I ran across this simple line of code:
return m_assetMap.find(guid, assetID);
I fully expected to find my asset in the map, so when the search failed, my immediate reaction was to assume that I had not properly loaded it. After tracing that problem through, I found that I had indeed loaded the asset and its guid was inserted into the map. So why does the search fail?
At this point, you say to yourself, “well if your guid isn’t in the map, what is in the map?” At least, that’s what I said to myself. I had avoided doing this for days now, and there was simply nothing else for it. It was time to debug the STL map. Attempting to examine the map in the debugger is an exercise in futility. Here’s what that looks like in the watch window:
-m_assetMap
-stlp_std::map<lib::utf8string,lib::utf8string,
stlp_std::less<lib::utf8string>,lib::CustomAllocator<lib::utf8string> >
-_M_t
-stlp_std::priv::_Rb_tree_base<stlp_std::pair
<lib::utf8string const ,lib::utf8string>,
lib::CustomAllocator<lib::utf8string> >
-_M_header
lib::CustomAllocator<stlp_std::priv::_Rb_tree_node
<stlp_std::pair<lib::utf8string const ,lib::utf8string> > >
-_M_data
_M_color false bool
+_M_parent 0x14f89030
+_M_left 0x03617d20
+_M_right 0x14f89090
_M_node_count 11
+_M_key_compare {...}
To get anything useful out of this data, you’re going to have to step into the template functions until you get deep enough that the debugger will actually understand what it’s looking at. Tracing down into map<>::find() I eventually got into a place where the debugger could show me the strings being compared. Here’s the callstack at that point. (In this case, I have changed some namespaces to protect the guilty.):
lib::operator<(const lib::utf8string & lhs_utf8={...},
const lib::utf8string & rhs_utf8={...})
stlp_std::less<lib::utf8string>::operator()(const lib::utf8string & __x={...},
const lib::utf8string & __y={...})
stlp_std::priv::_Rb_tree<lib::utf8string,stlp_std::less<lib::utf8string>,
stlp_std::pair<lib::utf8string const ,lib::utf8string>,
stlp_std::priv::_Select1st<stlp_std::pair<lib::utf8string const ,lib::utf8string> >,
stlp_std::priv::_MapTraitsT<stlp_std::pair<lib::utf8string const ,lib::utf8string> >,
lib::CustomAllocator<lib::utf8string> >::_M_find<lib::utf8string>
(const lib::utf8string & __k={...})
stlp_std::priv::_Rb_tree<lib::utf8string,stlp_std::less<lib::utf8string>,
stlp_std::pair<lib::utf8string const ,lib::utf8string>,
stlp_std::priv::_Select1st<stlp_std::pair<lib::utf8string const ,lib::utf8string> >,
stlp_std::priv::_MapTraitsT<stlp_std::pair<lib::utf8string const ,lib::utf8string> >,
lib::CustomAllocator<lib::utf8string> >::find<lib::utf8string>
(const lib::utf8string & __k={...})
stlp_std::map<lib::utf8string,lib::utf8string,stlp_std::less<lib::utf8string>,
lib::CustomAllocator<lib::utf8string> >::find<lib::utf8string>
(const lib::utf8string & __x={...})
lib::map<lib::utf8string,lib::utf8string,stlp_std::less<lib::utf8string> >::
find(const lib::utf8string & i_key={...}, lib::utf8string & o_data={...})
FindAssetID(const lib::utf8string & urn={...}, lib::utf8string & physicalID={...})
As I waded through my map element by element, I eventually did come across the guid I was looking for. It seems the guid I was searching for was:
guid:5075987e-0000-0000-0000-000000000000
but the guid that had been inserted into the map was:
guid:5075987e-0000-0000-0000-000000000000:modelformat
Duh! Why hadn’t I seen that right away? /sarcasm off
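One way to sidestep the watch window in a case like this is to have the code report near misses itself. The following is a minimal sketch, not something from the project described here: KeysWithPrefix is a hypothetical helper, std::string stands in for lib::utf8string, and the map is assumed to use the default less-than ordering.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical debugging aid: collect every key that begins with `prefix`,
// so near misses such as a key with an unexpected suffix become visible.
// std::string stands in for the article's lib::utf8string.
template <typename Value>
std::vector<std::string> KeysWithPrefix(const std::map<std::string, Value>& m,
                                        const std::string& prefix)
{
    std::vector<std::string> matches;
    // Keys sharing a prefix are contiguous in a sorted map, so start at the
    // first key that is not less than the prefix and stop at the first miss.
    typename std::map<std::string, Value>::const_iterator it =
        m.lower_bound(prefix);
    for (; it != m.end() &&
           it->first.compare(0, prefix.size(), prefix) == 0; ++it)
    {
        matches.push_back(it->first);
    }
    return matches;
}
```

Logging the result of KeysWithPrefix(m_assetMap, guid) when find() fails would have put the ":modelformat" suffix in front of me without a single step into _tree.h.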
Breakpoints in STHELL
As part of the typical debugging process, I like to go straight for breakpoints. During the course of the above example, I naively attempted to insert a breakpoint onto line 552 of _tree.h on the line that says:
if (!_M_key_compare(_S_key(__x), __k))
When I did that, Visual Studio became non-responsive. After several minutes, it eventually became clear that with that single keystroke, I had inadvertently inserted 183 breakpoints. The compiler generates a separate set of instructions for every unique instantiation of a template it comes across. Apparently there are 183 different instantiations of _tree in this codebase.
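The effect is easy to reproduce on a small scale. In this sketch (CallCount and Touch are hypothetical names, not from any STL), each distinct template argument gets its own copy of the function, complete with its own static storage, so a single source line can map to several locations in the generated code.

```cpp
// Hypothetical illustration: each distinct template argument produces its
// own copy of the function, with its own code and its own static storage.
template <typename T>
int& CallCount()
{
    static int count = 0;   // one counter per instantiation, not one overall
    return count;
}

template <typename T>
void Touch(const T&)
{
    ++CallCount<T>();       // one source line, but one copy of it in every
                            // instantiation of Touch
}
```

After Touch(1), Touch(2) and Touch(3.0f), CallCount<int>() is 2 and CallCount<float>() is 1, because the int and float versions are separate functions. A breakpoint on the increment line would presumably have to be set in each of them, which is the same thing that happened 183 times over in _tree.h.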
Needless to say, breakpoints in the STL of a project like this one aren’t going to be very useful tools, and that’s a terrible price to have to pay.
Summary
So what should we do about this? Should we swear off templates once and for all? Outlaw STL?
Well, I wouldn’t want to throw the baby out with the bathwater. Templates are indeed powerful tools, and it would be foolish to dismiss them entirely.
What about STL?
STL is a handy suite of pre-built and pre-debugged containers. They can save you a lot of development time up-front and they typically save you debugging time down the road. If your application isn’t particularly speed-critical or sensitive to excessive memory fragmentation, STL is a great asset as long as you know what you’re doing. I advise people to avoid STL in console game engines, but I find I am seldom heeded on that. On the other hand, STL makes perfect sense for your tools.
Curb Your Enthusiasm
If you should ever find yourself in a position to reinvent an STL-like suite of templates, I urge you to remember those of us who will eventually have to decode the horrible mess the debugger is likely to make of your nested templates and specializations. Readable code is a win for everybody, and we always need to weigh that against any potential benefits we would hope to gain by creating yet another template.