Imagine you could run parts of your program before it even compiles into an executable. Instead of calculating a value every time the user runs your app, the value is already baked into the code, ready to go. This is the core idea behind a powerful, advanced C++ technique called Template Metaprogramming (TMP).
TMP treats the C++ template system as a Turing-complete functional programming language. Instead of variables, we have static const members. Instead of functions, we have template structs. And instead of loops, we have recursion.
Recently, I came across some fascinating TMP code. In this article, we’ll explore how TMP works by “teaching the compiler” to generate unique numerical encodings for date formats, entirely at compile time.
#include <stdio.h>

enum { Y, M, D };

template <unsigned F, unsigned W = 2>
struct datefield
{
    static const unsigned type = F * 10 + (W % 10);
};

template <typename T1, typename T2 = void, typename T3 = void>
struct dateformat
{
    static const unsigned pow10 = 100 * dateformat<T2, T3>::pow10;
    static const unsigned value = pow10 * T1::type + dateformat<T2, T3>::value;
};

template <>
struct dateformat<void, void, void>
{
    static const unsigned value = 0;
    static const unsigned pow10 = 1;
};

enum
{
    YYYYMMDD = dateformat<datefield<Y, 4>, datefield<M>, datefield<D>>::value,
    DDMMYY = dateformat<datefield<D>, datefield<M>, datefield<Y>>::value,
    YYYYMM = dateformat<datefield<Y, 4>, datefield<M>>::value,
};

int main() {
    printf("dateformat<Y, 4>=%u, dateformat<Y>=%u, dateformat<M>=%u, dateformat<D>=%u\n",
           dateformat<datefield<Y, 4>>::value, dateformat<datefield<Y>>::value,
           dateformat<datefield<M>>::value, dateformat<datefield<D>>::value);
    printf("YYYYMMDD=%u, DDMMYY=%u, YYYYMM=%u\n", YYYYMMDD, DDMMYY, YYYYMM);
    return 0;
}
Any guesses what this program
prints? We’ll find out shortly.
A Different Way of Thinking
Normally, we use C++
templates to create generic code, like a vector that can hold any type. In TMP, we use
templates as a mini-programming language that the compiler itself executes.
Here are the rules for this "language":
- Variables don't vary: They are compile-time constants, usually defined with static const.
- "Functions" are structs: We "call" a function by instantiating a template struct. The "return value" is a static const member inside it.
- Loops are done with recursion: We make a template call itself with slightly different parameters until it hits a "stop" condition.
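To make the three rules concrete before turning to the date code, here is a minimal sketch built on the classic compile-time factorial. The factorial template is a hypothetical illustration and is not part of the program above. It assumes a C++11 compiler for static_assert.

```cpp
// Hypothetical example: compile-time factorial, not part of the date program.

// "Function": a template struct.
// "Loop": the template refers to itself with a smaller argument.
template <unsigned N>
struct factorial
{
    // "Variable": a compile-time constant that never changes.
    static const unsigned value = N * factorial<N - 1>::value;
};

// "Stop" condition: a full specialization for N == 0 ends the recursion.
template <>
struct factorial<0>
{
    static const unsigned value = 1;
};

// "Calling" the function means instantiating it and reading ::value.
// The build fails here if the compiler's result is not 120.
static_assert(factorial<5>::value == 120u, "5! should be 120");
```

Nothing in this sketch executes when a program starts. If the check were false, the compiler would reject the file.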
Step 1: The Basic "Function" - datefield
Let's look at the
simplest piece of our program:
template <unsigned F, unsigned W = 2>
struct datefield
{
    static const unsigned type = F * 10 + (W % 10);
};
Think of datefield as a simple function. It takes two
numbers at compile time (F for field and W for width) and calculates a new number
called type.
When the compiler
sees datefield<D>, it knows D is 2 (from
the enum) and the default W is 2.
It immediately calculates:
static const unsigned type = 2*10 + (2 % 10);
...and determines
that datefield<D>::type is the constant 22. This calculation happens during
compilation, not at runtime.
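One way to confirm that these values really exist at compile time is static_assert (C++11), which rejects the build when a constant expression is false. The sketch below repeats the article's datefield and checks the four types used later. The last check is an added observation about W % 10 and is not something the original program relies on.

```cpp
enum { Y, M, D };  // Y = 0, M = 1, D = 2

template <unsigned F, unsigned W = 2>
struct datefield
{
    static const unsigned type = F * 10 + (W % 10);
};

// Each check is evaluated by the compiler; a wrong value stops the build.
static_assert(datefield<Y, 4>::type == 4u,  "Y, width 4: 0*10 + 4");
static_assert(datefield<Y>::type    == 2u,  "Y, default width 2: 0*10 + 2");
static_assert(datefield<M>::type    == 12u, "M, default width 2: 1*10 + 2");
static_assert(datefield<D>::type    == 22u, "D, default width 2: 2*10 + 2");

// Observation: W % 10 keeps only the last digit of the width,
// so a width of 12 encodes the same as a width of 2.
static_assert(datefield<M, 12>::type == datefield<M, 2>::type,
              "widths 12 and 2 collide");
```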
Step 2: The Recursive Engine - dateformat
This is where the
magic happens. We need to combine multiple datefields.
template <typename T1, typename T2 = void, typename T3 = void>
struct dateformat
{
static const unsigned pow10 = 100 * dateformat<T2, T3>::pow10;
static const unsigned value = pow10 * T1::type + dateformat<T2, T3>::value;
};
This is our recursive "function". Look closely at the value calculation. To figure out the value for dateformat<T1, T2, T3>, the compiler realizes it first needs to figure out the value for dateformat<T2, T3>. This is a recursive call! Because the third parameter defaults to void, dateformat<T2, T3> is shorthand for dateformat<T2, T3, void>: the compiler peels off the first datefield, shifts the rest of the list left, and re-runs the process on it.
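The way default arguments pad the shortened list can be checked with std::is_same from <type_traits>. This is a minimal sketch: it only forward-declares dateformat with the same parameter list as above, and the field types first, second, and third are hypothetical placeholders.

```cpp
#include <type_traits>

// Same parameter list and defaults as the article's dateformat.
// A declaration is enough here: naming a specialization does not instantiate it.
template <typename T1, typename T2 = void, typename T3 = void>
struct dateformat;

// Hypothetical placeholder field types for illustration.
struct first {};
struct second {};
struct third {};

// Dropping the first field shifts the rest left; void fills the open slot.
static_assert(std::is_same<dateformat<second, third>,
                           dateformat<second, third, void>>::value,
              "two fields are padded with one void");

// One more step of the recursion leaves a single field and two voids.
static_assert(std::is_same<dateformat<third>,
                           dateformat<third, void, void>>::value,
              "one field is padded with two voids");

// After the last field is gone, only voids remain.
static_assert(std::is_same<dateformat<void, void>,
                           dateformat<void, void, void>>::value,
              "an empty list is all void");
```

The last assertion shows the type the recursion finally reaches once every field has been peeled off.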
But what happens when
we run out of fields?
Step 3: The Stop Sign - The Base Case
A recursive function
that never stops is an infinite loop. In TMP, this causes a compilation error.
We need a "base case" to tell the compiler when to stop.
template <>
struct dateformat<void, void, void>
{
static const unsigned value = 0;
static const unsigned pow10 = 1;
};
This is a template specialization. It's a specific rule that says: "If you ever see dateformat with no fields (void, void, void), don't use the main recursive template. Use this one instead." This specialization provides simple, fixed values (value = 0 and pow10 = 1) and stops the recursion.
Tracing the Compiler's "Thoughts"
Let's follow the
compiler as it calculates YYYYMM.
- You ask for: dateformat<datefield<Y,4>, datefield<M>>::value.
- Compiler says: "Okay, to get that value, I need to instantiate dateformat<datefield<Y,4>, datefield<M>>. The formula requires me to first find the value from dateformat<datefield<M>>."
- Compiler says: "Now I need to instantiate dateformat<datefield<M>>. The formula requires me to first find the value from dateformat<void, void, void>."
- Compiler says: "Aha! I have a special rule for dateformat<void, void, void>. Its value is 0 and its pow10 is 1. The recursion stops here."
Now the compiler can work its way back up, calculating the final values:
- Finishing dateformat<datefield<M>>:
  - pow10 = 100 * 1 (from base case) = 100
  - value = 100 * datefield<M>::type + 0 (from base case) = 100 * 12 + 0 = 1200
- Finishing dateformat<datefield<Y,4>, datefield<M>>:
  - pow10 = 100 * 100 (from dateformat<datefield<M>>) = 10000
  - value = 10000 * datefield<Y,4>::type + 1200 (from dateformat<datefield<M>>) = 10000 * 4 + 1200 = 41200
And it's done! The compiler determines that YYYYMM is the constant 41200. When you run your program, this number is already computed and stored in the executable, so no work is left for run time.
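The trace can be double-checked by asking the compiler to assert each intermediate result. The sketch below repeats the article's templates unchanged. The alias names month_only and year_month are hypothetical, introduced only to keep the checks readable.

```cpp
enum { Y, M, D };

template <unsigned F, unsigned W = 2>
struct datefield
{
    static const unsigned type = F * 10 + (W % 10);
};

template <typename T1, typename T2 = void, typename T3 = void>
struct dateformat
{
    static const unsigned pow10 = 100 * dateformat<T2, T3>::pow10;
    static const unsigned value = pow10 * T1::type + dateformat<T2, T3>::value;
};

template <>
struct dateformat<void, void, void>
{
    static const unsigned value = 0;
    static const unsigned pow10 = 1;
};

// Hypothetical aliases for the two instantiations in the trace.
using month_only = dateformat<datefield<M>>;
using year_month = dateformat<datefield<Y, 4>, datefield<M>>;

// Base case, then each step on the way back up.
static_assert(dateformat<void>::pow10 == 1u, "base case pow10");
static_assert(dateformat<void>::value == 0u, "base case value");
static_assert(month_only::pow10 == 100u,     "100 * 1");
static_assert(month_only::value == 1200u,    "100 * 12 + 0");
static_assert(year_month::pow10 == 10000u,   "100 * 100");
static_assert(year_month::value == 41200u,   "10000 * 4 + 1200");

// The two three-field formats follow the same pattern with one more level.
static_assert(dateformat<datefield<Y, 4>, datefield<M>, datefield<D>>::value
                  == 4122200u, "YYYYMMDD");
static_assert(dateformat<datefield<D>, datefield<M>, datefield<Y>>::value
                  == 22120200u, "DDMMYY");
```

Each field ends up occupying two decimal digits, and the lowest two digits are always 00 because even a single-field format is multiplied by 100.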
Pros and Cons
Pros:
- Zero Runtime Overhead: All calculations are performed by the compiler. The resulting values (YYYYMMDD, DDMMYY, etc.) are hard-coded into the executable as if you had typed the numbers yourself. This is extremely efficient.
- Type Safety and Expressiveness: The format is defined declaratively (e.g., dateformat<datefield<Y,4>, datefield<M>>). This is more expressive and less prone to "magic number" errors than manually calculating the value and writing #define YYYYMM 41200. The compiler validates the structure: an argument that has no type member, such as a plain int, fails to compile.
- Extensibility: It's easy to define new formats without changing the core logic.
Cons:
- Complexity: The code is difficult to read and understand for
developers not familiar with template metaprogramming. The recursive
nature and separation of value and pow10 can
be confusing.
- Compiler-Intensive: For very deep or complex template recursions, compile
times can increase significantly.
- Debugging Challenges: Debugging TMP code is notoriously difficult. Errors
are reported as complex template instantiation failures, which can be
cryptic and hard to trace back to the logical error.
- Limited Functionality: This approach is suitable for generating constants.
It cannot be used for runtime parsing or formatting of date strings, which
would require a completely different, traditional runtime
implementation.
A Bit of History
In the early days of C++, TMP was a
revelation. It was the primary method for performing complex compile-time
logic. It was the engine behind foundational libraries like Boost and was used
for creating highly efficient code (e.g., expression templates that eliminate
temporary objects in math libraries) and for type introspection (type traits).
The example code is a perfect illustration of this classic, powerful, but
syntactically complex style.
P.S. I fed the code to a few AI tools. Funny enough, they gave different outputs. The first three match what the program actually prints; the last two do not:
- by ChatGPT
dateformat<Y, 4>=400, dateformat<Y>=200, dateformat<M>=1200, dateformat<D>=2200
YYYYMMDD=4122200, DDMMYY=22120200, YYYYMM=41200
- by DeepSeek
dateformat<Y, 4>=400, dateformat<Y>=200, dateformat<M>=1200, dateformat<D>=2200
YYYYMMDD=4122200, DDMMYY=22120200, YYYYMM=41200
- by GPT-5-Codex (Preview)
dateformat<Y, 4>=400, dateformat<Y>=200, dateformat<M>=1200, dateformat<D>=2200
YYYYMMDD=4122200, DDMMYY=22120200, YYYYMM=41200
- by Claude Sonnet 4.5 (wrong):
Individual datefields:
datefield<Y,4>::type = Y*10 + 4 = 0*10 + 4 = 4
datefield<Y,2>::type = Y*10 + 2 = 0*10 + 2 = 2 (default W=2)
datefield<M,2>::type = M*10 + 2 = 1*10 + 2 = 12
datefield<D,2>::type = D*10 + 2 = 2*10 + 2 = 22
For YYYYMMDD = dateformat<datefield<Y,4>, datefield<M>, datefield<D>>::value:
Works through: 4 + 12*100 + 22*10000 = 4 + 1200 + 220000 = 221204
For DDMMYY = dateformat<datefield<D>, datefield<M>, datefield<Y>>::value:
Works through: 22 + 12*100 + 2*10000 = 22 + 1200 + 20000 = 21222
For YYYYMM = dateformat<datefield<Y,4>, datefield<M>>::value:
Works through: 4 + 12*100 = 4 + 1200 = 1204
Output:
dateformat<Y, 4>=4, dateformat<Y>=2, dateformat<M>=12, dateformat<D>=22
YYYYMMDD=221204, DDMMYY=21222, YYYYMM=1204
- by Gemini 2.5 Pro (wrong)
dateformat<Y, 4>=4, dateformat<Y>=2, dateformat<M>=12, dateformat<D>=22
YYYYMMDD=41222, DDMMYY=221202, YYYYMM=412