# The Value of Canonical Code

## The Value of Canonical Code

As part of our internal bug reporting infrastructure, we had a piece of code to draw a bar chart which in essence was

``DrawRect(x, x+nBarSize*n/nMax, yMin, yMax);``

to draw a bar for value `n` relative to some maximium value `nMax`. I omitted some parameters which are irrelevant for this post. We also have a datatype for rectangles that is passed instead of passing the coordinates one by one, but that is not today's topic.

The code worked most of the time, but sometimes `nMax` was 0 and we got a division by zero exception.

``DrawRect(x, x+nBarSize*n/std::max(1,nMax), yMin, yMax);``

In the review, we had a discussion how this should be written best. If `nMax` is 0, we do not want to draw any bar, so we may as well not call `DrawRect` at all:

``````if(0<nMax) {
DrawRect(x, x+nBarSize*n/nMax, yMin, yMax);
}``````

But `nMax` is actually always greater or equal to `n`, and if `n` is 0, we are also not drawing anything. So we can also write

``````if(0<n) {
DrawRect(x, x+nBarSize*n/nMax, yMin, yMax);
}``````

Which should we pick? In terms of performance, they are indistinguishable, because the case of `n` being 0 is pretty rare and drawing an empty rectangle is not a big penalty. In terms of code complexity, they are all comparable as well. The last two both need one branch more than the original code. The first one needs a `std::max`, which is a branch and in this case a constant, but it is 0, so again, no big difference.

I would still argue that in a given codebase, it is still a good idea to have agreement on which code to pick. At think-cell, we call this the canonical solution. If you do not have agreement and everyone does what they please, the reader of the code (and we all know we write code primarily for human readers and only secondarily for the compiler) may ask herself if there is a deeper reason why the first variant was picked over the second or vice versa:

• The first variant allows `n` to be greater than `nMax`. Is it ever and that is why it is not written as `0<n`?

• The second variant does not draw anything for negative `n`. Does that mean that negative `n` occur?

My point here is that different variants may raise different questions, and only if there is a canonical variant, found through some agreed rules, the reader will be more inclined to think that none of these complexities play a role here: `n` is always positive and `nMax` is greater or equal to `n`, as the name suggests.

At think-cell, if there is no other factor, we prefer simpler code over more complex one, which is probably something we can all agree on. But in this given example, the complexities of the three variants are quite similar, in particular of the last two. So we need another tie breaker. At think-cell, we pick "less work done", which favors the last variant. We do not do this to make our program faster. Only profiling can tell if it really does. We only do it to have a sensible tie breaker that lets us agree on what is canonical.

— by Arno Schödl