Saturday, May 21, 2011

Dumbing it down, intelligently

There are problems that seems to be solvable only by throwing more and more code at it -- we call it the brute force method. It works in most situations, but it rarely produces elegant code that survives bit rotting. The brute force method is actually surprisingly common -- how many times haven't developers solved bugs by adding yet another if-statement?

Luckily, there is another way of solving such problems that result in simple and (potentially) elegant code -- here, I call it the dumb-it-down method. The dumb-it-down method achieves simplicity by solving a similar, more general, problem.

While the brute force method result in masses of code that covers every possible corner case of a very specific problem, the dumb-it-down method results in much less, more general, and simpler code. Let's take an example to clarify what I mean.

Let's assume there is a function has_python() that checks if python is installed on a machine. This function is used for making sure that a certain Python script can be executed. How is such function implemented? It needs to check permissions, check paths, check the python version, etc. There is a lot of intelligence needed to be implemented to make sure that the script can be executed, right? This is the brute force method.

Ok, now let's rewrite the problem slightly. Remember that has_python()'s purpose in life is to make sure that a certain python script can be executed. So, we can just as well write a function that executes the python script and return whether or not it succeeded, right? As it turns out, that function is called exec and is built into the standard C library. No need to write a single line of code!

In the above example we rewrote the problem slightly but kept the intention: execute the script only if it can be executed. This is the pattern; this is the idea behind the dumb-it-down method -- look at what the code tries to do on a more general level and find the a simple (or 'dumb' if you want) way of implementing that.

I think the dump-it-down method is an umbrella term covering many design strategies which aim are to produce simple long-living code. I've previously discussed this here.

It seems like software is in a never-ending spiral of complexity: software needs to grow more complex to manage other complex software. Why is an IDE larger than the operating systems from last decade? Are today's IDEs solving more complex problems than the operating systems did? How can the plug-in APIs of said editor be more complex than the C standard library? Aren't the APIs supposed to make it easy to implement plugins?

I've mentioned it before in this blog, but it needs to be repeated. We software developers, need to start looking at our work (and the result of out work) differently. Our skills are not measured by the complexity of the programs we write, but the simplicity of them (assuming equal functionality).

Saturday, May 14, 2011

Forgotten treasures I: Look-up tables

Think fast: how would you convert a bool to a string, "TRUE" or "FALSE", in C? Most of us probably answers "with an if-statement", and when asked to show some code doing it end up with something like:
const char* theString;
if (theBool)theString = "TRUE";
elsetheString = "FALSE";
or some similar code using the conditional operator as follows:
const char* theString = (theBool ? "TRUE" : "FALSE");

But there is at least one more way of doing it, which in some cases gives more readable code. I'm speaking of look-up tables. The code above can be expressed by look-up tables like this:
static const char* options[] = { "FALSE", "TRUE" };
const char* theString = options[theBool];
which is much less code compare to the above if-then-else version. It's slightly more code, though, compared to the conditional operator-version. However, I personally find code using the conditional operator hard to read. I know perfectly well how the conditional operator works, but I still need a few more brain cycles to read such code.

On the other hand, I have no problems at all reading code using look-up tables since code like options[theBool] looks very similar to a normal function call. Of course, there are down-sides to using look-up tables; the most important one is that the result is undefined if you index outside the range of the table.

Another way of using arrays like this is when you need to decode a tree structure. Let's say we have three ints which represents a path in a tree.
   Node root = { "Root", layer1Nodes[node1], 0 };
   Node layer1Nodes[] = { { "L1A", layer1ANodes[node1]  } };

During the last year I've gone from primarily developing Java to write C++, C, and assembler. I have to admit, there are a few idioms that I have rediscovered in this transition to C and assemble. Look-up tables are on such treasure.

Of course, there is now real reason why look-up tables can't be used in, e.g., Java, but it seems like the coding traditions in Java discourage such code. Other languages encourage such  code; for instance lists and dicts are very common in Python as a tool for simplifying expressions. It's a bit funny what different communities value in their code.