Monday, June 6, 2011

Efficiency

It's been often repeated that "premature optimization is the root of all evil." However, an often missed aspect of this issue is that it is surprisingly common that one can have your cake and eat it too. The most straightforward algorithm to solve a problem is often not the most efficient. However, since it takes additional time to both write and debug the more efficient algorithm, it is often the case that the benefit is not worth the cost. The admonishment therefore is to avoid worrying about optimizing code until actual performance measurements indicate that the code in question is a noticeable amount of the processing time.

However, there is a way to keep your code simple, easy to write, and yet take advantage of faster algorithms. All that is necessary is for your code to call previously written and debugged code which performs the necessary function. If doing this, your code should be simpler because it doesn't have to implement any algorithm whatsoever. As a result, doing so allows both the simplicity and speed.

Let's consider the example of a word count program. This program reads in text from the output and counts the number of occurrences of the words in the text. Here we'll worry about the problem of keeping track of the count of all the words.

Firstly, consider the necessary function written in C. The simplest way to implement the necessary behavior would seem to be as a linked list structure. The following code implements such a function.

void increment_word_count( Node * node, const char * word)
{
// If the current is the correct word, increment the count
if( strcmp(node->word, word) == 0)
{
node->count++;
}
// if there is a next node, keep looking
else if( node->next )
{
increment_word_count(node->next, word);
}
// if we reach the end of the linked list without finding the word
// create a new node
else
{
node->next = malloc(sizeof(Node))
node->next->word = word;
node->next->count = 1;
}
}

This algorithm has O(n) time for each word. It will also run into trouble with its recursive function call and will quickly run out of stack space.

On the other hand, consider how a C++ programmer would write this function. C++ contains a richer standard library then C and thus has a considerable portion of already written code. This particular case works well with the map template.

void increment_word_count( std::map<std::string, int> & word_count, std::string word)
{
word_count[word] += 1;
}

Note that this code is much simpler then the C code. However, it implements a more complicated algorithm. Underneath, the STL using a tree structure to ensure that the operation being performed is done in O(log n) time. Thus by using already written code we can get both speed and simplicity.

Don't start by attempting to write the simplest algorithm that will get the job done. Start seeing whether you can avoid writing any algorithm by locating an already written efficient version of the algorithm.

No comments:

Post a Comment