## Friday, June 10, 2011

### Looping the Right Way

Many programmers write their loops by starting with a loop over what they have and then producing code to determine what needs to happen for each iteration. However, the result is often code which is hard to read. The underlying problem is that the algorithm they are describing is often naturally implemented by looping over something else. However, the code is awkward as a result of having to adjust for being implemented in a straightforward iteration.

Consider an example of a run length encoder:

`last = Nonecount = 0for letter in input_text:    if letter == last:        count += 1    elif count != 0:        packed = struct.pack(">Bc", count, letter)        output.write(packed)        count = 0    if count == 255:        packed = struct.pack(">Bc", count, letter)        output.write(packed)        count = 0    last = letterif count != 0:    packed = struct.pack(">Bc", count, last)    output.write(packed)`

Following the logic of this code is somewhat difficult. It is also repeats important elements in multiple places. The problem is that the code considers the input letter by letter. However, the run length encoding algorithm really considers the input in blocks of the same character. As a result, the looping logic has become intertwined with the output logic which results in them being harder to read. However, we can easily seperate the two of them.

`blocks = []last = Nonecount = 0for letter in input_text:    if letter == last:        count += 1    elif count != 0:        blocks.append( (letter, count) )        count = 0    if count == 255:        blocks.append( (letter, count) )        count = 0    last = letterif count != 0:    blocks.append( (letter, count) )for letter, count in blocks:    packed = struct.pack(">Bc", count, letter)    output.write(packed)`

Now we've clearly seperated the output and looping logic. The problem with this technique is that it creates a temporary list of blocks. If the data in question is too large this is going to be a problem. However, Python solves this problem with a generator feature. Generator act like lists except that they generate the data as its asked for instead of at the beginning. This also means that you can only iterator once through a generator. Here is an example of the code rewritten to take advantage of that.

`def blocks():    last = None    count = 0    for letter in input_text:        if letter == last:            count += 1        elif count != 0:            yield letter, count            count = 0        if count == 255:            yield letter, count            count = 0        last = letter    if count != 0:        yield letter, countfor letter, count in blocks():    packed = struct.pack(">Bc", count, letter)    output.write(packed)`

However, a further consideration of the blocks function reveals that it does two things. Firstly, it combines repeats of letters. However, it also restricts the size of each block to 255. We can split these into two functions:

`def whole_blocks():    last = None    count = 0    for letter in input_text:        if letter == last:            count += 1        elif count != 0:            yield letter, count            count = 0        last = letter    if count != 0:        yield letter, countdef blocks():    for letter, count in whole_blocks():         while count > 255:              yield letter, 255              count -= 255         yield letter, count`

Now we can actually see the logic more clearly. However, another advantage arises because such small pieces can often be reusable. In fact, whole_blocks can be implemented by using the itertools.groupby method. It divides an iterator up into group of elements which are equal. If we recast the solution in terms of that we get:

`def blocks(count, maxsize = 255):     while count > maxsize:          yield maxsize          count -= maxsize     yield countfor letter, items in itertools.groupby(input_text):    for block_size in blocks(len(list(items))):        packed = struct.pack(">Bc", count, letter)        output.write(packed)`

This code is easier to read and much less error prone then the original code.

The take home lessons:

1. Don't iterate over what you have, decide what you want to iterate over for your algorithm and then figure out how to iterate over that.

2. Divide complex tasks into a number of serials tasks that can be applied to a collection

3. If you find yourself adding flags other variables which change from iteration to iteration, you should look for an alternative

## Monday, June 6, 2011

### Efficiency

It's been often repeated that "premature optimization is the root of all evil." However, an often missed aspect of this issue is that it is surprisingly common that one can have your cake and eat it too. The most straightforward algorithm to solve a problem is often not the most efficient. However, since it takes additional time to both write and debug the more efficient algorithm, it is often the case that the benefit is not worth the cost. The admonishment therefore is to avoid worrying about optimizing code until actual performance measurements indicate that the code in question is a noticeable amount of the processing time.

However, there is a way to keep your code simple, easy to write, and yet take advantage of faster algorithms. All that is necessary is for your code to call previously written and debugged code which performs the necessary function. If doing this, your code should be simpler because it doesn't have to implement any algorithm whatsoever. As a result, doing so allows both the simplicity and speed.

Let's consider the example of a word count program. This program reads in text from the output and counts the number of occurrences of the words in the text. Here we'll worry about the problem of keeping track of the count of all the words.

Firstly, consider the necessary function written in C. The simplest way to implement the necessary behavior would seem to be as a linked list structure. The following code implements such a function.

```void increment_word_count( Node * node, const char * word)
{
// If the current is the correct word, increment the count
if( strcmp(node->word, word) == 0)
{
node->count++;
}
// if there is a next node, keep looking
else if( node->next )
{
increment_word_count(node->next, word);
}
// if we reach the end of the linked list without finding the word
// create a new node
else
{
node->next = malloc(sizeof(Node))
node->next->word = word;
node->next->count = 1;
}
}
```

This algorithm has O(n) time for each word. It will also run into trouble with its recursive function call and will quickly run out of stack space.

On the other hand, consider how a C++ programmer would write this function. C++ contains a richer standard library then C and thus has a considerable portion of already written code. This particular case works well with the map template.

```void increment_word_count( std::map<std::string, int> & word_count, std::string word)
{
word_count[word] += 1;
}
```

Note that this code is much simpler then the C code. However, it implements a more complicated algorithm. Underneath, the STL using a tree structure to ensure that the operation being performed is done in O(log n) time. Thus by using already written code we can get both speed and simplicity.

Don't start by attempting to write the simplest algorithm that will get the job done. Start seeing whether you can avoid writing any algorithm by locating an already written efficient version of the algorithm.