Monday, May 16, 2011

Should Exceptions be Exceptional?

It is frequently stated that exceptions should be exceptional. The idea behind this statement is that one should not throw (or catch) exceptions for situations that arise during the normal course of program operation. Instead, one should use standard conditional logic for those cases and reserve exceptions for situations that are truly exceptional.

For example, consider a simple ParseInt function. This function converts a string such as "42" into a number such as 42. However, the string can contain non-numeric characters or represent a number that is too large, causing the function to fail. There are three strategies to deal with this:
  1. Throw an exception if the function fails.
  2. Provide a CanParseInt function which checks whether the string can be parsed.
  3. Provide an additional boolean output from ParseInt indicating whether it has succeeded.

Advocates of "exceptions should be exceptional" prefer options 2 or 3 to option 1. They argue that you shouldn't throw an exception when it's sensible to use typical conditional logic, i.e. if blocks. After all, attempting to parse a string which doesn't actually contain an integer is a fairly regular occurrence.

The CanX, DoX method of handling this problem is a bad idea for a number of reasons. The idea behind this method is that you test whether an action will succeed before actually attempting it. The pattern looks something like:

if( CanParseInt(string) )
    value = ParseInt(string);
else
    return false; // indicate failure

The first problem is that the code specifies what it intends to do twice. Both CanParseInt and ParseInt indicate the same information. Having them both there just increases the clutter in the code. Additionally, the two functions will duplicate effort. CanParseInt has to spend all the effort necessary to ensure that all the characters in string are valid digits. ParseInt is then going to have to do all of that work over again.
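To make the duplicated effort concrete, here is a minimal Python sketch. The names can_parse_int, parse_int and read_value are hypothetical stand-ins for the CanParseInt/ParseInt pair above, not functions from any real library, and the digit handling is deliberately simplistic.

```python
def can_parse_int(text):
    """Hypothetical CanParseInt: scans every character to validate it."""
    body = text[1:] if text.startswith(("+", "-")) else text
    return body != "" and all("0" <= ch <= "9" for ch in body)


def parse_int(text):
    """Hypothetical ParseInt: has to validate again while converting."""
    sign = -1 if text.startswith("-") else 1
    body = text[1:] if text.startswith(("+", "-")) else text
    if body == "":
        raise ValueError("no digits in %r" % text)
    value = 0
    for ch in body:
        if not "0" <= ch <= "9":  # the same check can_parse_int already made
            raise ValueError("bad character %r in %r" % (ch, text))
        value = value * 10 + (ord(ch) - ord("0"))
    return sign * value


def read_value(text):
    """CanX/DoX caller: the string is walked twice on the success path."""
    if can_parse_int(text):      # first pass over the string
        return parse_int(text)   # second pass repeats the validation
    return None                  # indicate failure
```

Note that parse_int cannot simply skip its own checks, because nothing forces callers to go through can_parse_int first.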

Another scenario has an additional problem.

if( CanOpenFile(filename) )
    File file = OpenFile(filename);
    // play with file

The problem is that anything can happen between CanOpenFile and OpenFile. The file could be deleted, opened by another program, etc. The fact that CanOpenFile returns true does not guarantee that OpenFile will succeed.
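The same gap can be sketched in Python. Here os.path.exists plays the role of CanOpenFile and open plays the role of OpenFile; the function names read_checked and read_direct are made up for illustration.

```python
import os


def read_checked(path):
    """CanX/DoX style: the check and the open are two separate steps."""
    if os.path.exists(path):       # plays the role of CanOpenFile
        # The file can be deleted or locked right here, before open() runs,
        # so the call below may still raise despite the check.
        with open(path) as f:      # plays the role of OpenFile
            return f.read()
    return None


def read_direct(path):
    """Just attempt the open and handle failure where it actually happens."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:                # missing file, permission problem, etc.
        return None
```

The second version has no window between the check and the action, because there is no separate check.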

Using the CanX/DoX method of handling errors is broken. It complicates the code, wastes processor time, and is sometimes just plain wrong. Clearly, that is not a good strategy. This leaves the two remaining methods. Either we throw an exception or somehow indicate via return value or similar that the function failed. The two are somewhat similar in that you try to do something, and then get a report back if it fails.

The question is, which is the better way of indicating failure? Let's look at some code.

Firstly, exception based:

try
    this->value = ParseInt(string);
catch( NumberParseError )
    throw new UserInputError("Expected something else");

Secondly, return value based:

validNumber = ParseInt(string, this->value);
if( !validNumber )
    return false;

The return value based method makes use of an out parameter, one that the function modifies. This is undesirable because it's not obvious at the call site that this is what is happening. However, that particular point could be solved by using a language which allows multiple return values. Also, it's not a problem if there is a suitable sentinel value (such as null).

The fact that exceptions are not being used prevents the return value based method from including any information about why it failed. This can be handled by making use of another technique, such as global variables or similar, to store additional error information.
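As a rough illustration of the two reporting styles, here is a Python sketch. NumberParseError mirrors the name used in the pseudocode above; parse_int_exc and parse_int_flag are hypothetical names, and both lean on Python's built-in int() for the actual conversion. The flag version uses multiple return values in place of an out parameter.

```python
class NumberParseError(Exception):
    """Stand-in for the NumberParseError in the pseudocode above."""


def parse_int_exc(text):
    """Exception style: the error object carries the reason for the failure."""
    try:
        return int(text)
    except ValueError as err:
        raise NumberParseError("cannot parse %r as an integer" % text) from err


def parse_int_flag(text):
    """Return value style: (succeeded, value) instead of an out parameter."""
    try:
        return True, int(text)
    except ValueError:
        return False, None   # the caller learns that it failed, but not why
```

A caller of parse_int_flag gets only a yes/no answer, while a caller of parse_int_exc can read the message (and the chained ValueError) to find out what went wrong.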

I think there is a slight readability win for the exception based method here, but the two techniques come off fairly similarly. However, differences arise if we consider what happens when we don't explicitly handle the case of an invalid string.

this->value = ParseInt(string);

The exception based version will raise an uncaught exception, which will be dealt with by whatever uncaught exception strategy I am using. In my case that means the exception will probably take down the program and then send me an e-mail with a stack trace and any other information.

ParseInt(string, this->value);

The return value version will simply try to continue as if nothing has happened. this->value will probably be left with whatever value was in it before. Nobody will notice that parsing did not happen as it should and who knows what will result from that.
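The difference can be sketched in Python as follows. Options, parse_int_into and the two load functions are invented for this illustration; parse_int_into imitates the out-parameter version by writing into an object and returning a flag.

```python
class Options:
    def __init__(self):
        self.value = 10  # whatever value happened to be stored beforehand


def parse_int_into(text, options):
    """Return value style: sets options.value on success, returns a flag."""
    try:
        options.value = int(text)
        return True
    except ValueError:
        return False


def load_ignoring_flag(text):
    options = Options()
    parse_int_into(text, options)  # the flag is discarded, as in the text
    return options.value           # stale value if parsing failed


def load_with_exception(text):
    options = Options()
    options.value = int(text)      # ValueError propagates if parsing fails
    return options.value
```

Calling load_ignoring_flag("ten") quietly hands back the old value, whereas load_with_exception("ten") raises ValueError and cannot be overlooked.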

I think it is much better to die in a flaming crash when something like this happens rather than to try to continue on. If one continues on using invalid options, there is no telling what the result will be.

One might be inclined to argue that one should always check for this error. However, in some cases it will be impossible for the error to occur. For example, the numbers could be derived from a capture group in a regular expression that only matches digits. In that case trying to handle the error would be silly. However, if a bug in my code leads to non-numeric data being pushed in there, I want to know.
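A small Python sketch of that situation, assuming a made-up version string format such as "v3.14" (the pattern and the function name parse_version are purely illustrative):

```python
import re

# Hypothetical input format: "v<major>.<minor>", digits only.
VERSION = re.compile(r"^v(\d+)\.(\d+)$")


def parse_version(text):
    match = VERSION.match(text)
    if match is None:
        return None
    # Both groups matched \d+, so int() is not expected to fail here.
    # If a later change to the pattern lets other characters through,
    # the resulting ValueError surfaces the bug instead of hiding it.
    return int(match.group(1)), int(match.group(2))
```

There is no error handling around the int() calls because there is nothing sensible to do there; an exception at that point would mean the code itself is wrong.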

The other item which advocates of "exceptions should be exceptional" will bring up is that exceptions are slow. This isn't quite true. The implementations of exceptions in many popular languages are slow. They are designed to be used infrequently, and thus there is not a great deal of concern for making them fast. On the other hand, some languages such as Python have an efficient implementation of exceptions. In other words, exceptions are slow because people use them infrequently, not the other way around. There is nothing inherently problematic about using exceptions frequently.

However, you have to code within the idioms of a particular language. You shouldn't write Python as if it were Java, and you shouldn't write Java as if it were Python. In languages where exceptions are discouraged during normal execution, you shouldn't throw exceptions. Doing so makes your code unidiomatic for your language, and thus slow and possibly confusing.
