Sunday, March 25, 2007

On Slack: One Reason Why Software Maintenance is Hard

Software maintenance tends to be hard in part just because software tends to be hard. Most of our effort is spent on quite complicated software, running to hundreds or thousands of pages in printed form. And software tends to be fragile: for example, any small typographical error anywhere in all those pages tends to invalidate the program, or just to subtly corrupt it, so it behaves incorrectly occasionally.

However, I wish here to remark about software maintenance in particular, the problem of taking old software and making it better, or adapting it to a new purpose. And in particular, I want to talk about adapting old software to a new purpose, as opposed to finding and fixing lingering mistakes from when the software was built for its original purpose.

I have spent a lot of my mental effort over the past few years on "incremental algorithms." Such algorithms efficiently maintain a solution for a problem as the problem changes. For example, an ordinary sorting algorithm, when given input [101,8,4,11,14,88,89,16,12], might return [4,8,11,12,14,16,88,89,101]. An incremental algorithm allows an operation like "now insert 55 in that sequence," and efficiently returns [4,8,11,12,14,16,55,88,89,101] in that case.

Of course, even without any special incremental algorithm, you can just use an ordinary sort algorithm to find that answer: insert 5 into the input sequence, then restart the sort algorithm from scratch to build a new output sequence. However, for enormous input data sets, an incremental algorithm can be enormously more efficient. For exmaple, if you maintain a telephone directory covering much of the USA, containing perhaps a hundred million phone numbers, then re-running an ordinary (non-incremental) sort operation from scratch will tend to require at least checking the order of a hundred million pairs of numbers. (More or less, the algorithm must at least re-check the order as it runs.) An incremental algorithm needs less than one hundred comparisons to insert a new element in an existing ordered set of one hundred million elements.

One thing I have noticed, which I'm sure others have noticed and even perhaps formally analyzed, is that incremental algorithms need to have some slack in their representation of the problem. Consider the incremental sorting algorithm again, but this time not for telephone numbers but for library books in card catalog order. Any library will have some gaps on the shelves, quite intentionally, for libraries face the problem of inserting new books in the order. When a new book is delivered, a librarian wants to walk to the appropriate location, slide aside a few dozen books, and insert the new book. A library could economize on space by eliminating all the gaps. However, then inserting a new book in order could require quite a lot of work, sliding about half the library's books aside.

Similar things happen in filing systems inside your computer. On a computer's hard disk, where new files may be created, and old files may be expanded or deleted, usually quite a lot of space (20% or more) is reserved for such slack. This is not just a property of the way computers read files, for such slack does not usually exist on something like a CD-ROM, which will never be changed. It is instead a property of "who knows what the future will bring?"

It is an interesting challenge finding the appropriate form of slack for a given set of anticipated changes. First, though, note that the appropriate slack really does depend on the anticipated changes. To go back to the library example, the gaps in the shelf aren't just common sense, agnostic about the kinds of anticipated changes. Instead, they are more nearly slack specifically designed to cope with the anticipated problem of inserting a new book. Consider an unanticipated challenge, when Tom Gander seizes power, decrees that T and G are now the first two letters of the alphabet, and starts exterminating librarians who don't comply. At that point, the gaps on the shelves don't help, if you are going to comply efficiently, you need is wheels on the bookcases. (Or perhaps you should think outside the box and maintain an arsenal or cross-border escape tunnel in the library basement.)

I've been thinking about the problem especially in the last few weeks because I finally succeeded in finding a promising form of slack for a problem I've been working on, after thinking about for (not full time, but off and on) only a few years. It's a slightly tricky problem, related to the one solved here (where given any directed graph, one must return a minimal directed graph which is congruent to it). But compared to the problem solved by programmers (where given any problem specification, one must return a piece of software which solves the problem) it is hardly tricky at all. Thus, since it was hard for me to find a suitable form of slack to cope with one anticipated set of changes in my slightly-tricky problem of directed graphs, it now makes a lot of sense to me that it can be really hard to find suitable forms of slack to cope with anticipated small changes to programming problems so that it's easy to make a small change the program to handle the changed problem. (That's the main point of this post, any programmers reading this can stop now.)

Besides that, for any non-programmers reading this post, I'll illustrate an extra point, using the magnificent foundation of examples that I have constructed: unanticipated changes in software can be surprisingly hard. (You programmers already knew this, why didn't you go home?) If you don't know anything about the inside of a library, it might not be clear that changing T and G to the first two letters of the alphabet is harder than shelving a dozen boxes of new books. After all the T-and-G change is so simple that it can be described in a single sentence, while the dozen-boxes problem requires many pages to describe. (I.e., the receipt listing the titles of all the new books is many pages long.) However, in an ordinary library, the T-and-G change would probably be a lot more work. So if you are ever tempted to speculate how easily a software system can easily be changed to accommodate a simple change, remember that the answer is "it depends." By that I don't only mean that it depend on whether the programmers are lazy morons (though it does:-), but that it depends on design choices inside the software system, and those design choices depend in part on whether the change was anticipated around the time of the original design. (This is related to this; not quite the same thing, but said enough better than I can that nonetheless it deserves a nod.)

For example, the SBCL project (a free compiler that I set up years ago, and whose maintenance I still formally head) experiences chronic headaches because its design (made over twenty years ago in the CMU CL system that I copied for the original SBCL code) calls for most of its code to be stored at a fixed address in the computer's memory. Twenty years is a long time in the computer world. In the kinds of computers that SBCL usually runs on (general-purpose computers running operating systems like Linux and Microsoft Windows, not special-purpose computers like the ones which control antilock brake systems), it is now pretty abnormal for application software to expect to be stored at a fixed address in memory. Thus, this design now causes various kinds of friction. However, it looks like it would be a lot of work to rework the system to use an approach which is less surprising to modern operating systems. It's not an impossible amount of work, but it's enough that I find it unsurprising that despite all the headaches no one has done it yet.


Post a Comment

<< Home