Tuesday, January 15, 2008

Rule of troubleshooting: first, fix what you KNOW is broken

There's a technique I've picked up from programming: when you have a problem, fix what you know to be broken first. Further, fix the most obvious broken thing first.

I think this started with C programming with cc and gcc. When you compile a program, the compiler dumps out a slew of errors. I learned that the first error listed might be the only true error; the others show up only because the compiler got confused by the first one. Often, when I fixed that first error, the rest of the symptoms went away.
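To make the habit concrete, here is a minimal sketch in Python that picks the first error out of a pile of compiler output so you can deal with that one before reading the rest. The `first_error` helper, the `ERROR_LINE` pattern, and the `SAMPLE` text are all made up for illustration; the pattern assumes the common gcc-style `file:line:column: error: message` layout, and other compilers may format their diagnostics differently.

```python
import re

# Assumed diagnostic shape: "file.c:12:5: error: message" (column optional).
ERROR_LINE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?:\d+:)?\s*(?:fatal )?error:\s*(?P<msg>.*)$"
)


def first_error(compiler_output):
    """Return (file, line, message) for the first error line, or None."""
    for text in compiler_output.splitlines():
        match = ERROR_LINE.match(text)
        if match:
            return match.group("file"), int(match.group("line")), match.group("msg")
    return None


# Made-up sample for illustration, not a captured compiler log.
SAMPLE = """\
demo.c: In function 'main':
demo.c:4:15: error: expected ';' before 'int'
demo.c:6:5: error: 'total' undeclared (first use in this function)
demo.c:9:1: error: expected declaration or statement at end of input
"""

if __name__ == "__main__":
    # Fix whatever this reports, recompile, and only then look at what is left.
    print(first_error(SAMPLE))
```

The point of returning only one error is the workflow: fix it, recompile, and see how many of the later complaints disappear on their own.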

This technique seems to apply to other domains too -- network troubleshooting, in particular. There might be a pile of symptoms and a few known problems, and it may be very difficult to explain all the symptoms based on the known problems. But these are complex systems. That means you can't easily hold the whole thing in your head, so you can't always tell which symptoms come from which problem.

So if you have actual problems that need to be fixed, fix those problems first and see if the symptoms go away.
