February 8, 2009

On Accidents

I’ve lost one of my best jokes recently. I first heard it from the mouth of Jack Dee: “Those drawings of ‘planes landing on water they have on the safety cards … I want to see a photo: if it didn’t fly why should I believe it will float …” Of course the recent “Miracle on the Hudson” proves that it can be done, although we shouldn’t expect it that often: it needs calm water, some luck, and close to perfect flying. The thing I found most remarkable was that, having pulled off an extraordinary landing, the pilot did a final check down the length of the cabin to make sure everyone had got out. That, rather than the feat of flying, justifies the use of terms like “hero”.


Since I took up scuba diving I’ve become interested in how accidents happen. A dose of aviation in my past, combined with the good records which are available (and the death of a friend on Garuda 152), has made me more interested than perhaps I should be in air accidents.

There are three root causes of accidents: people, environment, and equipment. In dangerous environments, whether it’s on snowy roads, under water or in the air, we have both processes and equipment to handle the danger. Where life depends on a piece of equipment, either a process or redundant equipment means a failure shouldn’t be fatal. So, where we recognise risk, accidents have a compound cause rather than an isolated one. (If you think the Air France Concorde crashed only because of a burst tyre, read this report on how, but for a set of other failures, it might have been saved.) My friend died because her plane was flying in limited visibility (environment) and air traffic control muddled up left and right in an instruction to the pilot (human error), but the courts found the crash would have been avoided if the Ground Proximity Warning System on the aircraft had worked as advertised (equipment).

[In the week that this post has been sitting in my drafts folder, investigators have had more to do than usual, and the report has been published for Colin McRae’s fatal crash. It’s been widely reported that his pilot’s licence had lapsed, but logic, rather than my lasting admiration for him, says you can’t extrapolate from that to saying he flew irresponsibly. The report says McRae was flying fast and low and “placed his helicopter in a situation in which there was a greatly reduced margin for error, or opportunity to deal with an unexpected event.” That’s the first part of a “compound cause”, but the investigators could not identify what finally caused the accident to happen, and don’t apportion blame. So nor will I.]

Human error can be a failure to speak up. Think of the Charge of the Light Brigade, or read the account of the engineer who knew that the “O” rings on the Space Shuttle Challenger would fail in the cold and watched the decision to postpone the launch being reversed (scroll down to the bit after “Figure 10” if you don’t want to read it all). Or consider the Tenerife air disaster, the worst ever, which was a combination of environment (fog) and human errors. A KLM 747 (with their most senior captain at the controls) attempted to take off without the proper clearance while a Pan Am 747 was taxiing towards it on the main runway. The crew didn’t seem to feel able to tell the captain to stop. To quote from one of the official reports:

The Pan Am aeroplane responded to the tower’s request that it should report leaving the runway with an “O.K., we’ll report when we’re clear.” On hearing this, the KLM flight engineer asked: “Is he not clear then?” The captain didn’t understand him and he repeated: “Is he not clear that Pan American?” The captain replied with an emphatic “Yes” and, perhaps influenced by his great prestige, making it difficult to imagine an error of this magnitude on the part of such an expert pilot, both the co-pilot and the flight engineer made no further objections. The impact took place about thirteen seconds later.

When disasters are avoided, as they are, there seem to be two themes. First, as a scuba instructor told us on a safety course, “keep thinking about the options”. At the trivial end of the scale, the organizer of the event I was at in Belfast was impressed that, having changed airports, I had a fallback plan for the ferry if that didn’t work. I could hear Ed Harris as Gene Kranz in Apollo 13 saying “Let’s work the problem, people”. Of course Apollo 13 is the other end of the scale. In the movie script at least, flight director Kranz gets quotes like “What do we got on the spacecraft that’s good?” and “I don’t care about what anything was DESIGNED to do, I care about what it CAN do.”

Apparently the captain in the Hudson ditching is also a qualified glider pilot, but the A320 wasn’t designed to be a glider. With only 3000 feet to play with and a glide ratio not much better than 10 feet forward for 1 down, it can stay airborne for only a couple of minutes (and cover about half a dozen miles). The transcript shows 2 minutes 15 seconds from the first call of the bird strike to the controller saying radar contact has been lost. It also shows the pilot was thinking about the river as the only viable option after 40 seconds. Worst case, that would have killed everyone on the plane. The worst case of trying to get to a runway was too awful. The FAA site has an MP3 from the air traffic control tapes (things begin about 7:50 into the file). The calm of the captain has drawn a lot of attention, but that of the controller also deserves a mention. When told of the bird strike he comes back with a heading for the aircraft and then tells La Guardia to hold all departures and what the situation is; he’s also helping with other possible runways, and continuing to handle routine traffic. As yet, the cockpit voice tapes have not been made public, but I’d bet there was both relative calm there AND evidence of the second factor in disaster avoidance: team work.
It comes up again and again, whether it’s Apollo 13 or the Gimli Glider. In the latter case an aircraft ran out of fuel at 41,000 feet and landed on a disused air force runway: the captain was also a glider pilot, and credited the co-pilot with cockpit management of “Everything but the actual flight controls”.

There are lessons for business in this. Good IT people know about dealing with single points of failure, and know that reliability is the result of process more than underlying technology. One article I read talks about what business can learn and refers to something General Electric CEO Jack Welch said: that effective leaders exhibit a particular set of attributes in a crisis: “forthright, calm, fierce boldness”. The survival of the Apollo 13 astronauts was at least partly because the leader on the ground, Gene Kranz, showed those qualities. If I think about what’s happening in the economy at the moment and look for those qualities among political and business leaders, they are disturbingly rare.


