What Six Sigma Has to Say About Reducing Failures

I spent most of my career at General Electric (GE), and I was fortunate to advance at a time when Jack Welch required aspiring managers to earn a Six Sigma certification. This wasn't always popular in IT, since (at the time) GE still taught Six Sigma primarily from a manufacturing perspective. Performing variance analysis on the lengths of bolts in a manufacturing batch didn't always seem relevant when your job was developing software. Yet it's amazing how frequently I still refer to Six Sigma concepts. Jack Welch was right...go figure.

One of the tools we used frequently, even in IT, was the Failure Mode and Effects Analysis (FMEA). If you've used one, you know that the FMEA helps reduce risk during the planning process for a new system -- and is used throughout the system's life to identify new risks. The FMEA takes three inputs for each failure mode you're analyzing: the likelihood, the impact, and the detectability. Most of us are familiar with a risk equation that takes only two parameters: likelihood and impact. The FMEA model is especially appropriate for planning for information security events, because introducing the "detectability" factor forces you to think about logging, instrumentation, event analysis, and the entire detection process. The next time you assess the risk of a system with the traditional model, consider whether your results would be different if you also assessed detectability.
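As a rough sketch of why detectability matters, the following Python snippet contrasts a traditional two-factor risk score with an FMEA-style Risk Priority Number (RPN). The 1-10 rating scales and the example failure modes are illustrative assumptions for this sketch, not prescribed FMEA ratings.

    # Sketch: traditional two-factor risk score vs. an FMEA-style
    # Risk Priority Number (RPN). Scales and values are illustrative only.

    def traditional_risk(likelihood: int, impact: int) -> int:
        """Classic risk score: likelihood x impact (each rated 1-10)."""
        return likelihood * impact

    def fmea_rpn(likelihood: int, impact: int, detectability: int) -> int:
        """FMEA Risk Priority Number: likelihood x impact x detectability.

        Detectability is rated 1-10, where 10 means the failure is nearly
        impossible to detect, so poorly instrumented failures score higher.
        """
        return likelihood * impact * detectability

    # Two hypothetical security failure modes with identical likelihood and
    # impact; only the detectability differs (one is well logged, one is not).
    well_logged = {"likelihood": 4, "impact": 7, "detectability": 2}
    poorly_logged = {"likelihood": 4, "impact": 7, "detectability": 9}

    for name, fm in [("well-logged failure", well_logged),
                     ("poorly-logged failure", poorly_logged)]:
        print(f"{name}: traditional="
              f"{traditional_risk(fm['likelihood'], fm['impact'])}, "
              f"RPN={fmea_rpn(**fm)}")

    # The traditional score is identical for both (28), but the RPN separates
    # them (56 vs. 252), surfacing the gap in detection and instrumentation.

The point of the comparison is not the specific numbers: two risks that look identical under the two-factor model can rank very differently once you account for how likely you are to notice the failure at all.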

More information on FMEAs is available at http://www.isixsigma.com/tools-templates/fmea/quick-guide-failure-mode-and-effects-analysis/, and templates can be found at several sites online.