Six Sigma Quality is a
popular approach to process improvement, particularly among technology
driven companies such as Allied Signal, General Electric, Kodak and
Texas Instruments. Its objective is to reduce output variability
through process improvement, and/or to increase customer specification
limits through design for producibility (DfP), so that these
specification limits lie at more than "six" standard deviations, or
sigmas, from the process mean (I'll explain the quotation marks later).
In this way, defect levels should be below 3.4 "defects per million
opportunities" for a defect, or "dpmo" for short.
Although
originally introduced by Motorola in 1986 as a quality performance
measurement, 6 sigma has evolved into a statistically oriented approach
to process improvement. It is deployed throughout an organization using
an army of champions and experts called "black belts," a title borrowed
from their martial arts counterparts. They command a rank-and-file
made up of teams focusing on the improvement of the organization's
processes. Just search the internet for "six sigma" and you'll come up
with several informative descriptions of its history and current
practice. The Six Sigma Academy, a Motorola spin-off, provides
consulting service to many of the leading practitioners of this
approach. What I want to focus on here, though, is the 6 sigma metric
itself, not the concept or the approach.
I don't like the 6
sigma metric. As you'll see, it fails to pass many of the tests that
I've previously established for "good" metrics and described in Part 1
of Metrics for the Order Fulfillment Process. In particular, it's
neither simple to understand nor, in most applications, an effective
proxy for customer satisfaction. It does not have an optimum value of
zero. And, its definition is ambiguous and therefore easily gamed
because there is no accepted test for what to include as an
"opportunity" for a defect.
What is an "opportunity"?
I've
trained improvement teams, team leaders, and black belts for one of the
aforementioned companies in their 6 sigma metrics module. Once they get
through the distinctions between defects and defectives and between attribute
and variable data, the greatest difficulty that the trainees encounter is
in determining what constitutes an opportunity for a defect.
Obviously, by increasing the number of opportunities (the denominator of
dpmo), you can improve the metric, particularly if you include
opportunities that are not important to customers and consequently are
not routinely checked for conformance, thereby allowing their defects to
go uncounted.
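To see how sensitive dpmo is to the choice of denominator, here is a minimal Python sketch. The function name and all of the counts are hypothetical, chosen only for illustration; none of them come from a real process.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities for a batch of inspected units."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity for a defect")
    return 1_000_000 * defects / total_opportunities


# Hypothetical batch: 50 defects found across 10,000 units,
# with 20 customer-relevant opportunities per unit.
honest = dpmo(defects=50, units=10_000, opportunities_per_unit=20)

# Same batch and same 50 defects, but 80 extra "opportunities" per unit
# are now counted that nobody inspects, so they contribute no defects.
padded = dpmo(defects=50, units=10_000, opportunities_per_unit=100)

print(f"honest: {honest:.0f} dpmo, padded: {padded:.0f} dpmo")
```

Nothing about the process changed between the two calls, yet the reported dpmo falls by a factor of five.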
This weakness can be overcome (but seldom is in
practice) by applying an objective weighting for defect severity in
counting both opportunities and actual defects. For example, critical
defects, ones that make the output unusable by the customer, get a
weighting of one while inconsequential defects get a weighting of zero.
Cosmetic defects or ones that can be corrected or compensated for have
values in between, depending on the relative cost of correction or their
likely impact on the customer's repurchase decision. A similar
approach is taken in Failure Mode and Effect Analysis (FMEA) where
improvement priorities are set based on a combination of frequency of
occurrence, severity and detectability of candidate failure modes. I
understand that the TI flavor of 6 sigma does include this type of logic.
Where should the weightings come from? The customer of the process,
of course (but, more about this in a future installment in this series,
if there's sufficient interest). Current practice usually leaves the
choice of what constitutes an opportunity for a defect as a subjective,
not objective decision. This has proven to be a poor standard for good
metrics.
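One way the severity weighting described above could be sketched in Python follows. This is only an illustration under stated assumptions: the class, the function, the weights and the counts are all hypothetical, and in practice the weights would have to come from the customer of the process.

```python
from dataclasses import dataclass


@dataclass
class DefectType:
    name: str
    weight: float       # 1.0 = critical, 0.0 = inconsequential (hypothetical scale)
    opportunities: int  # how many times this defect could have occurred
    defects: int        # how many times it actually occurred


def weighted_dpmo(defect_types) -> float:
    """dpmo with both opportunities and defects weighted by severity."""
    weighted_opps = sum(t.weight * t.opportunities for t in defect_types)
    weighted_defects = sum(t.weight * t.defects for t in defect_types)
    if weighted_opps <= 0:
        raise ValueError("no opportunities with non-zero weight")
    return 1_000_000 * weighted_defects / weighted_opps


# Hypothetical data for one process.
sample = [
    DefectType("unusable output", weight=1.0, opportunities=10_000, defects=5),
    DefectType("cosmetic blemish", weight=0.25, opportunities=10_000, defects=40),
]
# An opportunity the customer does not care about gets zero weight.
padding = DefectType("never-inspected feature", weight=0.0,
                     opportunities=80_000, defects=0)

before = weighted_dpmo(sample)
after = weighted_dpmo(sample + [padding])
print(f"before padding: {before:.0f}, after padding: {after:.0f}")
```

With zero weight on the unimportant opportunity, adding it to the list leaves the weighted metric unchanged, which is the property the unweighted dpmo lacks.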
Is it really "six" sigma?
Let's return to the
metric itself. Once we've identified all of the appropriate
opportunities for defects and counted the actual number of them that
fail to meet specification, we're ready to calculate the metric. It's
trivial to determine the dpmo value, but what is the corresponding sigma
value? First, you'll have to find a table of values for the "one-sided
tail of a normal distribution." That should be easy, right?
Well,
they're not that easy to find. Most textbooks or statistics tables end
at values of three or four sigma. Why? My guess is that up until
recently there was little need for knowing values above these levels.
Practical applications simply did not exist in our world. There's
probably a profound message for us there, if we look carefully. I've
found such a table though in the 1992 Motorola Publication "Six Sigma
Producibility Analysis and Process Characterization" by Mikel J. Harry
and J. Ronald Lawson. Other more recent 6 sigma sources always seem to
reference this one. Its Appendix C gives a value of 1.248x10^-9 for
6 sigma.
But wait, what happened to the 3.4x10^-6? Forgive my
cynicism, but here comes what looks to me like a little
"sleight-of-hand." We are told that there is a typical 1.5 sigma
long-term drift in most process means. To adjust for it, we need to
subtract out this 1.5 sigma, so that we actually use the table entry at
4.5 sigma to get to the adjusted short-term value: that's 3.451x10^-6.
In other words, if we measure 3451 defects in a billion opportunities,
only one of them was caused by short-term process variability. The
other 3450 were caused by this mysterious long-term drift in the mean,
so we're not going to count them. We'll report that our process is
operating at 6 sigma. Got it? To be honest though, in small print we
will admit to the 1.5 sigma adjustment, whether it's justifiable or not.
To make it easier for us, tables are provided that incorporate this
adjustment, with the obligatory footnote.
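The effect of the adjustment can be checked without a printed table. The sketch below uses Python's standard `math.erfc` to evaluate the one-sided tail of a standard normal distribution; the function name `upper_tail` is an illustrative choice, and the values it prints will not match the published table digit for digit (more on that in the section "Even the tables are wrong").

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


SHIFT = 1.5  # the assumed long-term drift of the mean, in sigma units

unshifted = upper_tail(6.0)        # tail beyond 6 sigma, no adjustment
shifted = upper_tail(6.0 - SHIFT)  # tail actually used once the shift is applied

print(f"tail at 6.0 sigma: {unshifted:.3e}")
print(f"tail at 4.5 sigma: {shifted:.3e}")
print(f"ratio:             {shifted / unshifted:,.0f}")
```

The ratio, a factor of a few thousand, is the size of the allowance that the 1.5 sigma adjustment grants.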
Well, I am aware of
situations where there is a drift in the mean, caused for example by
tool wear or component aging, but I also know of processes in which this
phenomenon simply does not occur. And why forgive this long-term
drift anyway, even when it does exist? Laser machining eliminates tool
wear; compensation circuits can adjust for component aging, and there's a
whole science of adaptive feedback systems that can sense and
compensate for various forms of both deterministic (like tool wear) and
random "non-stationarity," as the statisticians like to call
this drift. In a previous work-life, I spent many an evening atop
beautiful Mt. Haleakala in Hawaii peering through a large telescope at
satellites streaking across the sky. It was guided by a computerized
tracking system that effectively compensated for significant random
image wander created by the intervening atmospheric turbulence. So I
know first hand that it can be done.
Furthermore, there is a
conceptual problem created by the assumption that there is a constant
relationship between long term drift in the mean and short term process
variation. It implies that they both have a common root cause. I can
think of no theoretical reason why that should be true in any given
case, let alone be true in general. If instead it's based on empirical
observation, then I'd like to see the supporting data so I can draw my
own conclusion as to its general validity. It seems to me that this
largely undocumented long term drift in the mean is as worthy a target
for process improvement as is reduction in short-term variation. And I
don't buy the argument that it's too complicated in general to analyze,
so we'll just use a universal approximation. Too much very valuable
information is buried by that concession, not to mention the undesirable
behavior that it all too often encourages.
My cynical symbiont would
have loved to be a fly on the wall when this convenient
"discovery" was made. Why convenient? Well, think about it. If each
unit produced has 100 opportunities for independent defects, then
without this 1.5 sigma adjustment 6 sigma quality would mean that you
would have only one defective unit in 10 million output units produced!
Banks would never make an error in processing loan applications,
semiconductor manufacturers would produce many products that never have
even a single defect throughout the product's entire lifecycle, and call
centers would correctly transfer each and every call the first time and
maintain this perfect performance over many decades. For nearly all
processes, that would be indistinguishable from the already un-sellable
concept of zero defects as a reasonable achievable goal.
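The arithmetic behind the "one in 10 million" figure can be sketched as follows. It assumes, as the text does, 100 independent opportunities per unit, and it uses rounded per-opportunity defect rates taken from the discussion above (about one per billion without the shift, 3.4 per million with it). The function name is illustrative.

```python
import math


def defective_unit_rate(p_defect: float, opportunities: int) -> float:
    """P(at least one defect in a unit) = 1 - (1 - p)^n for independent opportunities.

    log1p/expm1 keep the result accurate when p is tiny.
    """
    return -math.expm1(opportunities * math.log1p(-p_defect))


N = 100  # opportunities per unit, as in the text

no_shift = defective_unit_rate(1e-9, N)      # roughly the unadjusted 6 sigma tail
with_shift = defective_unit_rate(3.4e-6, N)  # the familiar 3.4 dpmo figure

print(f"without the shift: 1 defective unit in {1 / no_shift:,.0f}")
print(f"with the shift:    1 defective unit in {1 / with_shift:,.0f}")
```

Under these assumptions the unadjusted goal works out to about one defective unit in ten million, while the adjusted goal allows roughly one in three thousand.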
Is 6 sigma a good goal for ALL processes?
So I for one don't buy this
1.5 sigma "free bonus" even in cases where it may exist. But there are
other critical problems with the 6 sigma goal. I've argued repeatedly
that each metric has a limiting value determined by the process's
enabling technology and organizational structure. Absent process
re-design, nothing can be done to improve the sigma level beyond this
limiting value or entitlement on a permanent basis. Individual heroics
can create short-term gains beyond this limit (as evidenced by the
well-known Hawthorne Effect), but they are not sustainable in the
long-term.
The goal of 6 sigma for all processes requires an
organizational commitment to continuously re-design every one of them
before their limit is approached. Not only must the financial
commitment be there, but also the required new enabling technology and
organizational flexibility. In many situations, these commitments are
unrealistic, unreasonable and/or unsound. My personal bias is to focus
on metrics that address the gap between current and potential
performance and focus on the rate at which that gap is closing (see my
publications on the half-life method, for example).
Consider also an
old saying that we have in the System Dynamics world: "things get worse
before they get better." Its origins lie in the observation that major
changes usually create short-term disruptions that adversely affect
current performance. Process redesign almost always displays this
dynamic. If you are being rewarded on your 6 sigma performance, past
experience will discourage you from self-initiating a process re-design
since there is a good chance that it will initially blow your 6 sigma
performance. Short term special dispensation from the 6 sigma goal may
be a prerequisite for justifiable process redesign.
Furthermore,
increasing technical and organizational complexity slows the rate of
process improvement. Combine this with the observation that complex
processes tend to have long cycle times compared to the time it takes
for unpredictable changes to occur in their environment and you're
quickly led to the conclusion that many important processes can never
achieve 6 sigma performance unless they are dysfunctionally
over-simplified. This is how chaos theory enters the picture. My view
is that only routine, mature, and very high unit volume processes should
even be considered as potential candidates to have 6 sigma as a goal.
Set
a goal of 6 sigma to drive desired changes in the wrong processes and
you will only stifle innovation and encourage conservatism and
sub-optimization. Innovation and uncertainty are inseparable partners.
I've seen new product development efforts seriously undermined as a
result of this type of phenomenon. Instead, if you must, set a process
goal of x sigma, where x is dependent on process complexity and
maturity. I would speculate that x=3 might be closer to the right
number for many important processes.
Another related perspective
on this issue is in terms of process learning. As a process approaches
its limiting performance, learning declines in absolute terms. An
organization which has achieved 6 sigma in all of its processes is an
organization that has, in this sense, stopped learning. In all cases
that I can think of, when you stop learning, you stop competing and we
all know where that feedback loop leads.
What is the real effect on the bottom line?
Six Sigma Quality is often touted on
the basis of its significant bottom line impact. Some claim more than
$1M per year per Black Belt in typical cost savings. For example,
according to one Motorola Six Sigma Presentation, in 1996 they achieved
5.6 sigma performance (up from 4.2 sigma in 1986), $16B in cumulative
manufacturing cost savings and a reduction in Cost of Poor Quality from
15% in 1986 to a little over 5% of sales in 1996. I'm not sure where
that number comes from nor where the billions of dollars in resulting
claimed savings went, but I'd really like to see an independent audit so
that I could understand the basic assumptions used.
I would
hope that the calculated savings net out the component of traditional
cost reduction, as captured, for example, by the historical cost
experience curve, so that the resulting number is truly reflective of
the incremental savings that are directly assignable to the 6 sigma
initiatives. It is always very tempting to attribute all benefits to
the current program, regardless of their true origins.
All too
often, these "cost savings" estimates fail to recognize that many
apparently variable costs are in fact fixed or semi-fixed. They don't
really go away, but instead move elsewhere in the organization, at least
for the short term. Another common practice is the inclusion of profit
from new revenue which will be generated by the resources (people,
equipment and facilities, for example) freed-up by the process
improvement. Unfortunately, these estimates seldom consider total
market potential or competitive dynamics. Furthermore, there is rarely a
closing of the loop to assure that the predicted savings were actually
achieved. I've heard more than one improvement team query their sponsor
with: "What level of savings are you looking for?" Not surprisingly
the chosen assumptions yield that desired answer.
I would not be
surprised at all to find that Darwinian rules develop over time for the
calculation of sigma levels in many organizations in order to assure
survival of only the fittest opportunities for inclusion. I've been
told of more than one case where a persistent defect has been dropped
from the calculation with the justification that "we can't be measured
on what we don't control." Try selling that argument to the customer.
What
is also perplexing is that over the last five years Motorola's stock
has not outperformed the aggregate Electronic Equipment Industry of
which it is a member. One senior quality executive at Motorola told me
that the bulk of the 6 sigma savings had to be passed on to customers in
the form of price reductions, so they do not appear on the bottom line.
These two observations suggest that Motorola's competitors have
realized similar performance improvements, with or without the benefit
of the six sigma approach.
Also keep in mind that cost reduction
by itself does not create significant societal wealth. Its principal
effect is to move wealth from one place to another. The improvement in
labor productivity only benefits society if there are value creating
alternatives available for the surplused capital and labor. Reduce
equipment and raw materials usage and you reduce the wealth of the
equipment and raw materials suppliers. Societal wealth is mostly
created on the revenue side of the equation; by the creation of new
outputs that are of value to people. But 6 sigma is of little use
there. Just try applying it to processes having a significant amount of
creative content like product development or R&D.
So my bottom
line is that the claimed financial benefits of improvement in the 6
sigma metric are also unsubstantiated. This undermines the assertion
of its proponents that the results prove that the metric really works.
The true benefits of 6 sigma are shared in common with the other flavors
of TQM.
The hidden danger of the 6 sigma metric.
Why
is all of this important? You could argue that I'm nitpicking and that
the real value of 6 sigma is in the concept and approach, not the actual
metric. But, non-financial performance measures are increasingly
becoming an important consideration in individuals' compensation and
promotion. Past performance along these dimensions even enters resource
allocation decisions. This arises from the over-riding objective of
metrics: to drive positive changes in individual and group behavior.
But, if the non-financial measures are inherently unsound, so too will
be the decisions to which they contribute. In my view, the 6 sigma
metric falls into this category of noise generating metrics.
The 6
sigma metric does have some redeeming characteristics though:
- It is defect oriented.
- With the exception of identification of
opportunities for a defect, it is reasonably well documented.
However,
it has an overwhelming number of weaknesses as a metric. Let me
summarize them:
- Unless the opportunities are weighted by
importance to the customer, it can be a poor surrogate for customer
satisfaction because the metric can get better while customer
satisfaction gets worse. How? By improvement of one type of defect at
the numerical expense of a more important one (e.g. eliminate 10
unimportant defects while creating only 5 more important ones: net
result, an apparent improvement of 5, with an obvious reduction in
customer satisfaction). Note though that this refinement adversely
affects the metric's simplicity requirement.
- Anyone who has
taught the 6 sigma metric can testify to its complexity, even when the
students are soon-to-be Black Belts. This complexity also violates the
KISS principle of good metrics.
- The 1.5 sigma adjustment is
unsupported and clearly is case dependent at best, thus making the
metric inherently biased (it systematically overstates actual
performance).
- Because of its ambiguity, it is easily gamed
unless complemented by other, more valuable metrics. As a test, give two
groups of knowledgeable people the independent job of identifying the
opportunities for defects. It is likely that their lists will look very
different. Although it is often touted as a universal metric that
allows cross-process comparisons, this weakness significantly undermines
that potential.
- Although it looks like variable data, it is
based on attribute data (number of defects) which masks the degree to
which the individual specifications fail to meet customer requirements.
This breaks the link of the metric to its underlying root causes,
unless the associated variable data is also measured and reviewed.
- It is based on the gap between current performance and zero defects
rather than the process's limiting value. In doing this it fails to
accommodate strategic decisions about process re-design priorities.
- As a goal, it fails to differentiate between processes of
different complexity and maturity. It fails to recognize the role of
chaos or exogenous unpredictability in some very important processes,
for example forecasting, product development, resource allocation and
strategic planning.
Bottom line: as far as the 6 sigma metric is
concerned, forget it. Calculate defects and defect rates, along with
their underlying variable data, but don't bother trying to convert them
to an arbitrary sigma value.
Even the tables are wrong
I
can't leave this subject without sharing with you a little twist of
irony for this metric.
The tail of the normal distribution cannot
be evaluated in what mathematicians call "closed-form." That means
that you can't write an equation where you plug in dpmo and out comes
the sigma value. That's why we need to have those tables. And the table
entries can only be computed using numerical integration techniques or finite
series approximations. When doing this, mathematicians know that it's
important to estimate the residual error so that you know the accuracy
of your estimate. But this is not always done.
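In place of a table lookup, the conversion from dpmo to a sigma value can be done numerically. The sketch below is one possible approach, not the method used to build any published table: it inverts the normal tail by bisection, with the tail itself evaluated through Python's `math.erfc`. The function names, the search bracket and the tolerance are illustrative assumptions, and the `shift` argument is there only to reproduce the 1.5 sigma convention discussed earlier.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sigma_from_dpmo(dpmo: float, shift: float = 0.0, tol: float = 1e-10) -> float:
    """Find z such that upper_tail(z) equals dpmo / 1e6, then add the optional shift."""
    p = dpmo / 1_000_000
    if not 0.0 < p < 0.5:
        raise ValueError("dpmo must be strictly between 0 and 500,000")
    lo, hi = 0.0, 40.0  # the tail falls steadily from 0.5 toward 0 on this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_tail(mid) > p:
            lo = mid  # tail still too large: the answer lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi) + shift


raw = sigma_from_dpmo(3.4)                  # sigma level with no adjustment
reported = sigma_from_dpmo(3.4, shift=1.5)  # sigma level as conventionally reported
print(f"3.4 dpmo -> {raw:.3f} sigma unadjusted, {reported:.3f} sigma with the shift")
```

The bisection tolerance plays the role of the residual error estimate: the answer is known to lie within `tol` of the returned value, to the extent that `erfc` itself is accurate.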
Under the
circumstances, it is understandable that Motorola's numbers for large
sigma values are not exactly correct: precise, yes; accurate, no.
Today, the wonders of modern personal computers make these calculations
accessible to everyone, including me*. For example, the correct 6 sigma
value is actually 0.987x10^-9, not 1.248x10^-9. That's a mere 26% error!
The error decreases with decreasing sigma, but it does not vanish: the correct value
at 4.5 sigma is actually 3.397x10^-6, not 3.451x10^-6 as published in the
Motorola table. That error is 1.59%, which translates into 2.15 sigma
(now should I add that 1.5 sigma or not?)! Yes, I know that there is no
practical difference between these two values. But remember, my point
is irony, not significance. The quoted 10 sigma value of 6.216x10^-21 is
in fact 7.62x10^-24, or about 1000 times smaller!! I certainly hope
nobody's career depended on the accuracy of that one.
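These comparisons are easy to reproduce. The sketch below checks the quoted table values against two independent calculations: Python's built-in `math.erfc`, and the standard textbook asymptotic series for the normal tail at large z. The series is not the formula mentioned in the footnote, which is not given in this article; it is included only as a cross-check, and the function names are illustrative.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def upper_tail_asymptotic(z: float, terms: int = 4) -> float:
    """Large-z series: phi(z)/z * (1 - 1/z^2 + 3/z^4 - 15/z^6 + ...).

    The truncation error is no larger than the first omitted term.
    """
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(2 * k + 1) / (z * z)
    return phi / z * total


# Values quoted in the text from the published table.
quoted = {4.5: 3.451e-6, 6.0: 1.248e-9, 10.0: 6.216e-21}

for z, table_value in quoted.items():
    exact = upper_tail(z)
    series = upper_tail_asymptotic(z)
    print(f"z={z:>4}: erfc {exact:.3e}, series {series:.3e}, "
          f"table {table_value:.3e}, table/erfc {table_value / exact:.3f}")
```

The two calculations agree with each other closely at 6 and 10 sigma, where the series converges quickly, and both differ from the table by the amounts described above.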
Conclusion
In
closing, don't get me wrong, I'm not saying that numerical goals,
variation reduction or DfX (aka Design for 6 sigma), where X stands for
the "abilities": producibility, testability, maintainability,
serviceability, recyclability, etc., are unimportant. I have always
been a big fan of Armand Feigenbaum, who described most of the 6 sigma
statistical concepts in the 1950s. His classic book, Total Quality
Control, became the bible and inspiration for the Japanese quality
movement and the source for the name TQC. What I am saying is that 6
sigma is a poor metric. So my advice: use Six Sigma as the name for
your version of TQM, but don't track its numerical value or put it on
your balanced scorecard.
* If you're interested in how I determined
the correct numbers, send me an e-mail. I'd be glad to send you the
formula I derived for large sigma and how I checked the results.
Article written by Arthur M. Schneiderman, Thought Leader and Proven Executive.