# The Surprising Source of Variation and How to Reduce It

August 19, 2019
General

I recently wrote a series of posts on Measurement Studies for the PQ Systems blog. The next step is analyzing the data.

To illustrate, we will use an example R&R study conducted on the length of molded plastic parts. The study used five samples, and the testers measured the length of each sample twice. Here are the results.

A quick review of the data before doing any analysis reveals some interesting variation in the results. How do we decide whether this variation is significant or acceptable? That depends on what you are measuring: the best approach is to compare the variation found in the study to the variation allowed by the product specifications. To do this, we use a statistic called R&R, which stands for repeatability and reproducibility:

Repeatability: The ability of one tester, using the same equipment and the same sample, to get the same number or value each time the sample is measured. It takes no account of differences between testers, so it is also known as equipment variation.

Reproducibility: The ability of different testers to produce the same number or value using the same equipment and the same sample. Reproducibility is also known as tester or appraiser variation.

Calculating R&R is relatively complex but can be broken down into several simple steps using GAGEpack. Here are the results for the data shown above.
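To show the flavor of the arithmetic, here is a minimal sketch of the two variance components in Python. The data, the number of testers and parts, and the simple variance-components approach are all illustrative assumptions of mine; this is not GAGEpack's exact procedure.

```python
import statistics as st

# Hypothetical measurements (mm): 2 testers x 3 parts x 2 trials each.
data = {
    ("A", 1): [10.1, 10.2], ("A", 2): [10.4, 10.5], ("A", 3): [9.9, 10.0],
    ("B", 1): [10.3, 10.3], ("B", 2): [10.6, 10.7], ("B", 3): [10.1, 10.2],
}

# Repeatability: pooled within-cell variance (same tester, same part).
within_vars = [st.variance(trials) for trials in data.values()]
repeatability_var = st.mean(within_vars)

# Reproducibility: variance of the tester averages (over all parts and trials).
by_tester = {}
for (tester, _part), trials in data.items():
    by_tester.setdefault(tester, []).extend(trials)
reproducibility_var = st.variance([st.mean(v) for v in by_tester.values()])

# Combined measurement-system spread (one standard deviation).
rr_sigma = (repeatability_var + reproducibility_var) ** 0.5
```

With these made-up numbers, most of the measurement spread comes from the difference between testers rather than from repeat readings, which is the kind of diagnosis the study is designed to support.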

What does the percentage mean? It is the percentage of the specification width taken up by measurement variation: the width of the measurement system's variation curve relative to the specification. In this example, 98.59% of the specification is consumed by measurement variation, and that is before we look at any variation in the process! Remember, when you are looking at data from the process you are looking at:
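The percentage comes from comparing the measurement system's spread to the specification width. A minimal sketch, where the 6-sigma multiplier is one common convention (some references use 5.15 instead) and the numbers in the usage comment are hypothetical:

```python
def rr_percent_of_tolerance(rr_sigma: float, lsl: float, usl: float,
                            k: float = 6.0) -> float:
    """Percent of the specification width consumed by measurement variation.

    `k` sets how many sigmas count as the measurement spread; 6.0 is one
    common convention, 5.15 is another. The choice is an assumption here.
    """
    return 100.0 * (k * rr_sigma) / (usl - lsl)

# Hypothetical: an R&R sigma of 1.0 mm against limits of 94-106 mm
# consumes half the specification.
result = rr_percent_of_tolerance(1.0, 94.0, 106.0)
```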

Variation in data = measurement variation + process variation (strictly speaking, it is the variances that add: σ²observed = σ²measurement + σ²process)
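Because the two sources are independent, their variances add, which means the standard deviations combine in quadrature rather than summing directly. A quick numeric check with hypothetical sigma values:

```python
import math

# Hypothetical standard deviations for the two independent sources.
sigma_measurement = 0.3
sigma_process = 0.4

# Variances add for independent sources, so the observed standard
# deviation is the square root of the sum of squares.
sigma_observed = math.sqrt(sigma_measurement**2 + sigma_process**2)
# Smaller than the naive sum of 0.7
```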

When we observe variation in data, we tend to assume that it is all coming from the process; however, it is often coming from the measurement system. If the length results measured with the equipment in the example above are analyzed, the origin of the variation will be the measurement system, NOT the process. Therefore, to improve the capability of the process, it is necessary to improve the measurement system and leave the production process alone.

What value are we looking for in the R&R? Here’s a guideline:

• Greater than 30%: Not acceptable. The measurement system will not reliably distinguish good product from bad.
• Between 10% and 30%: Acceptable for normal measurement.
• Less than 10%: Acceptable for use with statistical process control.
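The cut-offs are easy to encode. A small helper reflecting the guideline above (the function name and the wording of the verdicts are my own):

```python
def rr_verdict(rr_percent: float) -> str:
    """Classify an R&R% against the acceptance guideline."""
    if rr_percent < 10:
        return "acceptable for statistical process control"
    if rr_percent <= 30:
        return "acceptable for normal measurement"
    return "not acceptable"
```

For the example study, `rr_verdict(98.59)` returns `"not acceptable"`.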

Note the stricter requirement for use with statistical process control. If the value is greater than 10%, then many of the trends and spikes found in control charts will be caused by the measurement system, NOT the process. So, changing the production process will only make things worse!

In our example, the R&R is 98.59%, meaning measurement system variation makes up most of the variation in the data. The process has to be excellent under these circumstances; otherwise, any slight change in the process will push results out of spec because of the poor measurement system. When a measurement system has an R&R above 30%, its ability to detect whether product is in or out of specification is in doubt.

How do you go about reducing the R&R%? That depends on the source of most of the variation. In this example, the equipment variation is slightly higher than the appraiser or tester variation.

If the repeatability or equipment variation is significant, it is an indication that the equipment or test method is unsuitable. In extreme cases, it may be necessary to replace the equipment; alternatively, an adjustment to the test equipment or to the method of using it may be enough. Investigate in detail how the equipment is used and address any potential causes of variation.

If the reproducibility or the tester variation is high, the data from the study can be analyzed further to understand the causes. Is it one individual tester, is it a bias issue, or was a particular sample difficult to measure?