Solved – Using control charts with very large subgroup size

quality control

I am working with a very large data set: time-series data from an on-line process monitor with a 10-second measurement interval. I am trying to develop control limits for the process using control chart theory.

Here are the methods I have tried so far and the results:

1) Using an I-MR type chart: computing the control limits with an I-MR chart results in most of the data being out of control. I think the problem is that the time scale over which the readings change significantly is hours, but the moving range (and ultimately the control limits) is calculated from the differences between consecutive data points, which are very small.

2) Subgrouping the data and using an Xbar-R chart: if I subgroup the data into one-minute groups (i.e., n = 6) and compute the control limits, I have the same problem as with the I-MR chart: nearly all of the data is out of control. The cause is the same as above, since the data doesn't change significantly from minute to minute.

3) Moving average over a longer time period: if I compute a moving average and range over several hours, the average range becomes much more realistic for this data set. However, I'm not sure how to calculate proper control limits, because my subgroup size is so large (n = 1440) and the tables of correction factors only go up to n = 15.

Is it appropriate to have such a large subgroup size, and if so, what is the correct method of calculating control limits?

If it is not appropriate, what would be the correct way to calculate realistic control limits for this process?

Best Answer

There are tables that provide factors out to subgroups of n = 25 for most Shewhart charts. There are also equations for the various factors, but for subgroups of 1440 the calculations may simply not be worth it. For example, the equations for two of the factors are presented in this answer.
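If you do want factors beyond the published tables, the standard definitions can be evaluated numerically. Below is a minimal sketch (assuming Python with NumPy/SciPy; the function names are my own) of the two usual factors, $c_4$ from its gamma-function formula and $d_2$ as the expected range of $n$ standard normal observations:

```python
import numpy as np
from scipy.stats import norm
from scipy.special import gammaln
from scipy.integrate import quad

def c4(n):
    """Bias-correction factor for the sample standard deviation:
    c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2).
    Log-gamma keeps the ratio stable for large n."""
    return np.sqrt(2.0 / (n - 1)) * np.exp(gammaln(n / 2) - gammaln((n - 1) / 2))

def d2(n):
    """Expected range of n standard normal observations:
    d2 = integral over the real line of 1 - Phi(x)^n - (1 - Phi(x))^n."""
    integrand = lambda x: 1.0 - norm.cdf(x) ** n - norm.sf(x) ** n
    value, _ = quad(integrand, -np.inf, np.inf)
    return value

print(c4(5))     # ~0.9400, matches the published tables
print(d2(5))     # ~2.326, matches the published tables
print(d2(1440))  # the factor the standard tables never reach
```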

If a Range chart with averages over several hours works best, then randomly sample 15–25 units out of the 1440 available and construct an $\bar{x}-R$ chart with these samples. You may even find that random samples of 15–25 units are acceptable on an hourly basis.

Not every data point that is produced needs to be charted. "If the $\bar{x}$ chart is being used to detect moderate to large process shifts, say on the order of $2\sigma$ or larger, then relatively small samples of size $n = 4$, $5$, or $6$ are reasonably effective. On the other hand, if we are trying to detect small shifts, then larger sample sizes of possibly $n = 15$ to $n = 25$ are needed." (D.C. Montgomery, Introduction to Statistical Quality Control.) Based upon that need, samples are selected at random or at regular intervals and grouped over the desired amount of time.

Control limits are then calculated in the normal way based upon the sampled data.
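As a concrete sketch of that workflow (assuming Python/NumPy; the `readings` array, the 4-hour window, the sample of n = 20, and the synthetic data are all illustrative assumptions, with the n = 20 constants taken from the standard tables):

```python
import numpy as np

# Hypothetical setup: `readings` is one long array of 10-second measurements;
# 1440 points is one 4-hour window at that sampling rate.
rng = np.random.default_rng(0)
readings = rng.normal(loc=50.0, scale=2.0, size=1440 * 30)  # 30 windows of fake data

WINDOW = 1440  # points per subgroup window (4 hours at 10 s)
SAMPLE = 20    # units drawn at random from each window

# Standard table constants for n = 20 (e.g., Montgomery, Appendix VI)
A2, D3, D4 = 0.180, 0.415, 1.585

windows = readings[: len(readings) // WINDOW * WINDOW].reshape(-1, WINDOW)
samples = np.array([rng.choice(w, size=SAMPLE, replace=False) for w in windows])

xbars = samples.mean(axis=1)
ranges = samples.max(axis=1) - samples.min(axis=1)
xbarbar, rbar = xbars.mean(), ranges.mean()

print("Xbar chart limits:", xbarbar - A2 * rbar, xbarbar + A2 * rbar)
print("R chart limits:   ", D3 * rbar, D4 * rbar)
```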

The other thing to consider is your data type, or whether changing data types could prove more useful. For example, $c$ charts or $p$ charts may give you the limits you need, and converting the readings into attribute counts or fractions nonconforming out of the 1440 points in each subgroup could be entirely appropriate.
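A $p$ chart in particular needs no tabulated factor at all; its limits come straight from the binomial approximation, $\bar{p} \pm 3\sqrt{\bar{p}(1-\bar{p})/n}$, so n = 1440 poses no problem. A brief sketch (assuming Python/NumPy, with made-up counts purely for illustration):

```python
import numpy as np

# Hypothetical attribute conversion: count the readings per 4-hour window
# (n = 1440) that violate some rule, then chart the nonconforming fraction.
n = 1440
nonconforming = np.array([31, 27, 35, 29, 40, 33, 26, 38])  # illustrative counts
p = nonconforming / n

pbar = p.mean()
sigma_p = np.sqrt(pbar * (1 - pbar) / n)

ucl = pbar + 3 * sigma_p
lcl = max(pbar - 3 * sigma_p, 0.0)  # a fraction cannot fall below zero
print(f"p-chart limits: LCL={lcl:.4f}, CL={pbar:.4f}, UCL={ucl:.4f}")
```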
