This is in response to an old question, and a good answer has already been provided elsewhere by jbowman and StasK to a very similar (but better-defined) problem. I refer anyone who stumbles on this to the following question (and answers):
Test for significant difference in ratios of normally distributed random variables
The permutation test should be easy to implement in most statistical tools and many programming languages. Additionally, it does not assume that you have count data, which means you can use a ratio of rates or other appropriate metrics.
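As a minimal sketch of such a permutation test (in R, with made-up data; the vectors x and y and the choice of statistic are my own illustration, not taken from the linked thread):

# Hypothetical data: per-subject ratio measurements for two groups
set.seed(1)
x <- rnorm(20, mean = 1.0, sd = 0.2)
y <- rnorm(20, mean = 1.1, sd = 0.2)

observed <- mean(x) - mean(y)   # observed difference in mean ratios
pooled   <- c(x, y)
n_x      <- length(x)

# Recompute the statistic under random relabellings of group membership
perm_stats <- replicate(10000, {
  idx <- sample(length(pooled), n_x)
  mean(pooled[idx]) - mean(pooled[-idx])
})

mean(abs(perm_stats) >= abs(observed))   # two-sided p-value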
For rankings given by different judges, one can use the Friedman test: http://en.wikipedia.org/wiki/Friedman_test
You can convert the ratings from "very bad" to "very good" into the numeric values -2, -1, 0, 1 and 2. Then put the data in long form and apply friedman.test with customer as the blocking factor.
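The long-form data frame mm printed below can be constructed directly from these numeric ratings (a sketch that simply reproduces the printed values):

mm <- data.frame(
  customer = rep(1:15, times = 2),
  variable = rep(c("product1", "product2"), each = 15),
  value    = c( 2,  1,  0,  2, -1,  0, -1,  2,  1,  1,  0,  2,  1,  2,  2,   # product1 ratings
               -2, -1, -1,  0,  2,  1,  0, -2,  1,  2,  0,  1,  1,  0,  0))  # product2 ratings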
> mm
   customer variable value
1         1 product1     2
2         2 product1     1
3         3 product1     0
4         4 product1     2
5         5 product1    -1
6         6 product1     0
7         7 product1    -1
8         8 product1     2
9         9 product1     1
10       10 product1     1
11       11 product1     0
12       12 product1     2
13       13 product1     1
14       14 product1     2
15       15 product1     2
16        1 product2    -2
17        2 product2    -1
18        3 product2    -1
19        4 product2     0
20        5 product2     2
21        6 product2     1
22        7 product2     0
23        8 product2    -2
24        9 product2     1
25       10 product2     2
26       11 product2     0
27       12 product2     1
28       13 product2     1
29       14 product2     0
30       15 product2     0
> friedman.test(value~variable|customer, data=mm)

        Friedman rank sum test

data:  value and variable and customer
Friedman chi-squared = 1.3333, df = 1, p-value = 0.2482
The difference in rankings between the two products is not significant.
Edit:
Following is the output of the corresponding linear regression, with customer included as a blocking factor; the product effect (variableproduct2) is likewise non-significant:
> summary(lm(value~variable+factor(customer), data=mm))
Call:
lm(formula = value ~ variable + factor(customer), data = mm)

Residuals:
   Min     1Q Median     3Q    Max 
  -1.9   -0.6    0.0    0.6    1.9 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         4.000e-01  9.990e-01   0.400    0.695
variableproduct2   -8.000e-01  4.995e-01  -1.602    0.132
factor(customer)2   6.248e-16  1.368e+00   0.000    1.000
factor(customer)3  -5.000e-01  1.368e+00  -0.365    0.720
factor(customer)4   1.000e+00  1.368e+00   0.731    0.477
factor(customer)5   5.000e-01  1.368e+00   0.365    0.720
factor(customer)6   5.000e-01  1.368e+00   0.365    0.720
factor(customer)7  -5.000e-01  1.368e+00  -0.365    0.720
factor(customer)8   9.645e-16  1.368e+00   0.000    1.000
factor(customer)9   1.000e+00  1.368e+00   0.731    0.477
factor(customer)10  1.500e+00  1.368e+00   1.096    0.291
factor(customer)11  7.581e-16  1.368e+00   0.000    1.000
factor(customer)12  1.500e+00  1.368e+00   1.096    0.291
factor(customer)13  1.000e+00  1.368e+00   0.731    0.477
factor(customer)14  1.000e+00  1.368e+00   0.731    0.477
factor(customer)15  1.000e+00  1.368e+00   0.731    0.477

Residual standard error: 1.368 on 14 degrees of freedom
Multiple R-squared:  0.3972,    Adjusted R-squared: -0.2486 
F-statistic: 0.6151 on 15 and 14 DF,  p-value: 0.8194
Best Answer
The first thing you will need to think about is what it means (quantitatively) to have "good precision" in such a device. I would suggest that, in a medical context, the goal is to avoid temperature deviations into a range that is dangerous for the patient, so "good precision" will probably translate into avoiding dangerously low or high temperatures. This means you will be looking for a metric that heavily penalises large deviations from your optimal temperature of $37^\circ$C. In view of this, a measure based on fluctuations in the median temperature will be a poor measure of precision, whereas measures that highlight large deviations will be better.
When you formulate this kind of metric, you are implicitly adopting a "penalty function" that penalises temperatures deviating from the desired temperature. One option is to measure "precision" by the variance around the desired temperature (treating this as the fixed mean for the variance calculation), so that lower variance means better precision. The variance penalises by squared error, which gives reasonable penalisation of large deviations. Another option is to penalise large deviations even more heavily (e.g., cubed absolute error). Yet another option is simply to measure the amount of time each device leaves the patient outside the medically safe temperature range. In any case, whatever you choose should reflect the perceived dangers of deviation from the desired temperature.
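As a minimal sketch of these three candidate metrics (in R; the vector temps, the 36-38°C safe range, and all names are my own illustrative assumptions, not from the question):

# temps: numeric vector of one device's temperature readings (assumed input)
target  <- 37
safe_lo <- 36   # hypothetical lower safety limit
safe_hi <- 38   # hypothetical upper safety limit

# Variance around the fixed target (not around the sample mean)
var_around_target <- mean((temps - target)^2)

# Heavier penalty on large deviations: mean cubed absolute error
cubed_penalty <- mean(abs(temps - target)^3)

# Proportion of readings outside the medically safe range
prop_unsafe <- mean(temps < safe_lo | temps > safe_hi)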
Once you have determined what constitutes a metric of "good precision", you will be formulating some kind of "heteroscedasticity test", in the wider sense of allowing whatever measure of precision you are using. I'm not sure I agree with whuber's comment about adjusting for autocorrelation. It really depends on your formulation of loss; after all, staying in a high temperature range for an extended period of time could be exactly the thing that is most dangerous, so if you adjust to account for autocorrelation, you might end up failing to penalise highly dangerous outcomes sufficiently.
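To turn the chosen metric into a comparison between two devices, one could reuse the permutation idea from earlier in this thread (a sketch; the reading vectors temps_a and temps_b are my own assumed inputs):

# temps_a, temps_b: temperature readings from devices A and B (assumed inputs)
stat <- function(v) mean((v - 37)^2)   # precision metric: variance around the target
observed <- stat(temps_a) - stat(temps_b)

pooled <- c(temps_a, temps_b)
n_a    <- length(temps_a)
perm_stats <- replicate(10000, {
  idx <- sample(length(pooled), n_a)
  stat(pooled[idx]) - stat(pooled[-idx])
})

mean(abs(perm_stats) >= abs(observed))   # two-sided p-value

Note that permuting individual readings treats them as exchangeable, which sidesteps rather than resolves the autocorrelation issue above; permuting whole blocks of consecutive readings would be a more defensible variant for strongly autocorrelated series.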