Solved – Comparing the mean difference between data measured by different equipment, t-test suitable

equivalencemeanmeasurementt-test

My problem is as follows:

I have 2 measuring devices, both using different measuring techniques, each trying to quantify properties of objects passing their measurement area. The devices measure the objects at the same time. For this particular problem I'm only interested in the number of objects each device is able to measure.

For each time unit, which is set to one hour, the devices summarize the number of objects they've measured or "seen", like the below example data:

Time    Device1 Device2
00:00   58      47
01:00   38      52
02:00   12      13
03:00   0       2
04:00   23      2
....    ..      ..

I want to compare the mean difference of these 2 datasets

The first thing that came to my mind was to use a 2-tailed t-test to determine if there is a significant difference in means between the two datasets. Using the following parameters in Excel:

Hypothetized Mean Difference = 0
Alpha = 0,05

Question: Is the t-test even suitable in this particular setup? And if it is, is my data paired or independent? If not suitable, what other method could be used?

I guess what confuses me is the fact that I have 2 measuring devices measuring the same event, instead of one device measuring different events. I haven't dealt with statistical problems in ages, so my knowledge is a bit rusty. Any help would be greatly appreciated.

Edit: As has been pointed out in the answers, the question that I should be asking is: do the devices produce equivalent measurements? For clarification, the example data provided is fake, but closely resembles the measured data. The measured data is not normally distributed.

Best Answer

The issue is less with can one perform the statistical test (paired t test) for difference?, and more with what is the inference you are drawing from the tests results?

From what you describe, both measurements are made on the same objects at the same time, your test is trying to support inference for whether or not the equipment are making different measurements. A few thoughts:

If you are interested in whether or not the two equipments make equivalent measurements—a separate statistical question than whether they make different measurements—you should also be performing equivalence tests, and combining your inferences from both tests.
If you are interested in a deeper understanding of how the two measures perform, perhaps a regression model, such as OLS regression, which will tell you trend and strength of association, as well as evidence of association existing.
From the tiny picture you give, your data do not really look normally distributed, which is one of the t test's assumptions. Perhaps a distribution-free test, such as the sign rank test would be appropriate?

Related Solutions

Solved – Meaning of standard deviation of the mean difference

Think of the difference like any other statistic that you are collecting. These differences are just some values that you have recorded. You calculate their mean and standard deviation to understand how they are spread (for example, in relation to 0) in a unit-independent fashion.

The usefulness of the SD is in its popularity -- if you tell me your mean and SD, I have a better understanding of the data than if you tell me the results of a TOST that I would have to look up first.

Also, I'm not sure how the difference and its SD relate to a correlation coefficient (I assume that you refer to the correlation between two variables for which you also calculate the pairwise differences). These are two very different things. You can have no correlation but a significant MD, or vice versa, or both, or none.

By the way, do you mean the standard deviation of the mean difference or standard deviation of the difference?

Update

OK, so what is the difference between SD of the difference and SD of the mean?

The former tells you something about how the measurements are spread; it is an estimator of the SD in the population. That is, when you do a single measurement in A and in B, how much will the difference A-B vary around its mean?

The latter tells us something about how well you were able to estimate the mean difference between the machines. This is why "standard difference of the mean" is sometimes referred to as "standard error of the mean". It depends on how many measurements you have performed: Since you divide by $\sqrt{n}$, the more measurements you have, the smaller the value of the SD of the mean difference will be.

SD of the difference will answer the question "how much does the discrepancy between A and B vary (in reality) between measurements"?

SD of the mean difference will answer the question "how confident are you about the mean difference you have measured"? (Then again, I think confidence intervals would be more appropriate.)

So depending on the context of your work, the latter might be more relevant for the reader. "Oh" - so the reviewer thinks - "they found that the difference between A and B is x. Are they sure about that? What is the SD of the mean difference?"

There is also a second reason to include this value. You see, if reporting a certain statistic in a certain field is common, it is a dumb thing to not report it, because not reporting it raises questions in the reviewer's mind whether you are not hiding something. But you are free to comment on the usefulness of this value.

Solved – How to compare measurements and uncertainties made with different measuring instruments

The model you use to "simulate your problem" can be used almost verbatim to estimate the parameters you are interested in using Bayesian estimation. Here is the model I'll use (using the same notation as you):

$$ L_B \sim \mathrm{Normal}(\mu, \sigma) \\ x_i \sim \mathrm{Normal}(\mu, \sigma) \mathrm{\ for\ i\ from\ 1\ to\ N} \\ L_{Ai} \sim \mathrm{Normal}(x_i \cdot \mathrm{gain} - \mathrm{offset}, \mathrm{dispersion}) \mathrm{\ for\ i\ from\ 1\ to\ N} \\ $$

The glaring omision in this model compared to your problem is that I don't include the assumption that some of the same $x_i$s that got measured by B could then be measured again by A. This could probably be added, but I'm not completely sure how.

This model is implemented in R & JAGS below using very vague, almost flat priors, the data used is the one you generated in your question:

library(rjags)

model_string <- "model{
  for(i in 1:length(L_B)) {
  L_B[i] ~ dnorm(mu, inv_sigma2) # <- reparameterizing sigma into precision 
                                 #    needed because of JAGS/BUGS legacy.  
  }
  for(i in 1:length(L_A)) {
    x[i] ~ dnorm(mu, inv_sigma2)
    L_A[i] ~ dnorm(gain * x[i] - offset , inv_dispersion2)
  }

  mu ~ dnorm(0, 0.00001)
  inv_sigma2 ~ dgamma(0.0001, 0.0001) 
  sigma <- sqrt(1 / inv_sigma2)
  gain ~ dnorm(0, 0.00001) T(0,)
  offset ~ dnorm(0, 0.00001)
  inv_dispersion2 ~ dgamma(0.0001, 0.0001)
  dispersion <- sqrt(1 / inv_dispersion2)
}"

Let's run it and see how well it does:

model <- jags.model(textConnection(model_string), list(L_A = L_A, L_B = L_B), n.chains=3)
update(model, 3000)
mcmc_samples <- coda.samples(model, c("mu", "sigma", "gain", "offset", "dispersion"), 200000, thin=100)
apply(as.matrix(mcmc_samples), 2, quantile, c(0.025, 0.5, 0.975))
##       dispersion   gain      mu   offset  sigma
## 2.5%     0.01057 0.1366 -0.3116 -0.51836 0.9365
## 50%      0.18657 1.0745 -0.1099 -0.26950 1.0675
## 97.5%    1.20153 1.2846  0.1051 -0.04409 1.2433

The resulting estimates are reasonably close to the values you used when you generated the data:

c(gain_A, offset_A, dispersion_A)
## [1]  1.1 -0.2  0.5

...except for, perhaps, dispersion. But with more data, perhaps more informed priors and running the MCMC sampling longer this estimate should be better.

Best Answer

Related Solutions

Solved – Meaning of standard deviation of the mean difference

Solved – How to compare measurements and uncertainties made with different measuring instruments

Related Question