Solved – Bootstrapping in SEM when the original sample size is small

bootstrap, structural-equation-modeling

I'm running an SEM in which I have several strongly positively skewed endogenous variables. Unfortunately, even when I log-transform these variables they are still quite non-normal. Kline (2011, p. 64) writes that "Some distributions can be so severely non-normal that basically no transformation will work", so I gave up trying to transform them further.

Kline (2011) provides another option on pp. 177-178:

[An] option for analyzing continuous but severely non-normal
endogenous variables is to use a normal theory method (i.e., ML
estimation) but with nonparametric bootstrapping, which assumes only
that the population and sample distributions have the same shape. In a
bootstrap approach, parameters, standard errors, and model test
statistics are estimated with empirical sampling distributions from
large numbers of generated samples. Results of a computer simulation
study by Nevitt and Hancock (2001) indicate that bootstrap estimates
for a measurement model were generally less biased compared with those
from standard ML estimation under conditions of non-normality and for
sample sizes of N ≥ 200. For N = 100, however, bootstrapped estimates
had relatively large standard errors, and many generated samples were
unusable due to problems such as nonpositive definite covariance
matrices. These problems are consistent with the caution by Yung and
Bentler (1996) that a small sample size will not typically render
accurate bootstrapped results.
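To make the procedure Kline describes concrete, here is a minimal sketch of a nonparametric bootstrap in Python. The data and the statistic (the mean of a skewed variable) are hypothetical stand-ins; in a real SEM application the statistic would be a model parameter refit on each resample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: n = 100 draws of a positively skewed variable
x = rng.exponential(scale=2.0, size=100)

def bootstrap_se(data, stat, n_boot=2000, rng=rng):
    """Nonparametric bootstrap: resample cases with replacement,
    recompute the statistic on each resample, and take the standard
    deviation of the replicates as the empirical standard error."""
    n = len(data)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=n)]
        reps[b] = stat(resample)
    return reps.std(ddof=1)

se = bootstrap_se(x, np.mean)
print(f"bootstrap SE of the mean: {se:.3f}")
```

Note that the empirical sampling distribution is built entirely from the observed cases, which is why the method assumes only that the sample distribution resembles the population distribution.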

In this answer @michael-chernick writes that

The theory of the bootstrap involves showing consistency of the
estimate. So it can be shown in theory that it works for large
samples. But it can also work in small samples. I have seen it work
for classification error rate estimation particularly well in small
sample sizes such as 20 for bivariate data.

Some questions:
1. Those two quotes seem at face value to be in conflict with each other, but I note that @michael-chernick was answering a question that did not involve SEM. Does SEM require a larger original sample size for successful bootstrapping? If so, why?
2. If Kline is right that bootstrapping will work poorly with a low N, what should I do if I have a low N? Say, for the sake of argument, that I have a sample size of 100 and no way to collect more data. Should I go ahead with bootstrapping, and if so, should I bootstrap using the transformed variables (remembering that the transformation didn't do a great job of resolving the non-normality) or the original variables?

Kline, R. B. (2011). Principles and practice of structural equation modeling. Guilford Press.

Yung, Y. F., & Bentler, P. M. (1996). Bootstrapping techniques in analysis of mean and covariance structures. Advanced structural equation modeling: Issues and techniques, 195-226.

Best Answer

Yes, SEM requires a larger sample size. The reason is that SEM is doing two things: first it estimates the model itself, and then it estimates the standard errors of that model.

There are two problems. The first is that you will have trouble estimating the model(s).

If you have problems with your standard errors (because, say, of non-normality), then bootstrapping might help you. But if you try to run an SEM model with a small sample size, you'll find that you don't get a sensible model to interpret: the model will frequently fail to converge, or will converge with out-of-bounds estimates (variances < 0; correlations > 1, perhaps MUCH greater than one - one sometimes sees correlations in the three-digit range).

So when you try to bootstrap a model with a small sample size you might find that 25% of the bootstrap samples are clearly wacky and should be discarded. And some proportion of the rest are also wacky, but you don't have a good way to decide which ones. If you did, you could go ahead and use the standard errors.
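One concrete screening step is to check each bootstrap resample's covariance matrix for positive definiteness before passing it to the SEM fit, since nonpositive definite matrices are one of the problems Nevitt and Hancock reported. Below is a hedged sketch with made-up multivariate data; the Cholesky test is a standard numerical check, but which replicates count as "wacky" in a real analysis also depends on convergence and admissibility of the fitted solution, which this sketch does not model.

```python
import numpy as np

rng = np.random.default_rng(1)

def is_positive_definite(m):
    """The Cholesky factorization succeeds only for (numerically)
    positive definite matrices, so use it as a quick screen."""
    try:
        np.linalg.cholesky(m)
        return True
    except np.linalg.LinAlgError:
        return False

# Hypothetical small-N multivariate data: n = 20 cases, 5 variables
data = rng.normal(size=(20, 5))

n = len(data)
n_boot = 500
usable = 0
for _ in range(n_boot):
    resample = data[rng.integers(0, n, size=n)]
    cov = np.cov(resample, rowvar=False)
    if is_positive_definite(cov):
        usable += 1  # only these replicates would be passed to the SEM fit

print(f"usable bootstrap samples: {usable}/{n_boot}")
```

With very small N the discard rate climbs quickly (duplicated cases can make the resampled covariance matrix rank deficient), which is exactly the situation where the surviving replicates are least trustworthy.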

The second problem is that ML tends to be biased in small samples.