There is an older paper by Deeks et al. justifying the use of the diagnostic odds ratio for this purpose:
http://www.ncbi.nlm.nih.gov/pubmed/16085191
The work of Deeks et al. is not related to the bivariate model, but from my limited experience I can nevertheless recommend this approach.
The answers here are good, +1 to all. I just wanted to show how this effect might look in funnel plot terms in an extreme case. Below I simulate a small true effect as $N(.01, .1)$ (mean .01, SD .1) and draw samples of between 2 and 2000 observations.
The grey points in the plot would not be published under a strict $p < .05$ regime. The grey line is a lowess fit of effect size on log sample size including the "bad p-value" studies, while the red one excludes them. The black line shows the true effect.
As you can see, under publication bias there is a strong tendency for small studies to overestimate effect sizes, while larger ones report effect sizes closer to the truth.
set.seed(20-02-19)  # note: this is arithmetic, i.e. set.seed(-1)
n_studies <- 1000

# Draw study sample sizes uniformly between 2 and 2000
sample_size <- sample(2:2000, n_studies, replace = TRUE)

# For each study, simulate data from N(.01, .1) and record the
# observed effect size (mean) and the one-sample t-test p-value
studies <- plyr::aaply(sample_size, 1, function(size) {
  dat <- rnorm(size, mean = .01, sd = .1)
  c(effect_size = mean(dat), p_value = t.test(dat)$p.value)
})
studies <- cbind(studies, sample_size = log(sample_size))

# Under strict publication bias, only p < .05 studies get published
include <- studies[, "p_value"] < .05

plot(studies[, "sample_size"], studies[, "effect_size"],
     xlab = "log(sample size)", ylab = "effect size",
     col = ifelse(include, "black", "grey"), pch = 20)

# Lowess smooths: all studies (grey) vs. significant-only studies (red)
lines(lowess(studies[, "sample_size"], studies[, "effect_size"]), col = "grey", lwd = 2)
lines(lowess(studies[include, "sample_size"], studies[include, "effect_size"]), col = "red", lwd = 2)
abline(h = .01)  # true effect size
Created on 2019-02-20 by the reprex package (v0.2.1)
Best Answer
This question could be interpreted in one of two ways. You could be asking whether there are requirements for subjective evaluations of funnel plots to be considered valid (i.e., looking at a funnel plot and coming to your own conclusion about the presence/absence of small-study effects). Or, alternatively, you could be asking whether there are requirements for empirical evaluations of funnel plots to be considered valid (i.e., using some sort of statistical test for the presence of small-study effects/publication bias).
mdewey's answer is clearly based on the first interpretation: they offer a heuristic of no fewer than 10 studies for subjective interpretations of the funnel plot to be valid. I have two problems with this interpretation (to be clear: not mdewey's answer, given the interpretation, but rather the interpretation itself). First, subjective interpretations of funnel plots can become increasingly difficult to make as the number of studies in the plot increases. For example, with the funnel plot below (~30 effect sizes?) it is pretty easy to visually detect at least a superficial amount of asymmetry in the funnel plot.
But contrast that plot against one from a much larger meta-analysis (one of my own, below), and you can see that it can become quite unclear how one should interpret a funnel plot when there are many (in this case, 245) effect sizes.
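(Incidentally, funnel plots like these are straightforward to produce yourself. A minimal sketch in R, assuming the metafor package is installed and using its bundled dat.bcg dataset of BCG vaccine trials:)

```r
library(metafor)

# Compute log risk ratios and their sampling variances
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

# Fit a random-effects model
res <- rma(yi, vi, data = dat)

# Funnel plot: observed effect sizes against their standard errors
funnel(res)
```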
My second objection to interpreting your question as pertaining to subjective assessment of funnel plots is that I simply struggle to understand how one would define "validity" when the method of inspection is subjective: how can one verify/falsify that their interpretation of the funnel plot is legitimate when there are no empirical criteria being used?
So, I interpret your question as being about empirical evaluations of the funnel plot, and indeed, there are a number of statistical tests you can use to determine whether the funnel plot is significantly asymmetrical. And yes, as you intuit, a number of factors (including the number of effect sizes, the precision of the effect sizes, the variability of the effect sizes, and whether the effect sizes are heterogeneous) impact the validity of these tests. I've included a number of references, below, to simulations of different methods of testing funnel plot asymmetry; most compare methods in terms of their power to detect asymmetry and their false-positive error rates.
None of these simulations appear to espouse number-of-study guidelines for testing funnel plot asymmetry. A simulation by Sterne, Gavaghan, and Egger (2000), however, does give calculated power levels for a regression method and a correlation method, with varied numbers of studies. The Coles Notes/CliffsNotes summary of this simulation is that only when there is severe bias (read: easily detected small-study effects) and the number of studies is ~20 does the regression test become adequately powered (power ~.80) to detect funnel plot asymmetry. Thus, you probably need at least 20 studies to detect funnel plot asymmetry, and will likely need more unless your data conform to the most optimal conditions for detecting small-study effects. With 10 studies, alternatively, you're only looking at power of ~.30 to ~.50, depending on which test you use (and again, assuming severe bias).
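As a concrete sketch (one possibility, not a prescription): an Egger-type regression test for funnel plot asymmetry can be run in R with the metafor package's regtest() function, shown here on metafor's bundled dat.bcg dataset:

```r
library(metafor)

# Compute log risk ratios and sampling variances, then fit a model
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat)

# Regression test for funnel plot asymmetry;
# predictor = "sei" regresses on standard errors (the classical Egger choice)
regtest(res, predictor = "sei")
```

With only 13 studies in this dataset, such a test will be underpowered, which is exactly the point about study numbers above.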
Have a look at the papers too, to see conditions under which false positives become higher than desirable; some methods are better than others at maintaining your false-positive error rate. I'm not sure about other software, but I know the metafor package for R offers users the ability to employ a number of these different methods, depending on their preference (see documentation for the regtest command).

References
Harbord, R. M., Egger, M., & Sterne, J. A. C. (2006). A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Statistics in Medicine, 25, 3443-3457.
Moreno, S. G., Sutton, A. J., Ades, A. E., Stanley, T. D., Abrams, K. R., Peters, J. L., & Cooper, N. J. (2009). Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study. BioMed Central Medical Research Methodology, 9, 1-17.
Peters, J. L., Sutton, A. J., Jones, D. R., Abrams, K. R., & Rushton, L. (2006). Comparison of two methods to detect publication bias in meta-analysis. Journal of the American Medical Association, 295, 676-680.
Sterne, J. A. C., Gavaghan, D., & Egger, M. (2000). Publication and related bias in meta-analysis: Power of statistical tests and prevalence in the literature. Journal of Clinical Epidemiology, 53, 1119-1129.