Solved – How to calculate the error of percentages

proportion; standard deviation; standard error

Is it correct to calculate the standard deviation of percentages/proportions as you would for four numbers (i.e. non-percentages)?

Thanks for your help @user2974951 @whuber and @asdf. Yes, you're right I was alluding to using the common SD formula. To explain further, my four percentages are from an experiment where we measured the uptake of a chemical by 4 plants. Each plant took up between 40% and 55% of what was applied, therefore the percentage can't ever be >100%. So, if I understand correctly I will need to use the proportion SD which is different to the common SD formula? Thanks for the link to the thread @user2974951

Best Answer

It depends. If you are just expressing a number that can take any value as a percent (for example, relative growth of number of sales), then go ahead! -15% is just -0.15, 133% is just 1.33 and so on.
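
For this first case, the ordinary sample standard deviation applies directly. A minimal sketch in R, assuming some made-up growth figures (the values in `growth` are hypothetical and not taken from the question):

```r
# Hypothetical relative growth figures, written as fractions
# (-15% is -0.15, 133% is 1.33, and so on)
growth <- c(-0.15, 1.33, 0.20, 0.48)

# Sample standard deviation (n - 1 denominator), as for any other numbers
sd_growth <- sd(growth)

# Standard error of the mean of these values
se_mean <- sd_growth / sqrt(length(growth))

# Report both in percentage points
round(100 * c(sd = sd_growth, se = se_mean), 1)
```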

However, if what you're dealing with is a proportion (as in "73% of students passed the test", i.e. a value between 0% and 100%), then you should calculate the standard deviation from the binomial distribution. For a sample of n observations and a proportion p (measured from 0 to 1), the variance of the number of successes is np(1-p), and its standard deviation is the square root of that number. The standard deviation of the observed proportion itself (its standard error) is the square root of p(1-p)/n.
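
The binomial formula can be evaluated directly. A small sketch, assuming a hypothetical class of n = 100 students (the answer gives only the 73%, so `n` here is an illustration): the standard deviation of the number who pass is the square root of np(1-p), and dividing it by n gives the standard deviation of the proportion.

```r
p <- 0.73   # observed proportion, from the example in the answer
n <- 100    # hypothetical number of students (not given in the answer)

# Standard deviation of the COUNT of successes
sd_count <- sqrt(n * p * (1 - p))

# Standard deviation (standard error) of the PROPORTION of successes
se_prop <- sqrt(p * (1 - p) / n)

c(sd_count = sd_count, se_prop = se_prop)
```

The two quantities differ only by a factor of n, so it matters whether the result is reported as a count or as a proportion.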

Related Solutions

Generalized Linear Model – Comparing Two Groups with Many Zeros

If your data seem overdispersed, you should use a negative binomial distribution, whose additional parameter accounts for the overdispersion. However, this normally applies to count data, not proportions.

The use of the zero-inflated negative binomial model in R is exemplified here: Fitting a zero-inflated negative binomial regression with R

The theory is explained here: https://stats.idre.ucla.edu/r/dae/zinb/

Both the dichotomic and time parts should initially consist of only the intercept term. You should then perform a likelihood ratio test to see whether adding the ship-presence factor significantly reduces the deviance.
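
A rough sketch of that comparison, under several assumptions: the data are simulated, the column names `count` and `ship` are hypothetical, and the `pscl` package (which provides `zeroinfl()`) is assumed to be installed. Only the count part gains the ship factor here; the zero part stays intercept-only.

```r
library(pscl)  # provides zeroinfl(); assumed to be installed

set.seed(1)
n_obs <- 200

# Simulated, purely illustrative data: a two-level ship-presence factor
ship <- factor(rep(c("absent", "present"), each = n_obs / 2))
mu   <- ifelse(ship == "present", 4, 2)

# About 30% structural zeros; the rest are negative binomial counts
structural_zero <- rbinom(n_obs, 1, 0.3)
count <- ifelse(structural_zero == 1, 0, rnbinom(n_obs, size = 1.5, mu = mu))
d <- data.frame(count = count, ship = ship)

# Formula syntax is: count part | zero-inflation part
fit0 <- zeroinfl(count ~ 1    | 1, data = d, dist = "negbin")
fit1 <- zeroinfl(count ~ ship | 1, data = d, dist = "negbin")

# Likelihood ratio test: the models differ by one parameter
lr      <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
p_value <- pchisq(lr, df = 1, lower.tail = FALSE)
c(LR = lr, p.value = p_value)
```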

Proportion Statistics – Calculating Standard Error of Difference Between Two Independent Proportions

Confidence intervals for men and women separately. Let capital letters denote estimates: The point estimates for the proportions of men and of women who agree are $0.264$ and $0.323,$ respectively. The corresponding Wald confidence intervals, based on normal approximation, are $(0.225, 0.303)$ for men and $(0.288, 0.357)$ for women.

P.m = 132/500;  P.m
[1] 0.264
SE.m = sqrt(P.m*(1-P.m)/500);  SE.m
[1] 0.01971314
CI.m = P.m + qnorm(c(.025,.975))*SE.m
CI.m
[1] 0.225363 0.302637


P.w = 226/700; P.w
[1] 0.3228571
SE.w = sqrt(P.w*(1-P.w)/700);  SE.w
[1] 0.01767243
CI.w = P.w + qnorm(c(.025,.975))*SE.w
CI.w
[1] 0.2882198 0.3574945

The procedure binom.test in R gives (slightly different) exact binomial confidence intervals, $(0.226, 0.305)$ for men and $(0.288, 0.359)$ for women, as shown below.

binom.test(132,500)$conf.int
[1] 0.2258560 0.3049604
attr(,"conf.level")
[1] 0.95
binom.test(226,700)$conf.int
[1] 0.2883144 0.3589013
attr(,"conf.level")
[1] 0.95
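
The exact intervals reported by binom.test are Clopper-Pearson intervals, which can be written in terms of beta quantiles. A minimal sketch of that calculation (the helper name `clopper_pearson` is made up for illustration):

```r
# Clopper-Pearson ("exact") confidence interval for a binomial proportion,
# for x successes out of n trials
clopper_pearson <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower, upper)
}

clopper_pearson(132, 500)  # men
clopper_pearson(226, 700)  # women
```

These should agree with the binom.test intervals shown above; the Wald intervals are slightly narrower because they rely on the normal approximation.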

CI for the difference between men and women.

As in the last section of the table in your question and in @Ben's Answer (+1), the (estimated) standard error for the difference $P_w - P_m$ is as follows:

SE.d = sqrt(SE.w^2 + SE.m^2);  SE.d
[1] 0.02647495
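
Written out, this is the usual formula for the standard error of a difference between two independent proportions, here with $n_w = 700$ and $n_m = 500$ (the symbols $n_w$ and $n_m$ are introduced only for this display):

$$\mathrm{SE}(P_w - P_m) = \sqrt{\frac{P_w(1-P_w)}{n_w} + \frac{P_m(1-P_m)}{n_m}} = \sqrt{0.01767^2 + 0.01971^2} \approx 0.0265.$$

The variances add because the two samples are independent.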

Then a 95% confidence interval for the difference, based on normal approximations, is $(0.0070, 0.1107),$ which is essentially the same

P.w-P.m + qnorm(c(.025,.975))*SE.d
[1] 0.006967198 0.110747087

In R the procedure prop.test gives the same 95% confidence interval $(0.0070, 0.1107).$ [This interval is also based on a normal approximation; the continuity correction was declined on account of the large sample sizes.]

prop.test(c(226,132), c(700,500), cor=F)$conf.int
[1] 0.006967198 0.110747087
attr(,"conf.level")
[1] 0.95

Notice that this 95% confidence interval does not include $0.$ Accordingly, the null hypothesis that women and men have equally favorable opinions is rejected at the 5% level (against the two-sided alternative). The P-value of the (approximate normal) test is $0.028 < 0.05 = 5\%.$

prop.test(c(226,132), c(700,500), cor=F)$p.val
[1] 0.02802182
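
The P-value can also be reproduced by hand. One detail worth noting: the test uses a pooled standard error computed under the null hypothesis of equal proportions, which is not the same as the unpooled SE.d used for the confidence interval. A short sketch (the variable names are chosen here for illustration):

```r
x <- c(226, 132)   # women, men who agree
n <- c(700, 500)   # sample sizes

p_hat  <- x / n
p_pool <- sum(x) / sum(n)   # pooled proportion under the null hypothesis

# Standard error of the difference, assuming both groups share p_pool
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# Two-sided z test
z       <- (p_hat[1] - p_hat[2]) / se_pool
p_value <- 2 * pnorm(-abs(z))
c(z = z, p.value = p_value)
```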
