If your data seem overdispersed, you can use the negative binomial distribution, whose additional (dispersion) parameter accounts for that. However, this normally applies to count data, not proportions.
The use of the zero-inflated negative binomial model in R is exemplified here:
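A quick way to see whether counts look overdispersed is to compare the variance with the mean and to compute the Pearson dispersion statistic from a Poisson fit. The sketch below uses simulated counts (the data and the parameter values are made up for illustration, not taken from the question):

```r
# Simulated (hypothetical) counts: negative binomial with mean 4 and size 1.5
set.seed(1)
y <- rnbinom(200, mu = 4, size = 1.5)

# Under a Poisson model the variance should be close to the mean
print(c(mean = mean(y), variance = var(y)))

# Pearson dispersion statistic from an intercept-only Poisson fit;
# values well above 1 suggest overdispersion
fit_pois <- glm(y ~ 1, family = poisson)
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
print(dispersion)
```

For an intercept-only model this statistic is just the sample variance divided by the sample mean, so it is only a rough screening tool, not a formal test.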
Fitting a zero-inflated negative binomial regression with R
The theory is explained here:
https://stats.idre.ucla.edu/r/dae/zinb/
Both the dichotomic and time parts should initially consist of only the intercept term. You should then perform a likelihood ratio test to see whether adding the Ship presence factor reduces the deviance significantly.
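The comparison of an intercept-only model with one that includes the factor could be sketched as follows. This assumes the `pscl` package is installed and that the two parts correspond to what `pscl::zeroinfl` calls the zero-inflation and count components; the data frame `dat`, the variable `ship`, and all simulation settings are hypothetical.

```r
# Sketch only: requires the 'pscl' package; data and names are hypothetical
if (requireNamespace("pscl", quietly = TRUE)) {
  set.seed(42)
  n <- 400
  ship <- factor(rep(c("absent", "present"), each = n / 2))
  # Structural zeros with probability 0.3, otherwise negative binomial counts
  mu <- ifelse(ship == "present", 6, 3)
  y <- ifelse(rbinom(n, 1, 0.3) == 1, 0, rnbinom(n, mu = mu, size = 2))
  dat <- data.frame(y = y, ship = ship)

  # Null model: intercept only in both the count part and the zero part
  m0 <- pscl::zeroinfl(y ~ 1 | 1, data = dat, dist = "negbin")
  # Alternative model: ship presence in both parts
  m1 <- pscl::zeroinfl(y ~ ship | ship, data = dat, dist = "negbin")

  # Likelihood ratio test: twice the gain in log-likelihood,
  # referred to a chi-squared distribution with df = extra parameters
  lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
  lr_df <- attr(logLik(m1), "df") - attr(logLik(m0), "df")
  lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
  print(c(statistic = lr_stat, df = lr_df, p.value = lr_p))
}
```

A small p-value would indicate that the factor improves the fit beyond what the intercept-only model achieves.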
Confidence intervals for men and women separately.
Let capital letters denote estimates: The point estimates for the proportions
of men and of women who agree are $0.264$ and $0.323,$ respectively. The
corresponding Wald confidence intervals, based on the normal approximation, are
$(0.225, 0.303)$ for men and $(0.288, 0.357)$ for women.
P.m = 132/500; P.m
[1] 0.264
SE.m = sqrt(P.m*(1-P.m)/500); SE.m
[1] 0.01971314
CI.m = P.m + qnorm(c(.025,.975))*SE.m
CI.m
[1] 0.225363 0.302637
P.w = 226/700; P.w
[1] 0.3228571
SE.w = sqrt(P.w*(1-P.w)/700); SE.w
[1] 0.01767243
CI.w = P.w + qnorm(c(.025,.975))*SE.w
CI.w
[1] 0.2882198 0.3574945
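The two computations above follow the same pattern, so they can be wrapped in a small function. Note that `wald_ci` is a helper name made up for this sketch, not a function from base R:

```r
# Wald (normal-approximation) confidence interval for a binomial proportion.
# 'wald_ci' is a hypothetical helper name, not a base R function.
wald_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n                          # point estimate
  se <- sqrt(p * (1 - p) / n)         # estimated standard error
  alpha <- 1 - conf.level
  p + qnorm(c(alpha / 2, 1 - alpha / 2)) * se
}

ci_m <- wald_ci(132, 500)  # men
ci_w <- wald_ci(226, 700)  # women
print(round(rbind(men = ci_m, women = ci_w), 4))
```

This should reproduce the intervals `CI.m` and `CI.w` computed step by step above.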
The procedure binom.test
in R gives (slightly different) exact binomial
confidence intervals, $(0.226, 0.305)$ for men and $(0.288, 0.359)$ for women, as shown below.
binom.test(132,500)$conf.int
[1] 0.2258560 0.3049604
attr(,"conf.level")
[1] 0.95
binom.test(226,700)$conf.int
[1] 0.2883144 0.3589013
attr(,"conf.level")
[1] 0.95
CI for the difference between men and women.
As in the last section of the table in your question and in @Ben's Answer (+1), the (estimated) standard error for the difference $P_w - P_m$
is as follows:
SE.d = sqrt(SE.w^2 + SE.m^2); SE.d
[1] 0.02647495
Then a 95% confidence interval for the difference, based on normal approximations,
is $(0.0070, 0.1107),$ which is essentially the same as the interval given there:
P.w-P.m + qnorm(c(.025,.975))*SE.d
[1] 0.006967198 0.110747087
In R the procedure prop.test
gives the same 95% confidence interval
$(0.0070, 0.1107).$ [This interval is also based on a normal approximation; the continuity correction was declined on account
of the large sample sizes.]
prop.test(c(226,132), c(700,500), correct=FALSE)$conf.int
[1] 0.006967198 0.110747087
attr(,"conf.level")
[1] 0.95
Notice that this 95% confidence interval does not include $0.$
Accordingly, the null hypothesis that women and men have equally favorable opinions
is rejected at the 5% level (against the two-sided alternative).
The P-value of the (approximate normal) test is $0.028 < 0.05 = 5\%.$
prop.test(c(226,132), c(700,500), correct=FALSE)$p.value
[1] 0.02802182
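To see where this P-value comes from, here is a sketch of the same test done by hand. Without continuity correction, the two-sample `prop.test` is equivalent to a z-test that uses the pooled proportion in the standard error (the variable names below are my own):

```r
x <- c(226, 132)   # numbers agreeing: women, men (counts from above)
n <- c(700, 500)   # sample sizes: women, men

p_hat <- x / n
p_pool <- sum(x) / sum(n)                            # pooled proportion under H0
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))  # SE of the difference under H0
z <- (p_hat[1] - p_hat[2]) / se_pool
p_value <- 2 * pnorm(-abs(z))                        # two-sided P-value
print(c(z = z, p.value = p_value))
```

The test uses the pooled standard error, whereas the confidence interval above uses the unpooled `SE.d`, so the test and the interval are closely related but not identical procedures.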
Best Answer
It depends. If you are just expressing a number that can take any value as a percent (for example, relative growth of number of sales), then go ahead! -15% is just -0.15, 133% is just 1.33 and so on.
However, if what you're dealing with is a proportion (as in "73% of students passed the test", i.e., a value between 0% and 100%), then you should calculate the standard deviation from the binomial distribution. In other words, for a sample of size n and a proportion p (measured from 0 to 1), the variance of the number of successes is np(1-p), and its standard deviation is the square root of that number. For the proportion itself, divide by n: the variance is p(1-p)/n and the standard deviation is sqrt(p(1-p)/n).
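As a small numerical illustration in R (the sample size of 200 students is an assumption made up for this example), the standard deviation of the count and that of the proportion differ by a factor of n:

```r
n <- 200    # hypothetical number of students
p <- 0.73   # proportion who passed

sd_count <- sqrt(n * p * (1 - p))   # SD of the number of students who passed
sd_prop  <- sqrt(p * (1 - p) / n)   # SD (standard error) of the proportion
print(c(sd_count = sd_count, sd_prop = sd_prop, ratio = sd_count / sd_prop))
```

Which of the two you report depends on whether you are describing the number of successes or the proportion.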