The 95% is not numerically attached to how confident you can be that your particular interval covers the true effect in your experiment. "Interval computed with a procedure that has 95% coverage" might be a more accurate name for it. You can choose to assert that the interval contains the true value, and if you do that consistently you'll be right 95% of the time. But without more information you really don't know how likely it is for your particular experiment.
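To make the "right 95% of the time" idea concrete, here is a minimal simulation sketch. It assumes normally distributed data with a known standard deviation so that a simple z-interval applies; the true mean, standard deviation, sample size, and the name `coverage_rate` are all made up for illustration.

```python
import random
from statistics import NormalDist


def coverage_rate(true_mean=10.0, sigma=2.0, n=25, level=0.95,
                  trials=10_000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    # Critical value for a two-sided interval at the requested level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Assumption: sigma is known, so every interval has the same half-width.
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share comes out close to 0.95 across the repeated experiments, but any single interval either contains the true mean or it doesn't, and the simulation only knows which because it knows the true mean.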

**Q1:**
Your first query conflates two things and misuses a term, so no wonder you're confused. A narrower confidence interval may be more precise but, when intervals are calculated the same way, such as with the 95% method, they all have the same accuracy: they capture the true value the same proportion of the time.

Also, just because it's narrow doesn't mean you're less likely to encounter a sample that falls within that narrow confidence interval. A narrow confidence interval can be achieved in one of three ways. First, the experimental method or the nature of the data could just have very low variance: the confidence interval around the boiling point of tap water at sea level is pretty small, regardless of the sample size. Second, you could collect more observations: the confidence interval around the average weight of people might be rather large because people are very variable, but you can make it smaller just by acquiring more observations. In that case, as you gain more certainty about where you believe the true value is by collecting more samples and making a narrower confidence interval, the probability of encountering an individual within that confidence interval does go down. (It goes down in any case when you increase sample size, but you may not bother collecting the big sample in the boiling water case.) Finally, it could be narrow because your sample is unrepresentative. In that case you are actually more likely to have one of the 5% of intervals that does not contain the true value. It's a bit of a paradox regarding CI width, and something you should check by knowing the literature and how variable this kind of data typically is.

Further, consider that the confidence interval is about trying to estimate the true mean value of the population. If you knew that value spot on then you'd be even more precise (and accurate) and would not even have a range of estimates. But your probability of encountering an observation with that exact value would be far lower than that of finding one within any particular sample-based CI.

**Q2**: A 99% confidence interval is wider than a 95% one. Therefore, it's more likely that it will contain the true value. See the distinction above between precise and accurate; you're conflating the two. If I make a confidence interval narrower through lower variability and a higher sample size, it becomes more precise: the likely values cover a smaller range. If I increase the coverage by using a 99% calculation, it becomes more accurate: the true value is more likely to be within the range.
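The two knobs can be seen side by side in a small sketch. It assumes a z-interval with a known standard deviation, where the half-width is $z \cdot \sigma / \sqrt{n}$; the numbers and the name `half_width` are hypothetical.

```python
from statistics import NormalDist


def half_width(sigma, n, level):
    """Half-width of a two-sided z-interval for a mean (sigma assumed known)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / n ** 0.5


sigma = 15.0  # hypothetical population standard deviation

# Precision: more observations shrink the interval at the same coverage.
w95_small = half_width(sigma, n=100, level=0.95)
w95_large = half_width(sigma, n=400, level=0.95)

# Accuracy (coverage): a 99% interval is wider than a 95% one on the same data.
w99_small = half_width(sigma, n=100, level=0.99)

print(f"95%, n=100: +/- {w95_small:.2f}")
print(f"95%, n=400: +/- {w95_large:.2f}")
print(f"99%, n=100: +/- {w99_small:.2f}")
```

Quadrupling the sample size halves the width without changing the coverage, while moving from 95% to 99% widens the interval in exchange for capturing the true value more often.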

You can use a confidence interval (CI) for hypothesis testing. In the typical case, if the CI for an effect does not span 0 then you can reject the null hypothesis. But a CI can be used for more, whereas reporting whether it has been passed is the limit of the usefulness of a test.

The reason you're recommended to use CI instead of just a t-test, for example, is because then you can do more than just test hypotheses. You can make a statement about the range of effects you believe to be likely (the ones in the CI). You can't do that with just a t-test. You can also use it to make statements about the null, which you can't do with a t-test. If the t-test doesn't reject the null then you just say that you can't reject the null, which isn't saying much. But if you have a narrow confidence interval around the null then you can suggest that the null, or a value close to it, is likely the true value and suggest the effect of the treatment, or independent variable, is too small to be meaningful (or that your experiment doesn't have enough power and precision to detect an effect important to you because the CI includes both that effect and 0).
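As a rough sketch of the kinds of statements a CI supports, the function below sorts an interval into the three situations just described. The `smallest_meaningful` threshold is something you would have to choose from subject knowledge; it, the function name, and the example intervals are assumptions for illustration, not a standard procedure.

```python
def interpret_ci(lower, upper, smallest_meaningful):
    """Classify a CI for an effect against zero and a chosen threshold.

    smallest_meaningful is a hypothetical, analyst-chosen effect size below
    which the effect is considered too small to matter in practice.
    """
    excludes_zero = lower > 0 or upper < 0
    within_negligible = (-smallest_meaningful < lower
                         and upper < smallest_meaningful)
    if within_negligible:
        # Narrow interval around the null: every plausible value is tiny.
        return "effect is plausibly too small to matter"
    if excludes_zero:
        return "effect differs from zero; discuss the plausible magnitudes"
    # Contains zero but also reaches effects that would matter.
    return "inconclusive: interval includes both zero and meaningful effects"


print(interpret_ci(-0.05, 0.04, smallest_meaningful=0.1))
print(interpret_ci(0.2, 1.0, smallest_meaningful=0.1))
print(interpret_ci(-0.3, 0.9, smallest_meaningful=0.5))
```

A bare t-test would return "cannot reject the null" for both the first and the third interval, even though they support very different conclusions.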

**Added Later:**
I really should have said that, while you can use a CI like a test, it isn't one. It's an estimate of a range where you think the parameter value lies. You can make test-like inferences, but you're just so much better off never talking about it that way.

Which is better?

**A)** The effect is 0.6, *t*(29) = 2.8, *p* < 0.05. This statistically significant effect is... (some discussion ensues about this statistical significance without any mention of or even strong ability to discuss the practical implication of the magnitude of the finding... under a Neyman-Pearson framework the magnitude of the *t* and *p* values is pretty much meaningless and all you can discuss is whether the effect is present or isn't found to be present. You can never really talk about there not actually being an effect based on the test.)

or

**B)** Using a 95% confidence interval I estimate the effect to be between 0.2 and 1.0. (Some discussion ensues about the actual effect of interest, whether its plausible values have any particular meaning, and any use of the word significant is for exactly what it's supposed to mean. In addition, the width of the CI can go directly to a discussion of whether this is a strong finding or whether you can only reach a more tentative conclusion.)

If you took a basic statistics class you might initially gravitate toward A. And there may be some cases where it is a better way to report a result. But for most work B is far and away superior. A range estimate is not a test.

## Best Answer

The p-value relates to a test against the null hypothesis, usually that the parameter value is zero (no relationship). The wider the confidence interval on a parameter estimate is, the closer one of its extreme points will be to zero, and a p-value of 0.05 means that the 95% confidence interval just touches zero. In fact for a p-value $p$ of a parameter estimate, the $(1-p)$ level confidence interval just touches zero.
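This correspondence can be checked numerically. The sketch below assumes a normal (z) approximation for the parameter estimate; the estimate, its standard error, and the function names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: parameter = 0 under a normal approximation."""
    z = abs(estimate / se)
    return 2 * (1 - std_normal.cdf(z))


def confidence_interval(estimate, se, level):
    """Two-sided z-interval at the given confidence level."""
    z = std_normal.inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


estimate, se = 0.6, 0.25  # hypothetical estimate and standard error

p = two_sided_p(estimate, se)
# The (1 - p) level interval should have one endpoint sitting on zero.
lower, upper = confidence_interval(estimate, se, level=1 - p)
# Because p < 0.05 here, the 95% interval should exclude zero.
lower95, upper95 = confidence_interval(estimate, se, level=0.95)

print(f"p = {p:.4f}")
print(f"(1 - p) interval: ({lower:.6f}, {upper:.6f})")
print(f"95% interval:     ({lower95:.3f}, {upper95:.3f})")
```

With a t-based interval the same relationship holds, provided the p-value and the interval use the same degrees of freedom.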