First, the standard deviation is not the average distance to the mean; that average is always zero (see the proof below). It is, however, a measure of how far the points typically lie from the mean. Assuming the values are normally distributed, we know, for example, that about 68% of the values lie between $\mu-\sigma$ and $\mu+\sigma$.
Suppose we weigh potatoes with average weight 100 g and standard deviation 5 g. What holds for the average weight of a group of 4 potatoes?
I hope you see that the expected value of this average weight is still 100 g. But what is the standard deviation of the average weight? That is where you use the formula
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{4}} =2.5$$
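As a sanity check (my addition, not part of the original answer), a short simulation confirms that averages of groups of 4 have a standard deviation close to $5/\sqrt{4}=2.5$:

```python
import random

# Simulate many groups of n = 4 potatoes drawn from Normal(100, 5) and
# measure the spread of the group averages; it should be near 5 / sqrt(4).
random.seed(0)
n, trials = 4, 100_000
averages = [sum(random.gauss(100, 5) for _ in range(n)) / n
            for _ in range(trials)]
mean_avg = sum(averages) / trials
sd_avg = (sum((a - mean_avg) ** 2 for a in averages) / trials) ** 0.5
print(round(mean_avg), round(sd_avg, 1))  # close to 100 and 2.5
```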
Feel free to ask if you still don't understand.
Proof that the average distance between the actual data and the mean is
$0$:
$$\frac{\sum^n_{i=1} (x_i-\mu)}{n} = \frac{(\sum^n_{i=1} x_i)-\mu n}{n} = \frac{\sum^n_{i=1} x_i}{n}-\mu = \mu - \mu = 0$$
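The identity above can be checked numerically on any data set; this small snippet (my illustration) shows the signed deviations cancel:

```python
# The average *signed* deviation from the mean cancels to zero for any data.
data = [3.2, 7.5, 1.1, 9.8, 4.4]
mu = sum(data) / len(data)
avg_deviation = sum(x - mu for x in data) / len(data)
print(avg_deviation)  # zero up to floating-point rounding
```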
While the article you refer to correctly defines the concept of a confidence interval (your highlighted text), it does not correctly treat the case of a normal distribution with unknown standard deviation. You may want to search for "Neyman confidence interval" to see an approach that produces confidence intervals with the property you highlighted.
The Neyman procedure selects a region containing 95% of outcomes for each true value of the parameter of interest. The confidence interval is then the union of all parameter values for which the observation lies within the selected region. The probability that the observation falls within the selected region for the true parameter value is 95%, and only for those observations will the confidence interval contain the true value. Therefore the procedure guarantees the property you highlight.
If the standard deviation is known and not a function of the mean, the Neyman central confidence intervals turn out to be identical to those described in the article.
Thank you for the link to Neyman's book - interesting to read from the original source! You ask for a simple description, but that is what my second paragraph was meant to be. Perhaps a few examples will help illustrate: Example 1 and 1b could be considered trivial, whereas 2 would not be handled correctly by the article you refer to.
Example 1. Uniform random variable. Let X follow a uniform distribution,
$$f(x)=1/2 {\mathrm{\ \ for\ \ }}\theta-1\le x\le \theta+1 $$ and zero otherwise.
We can make a 100% confidence interval for $\theta$ by considering all possible outcomes $x$, given $\theta$, i.e. $x \in [\theta-1,\theta+1]$. Now consider an observed value, $x_0$. The union of all possible values of $\theta$ for which $x_0$ is a possible outcome is $[x_0-1,x_0+1]$. That is the 100% confidence interval for $\theta$ for this problem.
Example 1b. Uniform random variable. Let X follow the same uniform distribution. We can make a 95% central confidence interval for $\theta$ by selecting the 95% central outcomes $x$, given $\theta$, i.e. $x \in [\theta-0.95,\theta+0.95]$. Now consider an observed value, $x_0$. The union of all possible values of $\theta$ for which $x_0$ is within the selected range is $[x_0-0.95,x_0+0.95]$. That is the 95% confidence interval for $\theta$ for this problem.
Example 2. Uniform random variable. Let X follow a uniform distribution,
$$f(x)=1/\theta {\mathrm{\ \ for\ \ }}{1\over2}\theta \le x \le {3\over2}\theta $$ and zero otherwise. We can make a 100% confidence interval for $\theta$ by considering all possible outcomes $x$, given $\theta$, i.e. $x \in [{1\over2}\theta,{3\over2}\theta]$. Now consider an observed value, $x_0$. The union of all possible values of $\theta$ for which $x_0$ is a possible outcome is $[{2\over3}x_0,2x_0]$. That is the 100% confidence interval for $\theta$ for this problem. (You can confirm this by inserting the endpoints of the confidence interval into the pdf and seeing that they are at the boundaries of the pdf.) Note that the central confidence interval is not centered on the point estimate for $\theta$, $\hat\theta = x_0$.
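To see Example 2's interval in action, here is a quick coverage simulation (my sketch, not from Neyman): every draw of $X$ yields an interval $[{2\over3}x,2x]$ that contains the true $\theta$.

```python
import random

# Example 2: X ~ Uniform(theta/2, 3*theta/2); the interval [2x/3, 2x]
# should cover the true theta on every draw (a 100% confidence interval).
random.seed(1)
theta, trials = 5.0, 10_000
covered = 0
for _ in range(trials):
    x = random.uniform(theta / 2, 3 * theta / 2)
    if 2 * x / 3 <= theta <= 2 * x:
        covered += 1
print(covered / trials)  # 1.0
```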
Example 3. Normal distribution with mean $\theta$ and standard deviation $1$. The 68% central confidence interval would be constructed identically to example 1, that is the selected region for $X$ would be $[\theta-1,\theta+1]$. The 68% central confidence interval is therefore the same as in Example 1, $[x_0-1,x_0+1]$. You can extend this to 95% and arbitrary KNOWN standard deviation $\sigma$ to be $[x_0-1.96\sigma,x_0+1.96\sigma]$.
Example 4. Normal distribution with mean $\theta$ and standard deviation $\theta/2$. The 68% central confidence interval would be constructed identically to example 2. The 68% central confidence interval for $\theta$ is therefore the same as in Example 2, $[{2\over3}x_0,2x_0]$.
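Likewise for Example 4 (again my sketch): simulating draws from a normal distribution with mean $\theta$ and standard deviation $\theta/2$ shows the interval $[{2\over3}x,2x]$ covers $\theta$ about 68% of the time.

```python
import random

# Example 4: X ~ Normal(theta, theta/2). The interval [2x/3, 2x] contains
# theta exactly when X falls within one standard deviation of its mean,
# which happens with probability ~0.68.
random.seed(2)
theta, trials = 5.0, 100_000
covered = 0
for _ in range(trials):
    x = random.gauss(theta, theta / 2)
    if 2 * x / 3 <= theta <= 2 * x:
        covered += 1
print(round(covered / trials, 2))  # about 0.68
```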
The authors of the article you refer to and the other commenters to your question would not get Example 2 or 4 right. Only following a procedure like Neyman's will the confidence interval have the property that you highlighted in your post. The other methods are approximations for the general problem of building confidence intervals.
The exact solution to the problem with a normal distribution and UNKNOWN standard deviation is more difficult to work out than the examples above.
For testing two proportions, we will use the two-proportion hypothesis test. Since you have three sets of data, three wild and three mutant, you can run three tests of the same type to confirm your results. Your hypotheses would then be: $$H_0:p_{wild}-p_{mutant}=0$$ and $$H_1:p_{wild}-p_{mutant}\neq0.$$
You could also have $H_1:p_{wild}\gt{p_{mutant}}$. This will just change our $z_{\frac{\alpha}{2}}$ for $H_1:p_{wild}-p_{mutant}\neq0$ to $z_{\alpha}$ for $H_1:p_{wild}\gt{p_{mutant}}.$
You will then need the following information:
1.) $n_{wild}$ and $n_{mutant}$, the number of times you did each experiment.
2.) $\widehat{p_{wild}}$ and $\widehat{p_{mutant}}$. These are your experimental proportions from the data you collected. For example, for the first set, $\widehat{p_{wild}}=.65$.
3.) Finally, you need a $\hat{p}$. Experimentally, this is the number of successes (with success being replication) from your wild group ($Y_{wild}$) and your mutant group ($Y_{mutant}$), divided by $n_{wild}+n_{mutant}$. ($Y$ is capitalized because it is a random variable that changes from experiment to experiment. We denote these random variables with capital letters. If you have an actual number of successes in a given trial, it would be denoted $y_{wild}$, for example.)
So that is, $$\hat{p}=\frac{Y_{wild}+Y_{mutant}}{n_{wild}+n_{mutant}}$$ If you don't have the raw counts, you could assume, for example, 100 trials and 63 successes, and the calculation would still work. You would definitely prefer NEVER to do this, though, as you don't want to compromise the integrity of your study. Having the exact figures should be your goal.
Now, the reason this works is that $$Z=\frac{\widehat{p_{wild}}-\widehat{p_{mutant}}}{\sqrt{{\hat{p}}(1-\hat{p})(\frac{1}{n_{wild}}+\frac{1}{n_{mutant}})}}\sim{N(0,1)}.$$ You already know your $z_{\alpha}$ (or $z_{\frac{\alpha}{2}}$, depending on which alternative hypothesis you use), so you compare your z-statistic with, in this case, $1.96$. If your computed value satisfies $z\ge z_{\alpha}$, you reject the null hypothesis that the proportions are equal.
Using your information on the first trial (assuming 100 experiments from each for this simulation), we have: $$n_{wild}=n_{mutant}=100,\ \widehat{p_{wild}}=.65,\ \widehat{p_{mutant}}=.5,\ \hat{p}=.575$$ So $$z=\frac{.65-.5}{\sqrt{(.575)(.425)(\frac{2}{100})}}\approx\frac{.15}{.0699}\approx2.1456\ge1.96$$ So we reject the null hypothesis that mutated viruses replicate as efficiently as wild viruses. You could then repeat this test for the next two data sets and see whether they confirm the result. I'll leave the rest to you.
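The calculation above can be packaged as a small Python helper (my sketch; the function and variable names are mine, not from the answer):

```python
def two_proportion_z(p1, p2, n1, n2):
    """Pooled two-proportion z-statistic, as in the formula above."""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)       # pooled proportion p-hat
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    return (p1 - p2) / se

z = two_proportion_z(0.65, 0.50, 100, 100)
print(round(z, 4))  # 2.1456, which exceeds 1.96
```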
Finally, your question was in regards to the p-value. As I stated in the comments, the p-value is simply $$\text{p-value}=P(Z\ge z).$$ In your case, the p-value is $P(Z\ge 2.146)=.016\lt.05=\alpha$. This has the same effect as what we did above. If your p-value is less than your chosen $\alpha$, which in this case is $1-.95=.05$, you reject the null. If it is greater than your $\alpha$, you fail to reject.
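If you want to compute this p-value yourself, the standard normal upper tail can be obtained from the complementary error function in Python's standard library (my sketch):

```python
from math import erfc, sqrt

# P(Z >= z) for a standard normal Z, via the complementary error function:
# P(Z >= z) = erfc(z / sqrt(2)) / 2.
def upper_tail(z):
    return erfc(z / sqrt(2)) / 2

p_value = upper_tail(2.146)
print(round(p_value, 3))  # 0.016 < 0.05, so reject the null
```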
On a side note, you should be aware that a failure to reject the null hypothesis does NOT mean that the null hypothesis is true. It just means that you do not have enough evidence to give you reason to reject. This could be because you didn't run enough experiments. Look at the z-statistic we calculated: if our values of $n_{wild}$ and $n_{mutant}$ were smaller, say 50 instead of 100, we would fail to reject the null hypothesis, since the z-statistic would be $1.517\le1.96$. You can pretty much see there is a difference between the two data sets you have, but without enough empirical data from experimentation you could not reject the null.