This is only an answer to your first question.
> How can they replace only one entry with q and say that this is entropy of q?
In the paper $h(q)$ is not computed this way. The inequality of Lemma 4.2 is used to prove that $h(p) \le \log(n)$, and
$h(p) \lt \log(n)$ if $p$ is not the uniform distribution with $p_1=p_2=\ldots=p_n=\frac{1}{n}$.
Lemma 4.2:
$$-\sum_{i=1}^{n}p_i \log{p_i} \le -\sum_{i=1}^{n}p_i \log{q_i} \tag{1} $$
Equality holds iff $$p_i=q_i, i=1,\ldots , n \tag{2}$$
$\square$
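As a quick sanity check, here is a minimal Python sketch of the lemma (assuming NumPy; the two distributions are arbitrary examples, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two arbitrary discrete distributions on n = 6 outcomes (illustrative only).
p = rng.random(6); p /= p.sum()
q = rng.random(6); q /= q.sum()

h_p = -np.sum(p * np.log(p))      # -sum_i p_i log p_i
cross = -np.sum(p * np.log(q))    # -sum_i p_i log q_i

print(h_p, "<=", cross)           # Lemma 4.2 (Gibbs' inequality); equal only if q = p
```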
We know that the entropy is defined by
$$h(p)=-\sum_{i=1}^{n}p_i \log{p_i} \tag{3} $$
This can be used to reformulate the inequality of the Lemma as
$$ h(p)\le -\sum_{i=1}^{n}p_i \log{q_i} \tag{4} $$
This holds for every discrete distribution $q$, in particular for the uniform distribution with
$$q_i=\frac{1}{n} ,i=1,\ldots,n \tag{4a} $$
Substituting $\frac{1}{n}$ for $q_i$ gives
$$ h(p)\le \sum_{i=1}^{n}p_i \log{n} = (\log{n}) \cdot \sum_{i=1}^{n}p_i = \log{n} \tag{5} $$
But $\log(n)$ is also $h(q)$ if $q$ is the uniform distribution. This can be checked directly from the definition of the entropy:
$$h(q)=-\sum_{i=1}^{n}q_i \log{q_i}=-\sum_{i=1}^{n}\frac{1}{n} \log{\frac{1}{n}} = \log{n} \sum_{i=1}^{n}\frac{1}{n} = \log{n} \tag{6} $$
So it follows that for the uniform distribution $q$
$$h(p) \le \log{n} = h(q) \tag{7} $$
Because of $(2)$ and $(6)$, equality in $(7)$ holds exactly when $p$ is the uniform distribution too.
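A small numerical illustration of $(5)$–$(7)$ (a Python sketch assuming NumPy; the non-uniform $p$ is just a made-up example):

```python
import numpy as np

def entropy(p):
    """Shannon entropy h(p) = -sum_i p_i log p_i (natural log)."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

n = 6
p = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.25])  # arbitrary non-uniform p
q = np.full(n, 1.0 / n)                              # uniform distribution

print(entropy(p), "<", np.log(n))    # h(p) < log n for non-uniform p, equation (5)
print(entropy(q), "==", np.log(n))   # h(q) = log n, equation (6)
```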
Edit:
Theorem 5.1 states that the continuous probability density on $[a,b]$ with $\mu = \frac{a+b}{2}$ that maximizes entropy is the uniform distribution $q(x)=\frac{1}{b-a}, x \in [a,b]$. This complies with the principle of indifference for a continuous variable, found here.
On the whole real line there is no uniform probability density. On the whole real line there is also no continuous probability density with highest entropy, because there are continuous probability densities with arbitrarily high entropies; e.g. the Gaussian distribution has entropy $\frac{1}{2}(1+\log(2 \pi \sigma^2))$: if we increase $\sigma$, the entropy increases.
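A short sketch (assuming SciPy; entropies in nats) that compares this formula with SciPy's built-in differential entropy and shows it growing with $\sigma$:

```python
import numpy as np
from scipy.stats import norm

for sigma in [0.5, 1.0, 2.0, 10.0]:
    formula = 0.5 * (1 + np.log(2 * np.pi * sigma**2))
    # Both values agree and increase with sigma.
    print(sigma, formula, norm(scale=sigma).entropy())
```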
Because there is no maximal entropy for continuous densities over $\mathbb{R}$, we must add constraints, e.g. that $\sigma$ is fixed and that $\mu$ is fixed. The fact that there is a given finite $\sigma^2$ and $\mu$ makes it intuitively clear to me that values nearer to $\mu$ must have higher probability. If you don't fix $\mu$, you will get no unique solution: the Gaussian distribution for each real $\mu$ is a solution. This is some kind of "uniformness": every $\mu$ can be used for a solution.
Notice that it is crucial to fix $\sigma$ and $\mu$ and to demand $p(x)>0$ for all $x \in \mathbb{R}$. If you fix other values or change the domain of the density function from $\mathbb{R}$ to another one, e.g. $\mathbb{R}^+$, you will get other solutions: the exponential distribution, the truncated exponential distribution, the Laplace distribution, the lognormal distribution (Theorems 3.3, 5.1, 5.2, 5.3).
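As an illustration of the $\mathbb{R}^+$ case with a fixed mean (where the exponential distribution is the maximizer), here is a hedged comparison against another density on $\mathbb{R}^+$ with the same mean, assuming SciPy:

```python
from scipy.stats import expon, gamma

mean = 1.0
# Exponential with mean 1: the maximum-entropy density on R+ for a fixed mean.
h_expon = expon(scale=mean).entropy()
# A gamma density with the same mean 1, chosen only for comparison.
h_gamma = gamma(a=2.0, scale=mean / 2.0).entropy()

print(h_expon, ">", h_gamma)  # the exponential has the larger differential entropy
```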
Use the normalized entropy:
$$H_n(p) = -\sum_i \frac{p_i \log_b p_i}{\log_b n}.$$
For a vector with $p_i = \frac{1}{n}$ for all $i = 1,\ldots,n$ and $n>1$, the Shannon entropy is maximized. Normalizing the entropy by $\log_b n$ gives $H_n(p) \in [0, 1]$. You will see that this is simply a change of base, so one may drop the normalization term and set $b = n$. You can read more about normalized entropy here and here.
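A minimal Python sketch (assuming NumPy; the helper name is only for illustration):

```python
import numpy as np

def normalized_entropy(p, b=2):
    """H_n(p) = -sum_i p_i log_b(p_i) / log_b(n); always lies in [0, 1]."""
    p = np.asarray(p, dtype=float)
    n = len(p)
    return -np.sum(p * (np.log(p) / np.log(b))) / (np.log(n) / np.log(b))

print(normalized_entropy([0.25, 0.25, 0.25, 0.25]))  # 1.0: uniform maximizes
print(normalized_entropy([0.7, 0.1, 0.1, 0.1]))      # strictly below 1
```

Note that the base $b$ cancels out of the ratio, which is exactly why one may drop the normalization term and set $b = n$.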
Best Answer
You have to be more careful with what your outcomes are and what their probabilities are. From what I see you have 6 outcomes, let's call them $x_1,\dots,x_6$, with probabilities $p_1,\dots,p_6$ given in your list.
The outcomes can have cardinal values, e.g. throwing an (unfair) die -> $x_1 = 1,\dots, x_6 = 6$. They can also be nominal, such as ethnicity -> $x_1 =$ black, $x_2 =$ Caucasian, etc.
In the first case, it makes sense to define the mean and variance $$ \overline x = \sum_{i=1}^{6} p_ix_i, \qquad \mathbb V = \sum_{i=1}^{6} p_i (x_i-\overline x)^2. $$ The variance measures the (quadratic) spread around the mean. Note that this definition is different from yours.
In the second case, mean and variance do not make any sense, since you cannot add black to Caucasian, or scale them, square them, etc.
The entropy, on the other hand, can be defined in both cases! Intuitively, it measures the uncertainty of the outcome.
Note that, as Mike Hawk pointed out, it does not care what the outcomes actually are. They can be $x_1 = 1,\dots, x_6 = 6$ or $x_1 = 100,\dots, x_6 = 600$ or ($x_1 =$ black, $x_2 =$ Caucasian, etc.); the result will only depend on the probabilities $p_1,\dots,p_6$. The variance, on the other hand, will be very different for the first two cases (by a factor of 10000) and will not exist in the third case.
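A brief Python sketch (assuming NumPy; the probabilities are made up) that shows the variance scaling by a factor of $10000$ while the entropy stays the same:

```python
import numpy as np

p = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.25])  # made-up probabilities p_1..p_6

def mean_var(x, p):
    m = np.sum(p * x)                  # mean = sum_i p_i x_i
    return m, np.sum(p * (x - m)**2)   # variance = sum_i p_i (x_i - mean)^2

def entropy(p):
    return -np.sum(p * np.log(p))      # depends on p only, not on the outcome values

x1 = np.arange(1, 7)    # outcomes 1,...,6
x2 = 100 * x1           # outcomes 100,...,600

print(mean_var(x1, p))  # mean and variance for outcomes 1..6
print(mean_var(x2, p))  # variance is 10000 times larger
print(entropy(p))       # identical for both labelings
```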
Your definition of variance is very unconventional: it measures the spread of the actual probability values instead of the outcomes. I think this can theoretically be made sense of, but I very much doubt that it is the quantity you wish to consider (especially as a medical doctor).
It is definitely not meaningful to compare it to entropy, which measures the uncertainty of the outcome. The entropy is maximal if all outcomes have equal probability $1/6$, whereas this would yield the minimal value 0 for your definition of variance...
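For illustration (using the variance of the probability values themselves as a stand-in for your definition, which may differ in detail): for the uniform distribution that spread is $0$ while the entropy is at its maximum $\log 6$:

```python
import numpy as np

p_uniform = np.full(6, 1/6)

var_of_probs = np.var(p_uniform)              # spread of the p_i themselves: 0
ent = -np.sum(p_uniform * np.log(p_uniform))  # entropy: log 6, the maximum

print(var_of_probs, ent, np.log(6))
```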
Hope this helps.