Here is a formal proof for general Dirichlet distributions $(\alpha_1, \dots, \alpha_m)$.
I use capital $P_i$ to indicate that we are working with random variables.
$$-E(\sum_i P_i \log P_i)=-\sum_i E(P_i \log P_i)$$
Each marginal then satisfies $P_i \sim Beta (\alpha_i, A -\alpha_i)$, where $A = \sum_{j=1}^m \alpha_j$, and working with the normalizing constant you can write
$$
-E_{\alpha_i, A-\alpha_i}(P_i \log P_i)= -\frac{\alpha_i}{A}E_{\alpha_i+1, A-\alpha_i}(\log P_i) = \frac{\alpha_i}{A} [\psi_0 (A+1)-\psi_0(\alpha_i+1)]
$$
where the last step uses a known result (see the Wikipedia page on the Beta distribution): if $X \sim Beta(\alpha, \beta)$ then $-E(\log X)= \psi_0(\alpha+\beta)-\psi_0(\alpha)$.
Summing over $i$ provides your general formula.
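The closed-form expression can be sanity-checked numerically. A minimal sketch (assuming NumPy/SciPy; the Dirichlet parameters below are arbitrary) that compares the digamma formula against a Monte Carlo average of the Shannon entropy over Dirichlet samples:

```python
import numpy as np
from scipy.special import digamma

def expected_entropy(alpha):
    """E[-sum_i P_i log P_i] for P ~ Dirichlet(alpha), via the digamma formula."""
    alpha = np.asarray(alpha, dtype=float)
    A = alpha.sum()
    return np.sum(alpha / A * (digamma(A + 1) - digamma(alpha + 1)))

# Monte Carlo check: sample P ~ Dirichlet(alpha), average the Shannon entropy
rng = np.random.default_rng(0)
alpha = [2.0, 3.0, 5.0]
samples = rng.dirichlet(alpha, size=200_000)
mc = -np.sum(samples * np.log(samples), axis=1).mean()
print(expected_entropy(alpha), mc)  # the two values should agree closely
```

For the symmetric case $\alpha = (1,1,1)$ the formula reduces to $\psi_0(4) - \psi_0(2) = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}$, which is a convenient hand check.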
You have to be more careful about what your outcomes are and what their probabilities are. From what I see, you have 6 outcomes, call them $x_1,\dots,x_6$, with probabilities $p_1,\dots,p_6$ given in your list.
The outcomes can have cardinal values, e.g. throwing an (unfair) die -> $x_1 = 1,\dots, x_6 = 6$.
They can also be nominal, such as ethnicity -> $x_1 =$ black, $x_2 =$ Caucasian, etc.
In the first case, it makes sense to define the mean and variance
$$
\overline x = \sum_{i=1}^{6} p_ix_i,
\qquad
\mathbb V = \sum_{i=1}^{6} p_i (x_i-\overline x)^2.
$$
The variance measures the (quadratic) spread around the mean.
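As a sketch of these two definitions (the probabilities below are made up, since your actual list is not reproduced here), for an unfair die:

```python
import numpy as np

# Hypothetical probabilities for an unfair die (not the asker's actual values)
p = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.25])
x = np.arange(1, 7)  # cardinal outcomes 1, ..., 6

mean = np.sum(p * x)                    # probability-weighted average outcome
var = np.sum(p * (x - mean) ** 2)       # quadratic spread around the mean
print(mean, var)
```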
Note that this definition differs from yours.
In the second case, mean and variance do not make any sense, since you cannot add black to Caucasian, nor scale or square them.
The entropy, on the other hand, can be defined in both cases! Intuitively, it measures the uncertainty of the outcome.
Note that, as Mike Hawk pointed out, it does not care what the outcomes actually are. They can be $x_1 = 1,\dots, x_6 = 6$ or $x_1 = 100,\dots, x_6 = 600$ or ($x_1 =$ black, $x_2 =$ caucasian etc.), the result will only depend on the probabilities $p_1,\dots,p_6$. The variance on the other hand will be very different for the first two cases (by the factor of 10000) and not exist in the third case.
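A quick numerical illustration of this invariance (again with made-up probabilities): rescaling the outcomes by 100 multiplies the variance by $100^2 = 10000$ but leaves the entropy untouched, since the entropy never looks at the outcome values.

```python
import numpy as np

p = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.25])  # hypothetical probabilities
x1 = np.arange(1, 7)   # outcomes 1, ..., 6
x2 = 100 * x1          # outcomes 100, ..., 600

def variance(p, x):
    m = np.sum(p * x)
    return np.sum(p * (x - m) ** 2)

entropy = -np.sum(p * np.log(p))  # depends on p only, not on x1 or x2

print(variance(p, x2) / variance(p, x1))  # factor of 10000 (up to rounding)
```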
Your definition of variance is very unconventional: it measures the spread of the actual probability values instead of the spread of the outcomes. I think this can be given a theoretical interpretation, but I very much doubt it is the quantity you wish to consider (especially as a medical doctor).
It is definitely not meaningful to compare it to entropy, which measures the uncertainty of the outcome: the entropy is maximal when all outcomes have equal probability $1/6$, whereas that same distribution yields the minimal value 0 for your definition of variance.
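To make the contrast concrete, here is a small check (the skewed probabilities are chosen arbitrarily): the uniform distribution maximizes the entropy at $\log 6$, while the "variance of the probability values" is exactly 0 there.

```python
import numpy as np

p_uniform = np.full(6, 1 / 6)
p_skewed = np.array([0.5, 0.3, 0.1, 0.05, 0.03, 0.02])  # arbitrary example

def entropy(p):
    return -np.sum(p * np.log(p))

def var_of_probs(p):
    # spread of the probability values themselves (the unconventional definition)
    return np.var(p)

print(entropy(p_uniform), var_of_probs(p_uniform))  # log(6) and 0
print(entropy(p_skewed), var_of_probs(p_skewed))    # smaller entropy, positive variance
```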
Hope this helps.
Use the normalized entropy:
$$H_n(p) = -\sum_i \frac{p_i \log_b p_i}{\log_b n}.$$
For the uniform vector $p_i = \frac{1}{n}\ \ \forall \ \ i = 1,...,n$ with $n>1$, the Shannon entropy is maximized, so normalizing by $\log_b n$ gives $H_n(p) \in [0, 1]$. The normalization is simply a change of base, so one may drop the normalization term and set $b = n$.
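A minimal sketch of the normalized entropy in Python (using the natural log; the function name is my own):

```python
import numpy as np

def normalized_entropy(p):
    """Shannon entropy divided by log(n), so the result lies in [0, 1]."""
    p = np.asarray(p, dtype=float)
    n = p.size
    nz = p[p > 0]  # convention: 0 * log(0) = 0
    return -np.sum(nz * np.log(nz)) / np.log(n)

print(normalized_entropy([1/4] * 4))     # uniform -> 1.0 (up to rounding)
print(normalized_entropy([1, 0, 0, 0]))  # degenerate -> 0.0
```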