If you have not only the frequencies but the actual counts, you can use a $\chi^2$ goodness-of-fit test for each data series. In particular, you wish to use the test for a discrete uniform distribution. This gives you a good test, which allows you to find out which data series are likely not to have been generated by a uniform distribution, but does not provide a measure of uniformity.
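As a minimal sketch (with hypothetical counts for one series), the test against a discrete uniform distribution is a one-liner with `scipy.stats.chisquare`, which defaults to equal expected counts when no expected frequencies are supplied:

```python
# Chi-square goodness-of-fit test against a discrete uniform distribution.
from scipy.stats import chisquare

observed = [18, 22, 25, 15, 20]  # hypothetical counts across 5 categories

# With no expected frequencies given, chisquare tests against equal
# expected counts (here 100/5 = 20 per category).
stat, p_value = chisquare(observed)

# A small p-value suggests the series was probably not generated
# by a discrete uniform distribution.
print(stat, p_value)
```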
There are other possible approaches, such as computing the entropy of each series - the uniform distribution maximizes the entropy, so if the entropy is suspiciously low you would conclude that you probably don't have a uniform distribution. That works as a measure of uniformity in some sense.
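A sketch of the entropy approach, using base-2 entropy and hypothetical counts; the uniform distribution over 5 categories attains the maximum $\log_2 5 \approx 2.32$ bits:

```python
# Shannon entropy (base 2) of a series of counts; a suspiciously low
# value argues against uniformity.
import numpy as np

def entropy_bits(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # 0 log 0 is taken as 0 by convention
    return float(-(p * np.log2(p)).sum())

print(entropy_bits([10, 10, 10, 10, 10]))  # maximal: log2(5), about 2.32
print(entropy_bits([40, 5, 3, 1, 1]))      # much lower entropy
```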
Another suggestion would be to use a measure like the Kullback-Leibler divergence, which measures how much one distribution departs from a reference distribution (note that it is asymmetric, and zero only when the two distributions coincide).
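A sketch with hypothetical proportions: `scipy.stats.entropy`, when given two arguments, computes the KL divergence of the first distribution from the second, so divergence from a uniform reference measures departure from uniformity:

```python
# KL divergence of observed proportions p from a uniform reference q;
# it is 0 exactly when p is uniform.
from scipy.stats import entropy

p_uniform = [0.2, 0.2, 0.2, 0.2, 0.2]
p_skewed  = [0.8, 0.1, 0.05, 0.03, 0.02]
q = [0.2] * 5  # uniform reference distribution

print(entropy(p_uniform, q))  # 0: identical to the reference
print(entropy(p_skewed, q))   # positive: departs from uniformity
```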
First, note that your terminology is inconsistent. Here I take it that you have one variable (not several) consisting of a fixed number of categories and you are concerned with how categories with zero frequency or probability (not value) are handled.
Your $H$ is evidently $\sum p_i \log_2 (1/p_i)$ for probabilities or proportions $p_i$. The base used for logarithms does not affect any key principle here, so we can think of summing terms $p_i \log (1/p_i) = -p_i \log p_i$.
The counter-argument to your worry is that entropy does take into account categories that have zero probability; it is just that they contribute zero to the entropy, given the strong convention that $0 \log 0$ is evaluated as $0$. A more informal version of the same argument is that the diversity or non-uniformity of what you do have in your collection is unaffected by what you don't have. If I have 10 elephants, spelling out that I have 0 giraffes, or do not have any giraffes, is incidental: what I have are 10 elephants. Any other statement about zero frequencies adds no information (literally).
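A quick check of that convention: `scipy.stats.entropy` applies $0 \log 0 = 0$, so appending zero-frequency categories leaves the entropy unchanged.

```python
# Zero-frequency categories contribute nothing to the entropy.
from scipy.stats import entropy

print(entropy([10]))               # 0: one category, no diversity
print(entropy([10, 0, 0, 0, 0]))   # 0: the zeros add nothing
```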
The same question of how to handle zero proportions arises with any measure. An alternative to entropy is based on squaring probabilities $\sum p_i^2$ and with such measures there is the same consequence that any $p_i$ that is 0 makes no difference to the sum.
You touch on a much more general issue of what can be inferred about a distribution from a summary measure. But any single summary measure is an irreversible reduction; you can't go back to the distribution unequivocally. This parallels the point made in elementary statistics that the same mean or correlation can arise from quite different data.
I suspect that the main issue here is that you are seeking a way to make entropy more intuitive, and that is a legitimate concern. An easy way is to talk in terms of the "numbers equivalent". Calculate $2^H$ for your examples: you recover 5 for 10, 10, 10, 10, 10 and 1 for 10, 0, 0, 0, 0, which can be interpreted as the equivalent number of (equally common) categories present. For other examples, the result will be a non-integer, which is reasonable. For bases 10 or $e$, use $10^H$ or $\exp(H)$ instead to get the numbers equivalent.
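The numbers equivalent for the two examples in the text can be sketched as follows (base-2 entropy, so the conversion is $2^H$):

```python
# "Numbers equivalent" 2**H: the equivalent number of equally common
# categories present in a set of counts.
import numpy as np

def numbers_equivalent(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # zero categories contribute nothing
    H = -(p * np.log2(p)).sum()
    return float(2 ** H)

print(numbers_equivalent([10, 10, 10, 10, 10]))  # about 5: five equally common categories
print(numbers_equivalent([10, 0, 0, 0, 0]))      # 1: effectively one category
```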
P.S. I try to avoid asserting that something is meaningless unless I am totally sure that it is. I have found too often that I just didn't understand the argument.
EDIT 2016: If you know that (e.g.) 4 and only 4 categories are possible in principle, but only 3 occur, then that's pertinent information. Sometimes you know this: e.g. if cards can be $\{$spades, hearts, clubs, diamonds$\}$ and only some of those kinds occur, that's something to cite.
A measure of diversity that does take zeros into consideration, and is affected by whether zeros occur, has various names (e.g. dissimilarity index) and has general form $(1/2) \sum_{i=1}^S | p_i - q_i | =: D$ (say). Here $p_i$ is the observed proportion of category $i$ and $q_i$ is the proportion in a reference distribution, e.g. equal probabilities $q_i = 1/S$. Then the minimum occurs when the observed distribution is identical to the reference distribution and then $D = 0$. The maximum occurs when one proportion $p_i$ is $1$ and the others all zero. The achievable maximum depends on the number of categories $S$, which after all is part of the information. The concrete interpretation of $D$ is the minimum proportion that would need to change categories to reproduce the reference distribution.
Another example of a reference distribution would be the national distribution of different socio-economic classes or ethnic categories. Then $D = 0$ might mean that a local or regional community is a microcosm of the national and otherwise $D$ measures departure from that in some direction.
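The dissimilarity index is a one-line computation; here is a sketch with a uniform reference over $S = 4$ categories, illustrating that the measure is sensitive to the number of categories, zeros included:

```python
# Dissimilarity index D = (1/2) * sum |p_i - q_i| between observed
# proportions p and a reference distribution q.
import numpy as np

def dissimilarity(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(0.5 * np.abs(p - q).sum())

S = 4
uniform = [1 / S] * S  # reference: equal probabilities 1/S

print(dissimilarity([0.25, 0.25, 0.25, 0.25], uniform))  # 0: matches the reference
print(dissimilarity([1.0, 0.0, 0.0, 0.0], uniform))      # 0.75: the maximum (S-1)/S for S = 4
```

The second value has the concrete interpretation from the text: 75% of observations would need to change categories to reproduce the uniform reference.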
Best Answer
I think @John's idea of a chi-square test is one way to go.
You would want the patches (cells) in 2-d, but you would test them using a one-way chi-square test; that is, the expected count for each cell would be $\frac{1000}{N}$, where $N$ is the number of cells.
But it's possible that different numbers of cells would give different conclusions.
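A sketch of the grid approach, with simulated points standing in for the real data; the grid size `m` is an arbitrary choice, which is exactly the caveat above:

```python
# Bin 1000 points into an m-by-m grid and run a one-way chi-square test
# with expected count 1000/N per cell, where N = m*m.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)
points = rng.uniform(0, 1, size=(1000, 2))  # hypothetical point set in the unit square

m = 5  # 5 x 5 = 25 cells; this choice is arbitrary
counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                              bins=[m, m], range=[[0, 1], [0, 1]])
stat, p_value = chisquare(counts.ravel())  # expected 1000/25 = 40 per cell

print(stat, p_value)
```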
Another possibility is to compute the average distance between points and then compare this to simulated results of that average. That avoids the problem of an arbitrary number of cells.
EDIT (more on average distance)
With 1000 points, there are $\frac{1000 \times 999}{2} = 499{,}500$ pairwise distances between points. Each can be computed (using, say, Euclidean distance), and these distances can then be averaged.
Then you can generate $N$ (some large number of) sets of 1000 points, each uniformly distributed. Each of those $N$ sets also has an average distance among its points.
Compare the result for the actual points to the simulated values, either to get a Monte Carlo p-value or just to see where the observed value falls.
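The whole procedure can be sketched as follows; here both the "observed" points and the unit-square domain are hypothetical stand-ins, with the observed mean placed within the simulated distribution of the same statistic under uniformity:

```python
# Monte Carlo test based on the mean pairwise distance of 1000 points.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
observed = rng.uniform(0, 1, size=(1000, 2))  # stand-in for the real data

obs_mean = pdist(observed).mean()  # mean of the 499,500 pairwise distances

# Simulate N uniform point sets and collect the same statistic.
N = 200
sim_means = np.array([pdist(rng.uniform(0, 1, size=(1000, 2))).mean()
                      for _ in range(N)])

# Two-sided Monte Carlo p-value: how extreme is the observed mean
# relative to the simulated distribution?
center = sim_means.mean()
p_value = np.mean(np.abs(sim_means - center) >= abs(obs_mean - center))
print(obs_mean, p_value)
```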