(Empirical) Bayesian approach: As @whuber mentions, I think the most natural approach is a Bayesian approach, or possibly even an empirical Bayesian approach.
In particular, let's call the true entity scores $e_j$ for $j=1,\dotsc,m$. To estimate these you have data $X_{j1},\dotsc, X_{jn_j}$ for each $j$. Note that $n_j$ is different in each case.
Now assume you have a prior $g$ for the $e_j$, i.e. $e_j \sim g$; then rather than estimating $e_j$ by the sample mean $\hat{e_j} = \frac{\sum_{i=1}^{n_j}X_{ji}} {n_j}$, you could take the posterior mean:
$$\tilde{e_j} = \mathbb E[e_j | X_{j1}, \dotsc, X_{jn_j}]$$
This approach works even if you don't want to put a prior $g$ on these scores yourself; instead, you can learn the prior $g$ from your data. This is called empirical Bayes. Bradley Efron recently wrote a paper on how to do this "almost" nonparametrically ("almost" because he does not do actual nonparametrics, but considers flexible exponential families with many parameters). For a simpler approach, David Robinson has a very nice blog post in which he elaborates on this idea using the example of determining the best batters (i.e., ranking players by their batting average).
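For concreteness, here is a minimal sketch in the spirit of Robinson's beta-binomial example, assuming binary outcomes (e.g. hits) per entity; the simulated data and the crude method-of-moments prior fit are assumptions of this sketch, not his exact procedure.

```r
set.seed(1)
m   <- 50                                # number of entities
n_j <- sample(5:500, m, replace = TRUE)  # unequal sample sizes
e_j <- rbeta(m, 20, 60)                  # true (unknown) rates
x_j <- rbinom(m, n_j, e_j)               # observed successes per entity

# Crude method-of-moments fit of a Beta(alpha0, beta0) prior to the raw rates
raw    <- x_j / n_j
mu     <- mean(raw); v <- var(raw)
k      <- mu * (1 - mu) / v - 1          # implied alpha0 + beta0
alpha0 <- mu * k; beta0 <- (1 - mu) * k

# Posterior means shrink small-sample entities toward the prior mean
e_tilde <- (x_j + alpha0) / (n_j + alpha0 + beta0)
head(order(e_tilde, decreasing = TRUE))  # ranking by posterior mean
```

The point of the shrinkage is that an entity with a lucky 3-for-5 record no longer outranks a solid 300-for-1000 one.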
Ad-hoc frequentist approach: Another approach, which in my opinion is a lot more ad hoc but has the advantage of being simpler and easier to explain, is to rank the entity scores by the lower bound of a $(1-\alpha)$ confidence interval for each score. For example, if $\hat{\sigma}_j$ is the standard deviation estimate based on the $n_j$ samples for the $j$-th entity score, then you could rank based on:
$$ \bar{e_j} = \hat{e_j} - \frac{\hat{\sigma}_j}{\sqrt{n_j}}z_{1-\frac{\alpha}{2}}$$
Here $z_{1-\frac{\alpha}{2}}$ is the $1-\frac{\alpha}{2}$ quantile of a standard normal distribution (more elaborate schemes could of course be used). This has the advantage of accounting for sample size directly and being simple to compute, but again, I think it is quite ad hoc. [There was a fairly famous blog post by someone using such an approach for his internet company; unfortunately I cannot find it right now. Maybe someone who reads this can point me to that post.]
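A minimal sketch of this ranking rule, assuming the raw scores per entity are held in a named list (the example data and $\alpha = 0.05$ are arbitrary choices for illustration):

```r
rank_by_lower_bound <- function(scores, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  lower <- sapply(scores, function(x) mean(x) - z * sd(x) / sqrt(length(x)))
  sort(lower, decreasing = TRUE)  # rank on the lower confidence bound
}

scores <- list(a = rnorm(5, 10), b = rnorm(200, 9.8), c = rnorm(50, 10.1))
rank_by_lower_bound(scores)  # entity "a" is penalized for its small sample
```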
Some intuitive re-expression of the problem:
You could look at the quantile function of the standardized squared difference from the mean, $\chi = \left(\frac{|X-\mu|}{\sigma}\right)^2$.
Let
$$F(\chi) = P\left(\left(\frac{|X-\mu|}{\sigma}\right)^2 < \chi \right)$$
Then the quantile function that we speak about is the inverse
$$Q(p) = \lbrace \chi: F(\chi) = p \rbrace$$
This quantile function is monotonically increasing and integrates to $1$, since $\int_0^1 Q(p)\,dp = \mathbb E[\chi] = \mathbb E\left[\frac{(X-\mu)^2}{\sigma^2}\right] = 1$.
[Figure: the quantile function $Q(p)$ for the normal distribution, taken from https://math.stackexchange.com/a/3781761/466748]
From this view the kurtosis is equal to
$$ \text{kurtosis} = \int_0^1 Q(p)^2\, dp,$$
since this integral is simply $\mathbb E[\chi^2] = \mathbb E\left[\frac{(X-\mu)^4}{\sigma^4}\right]$.
And your concept is the point $p$ where $Q(p) = 1$, or equivalently
$$ Z = \int_0^1 \mathbb{1}_{Q(p)\leq 1} dp$$
Where $\mathbb{1}$ is the indicator function.
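As a quick numerical check, here is a sketch for a standard normal $X$ (so $\chi = X^2$ follows a chi-squared distribution with one degree of freedom); the grid approximation of the integrals is an expedient, not part of the definition:

```r
Q <- function(p) qchisq(p, df = 1)  # quantile function of chi for X ~ N(0,1)

p <- seq(0, 1, length.out = 1e5 + 1)
p <- p[-c(1, length(p))]            # drop the endpoints 0 and 1

Z        <- mean(Q(p) <= 1)         # ~0.6827 = P(|X| <= sigma)
kurtosis <- mean(Q(p)^2)            # ~3, the kurtosis of the normal
```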
Your measure computes how much of the probability mass of the standardized values lies close to the mean versus spread out away from it. In some ways this is similar to kurtosis, but kurtosis assigns much more weight to extreme values.
Your measure is thus similar to kurtosis (from the Greek for "bulging"), but coming up with a term for it may be difficult, since many different shapes can correspond to high or low values of your statistic. Like kurtosis, it is related to peakedness, but it is not exactly the same as peakedness and only correlates with it.
Maybe you shouldn't try to condense this into a particular name, and instead describe it with a few more words. Given its binary nature (values below and above $1\sigma$ count as either $0$ or $1$), you might call this measure "the degree of probability division around $1\sigma$", or (my favorite) the "bulge/tail ratio", a measure of how much probability mass is in the tails and how much in the bulge. Firebug's suggestion in the comments, "probability concentration", is also nice.
Distributions with a high $Z$ have most of their probability mass in a central part; the tails may still have a large influence on the kurtosis and other moment-based quantities, but in terms of probability mass the tails will be small.
Best Answer
Answer edited 9/15/2021:
In his answer to the OP, @whuber claims as follows:
For a distribution with kurtosis $\kappa$, the total density within one SD of the mean lies between $1−1/\kappa$ and $1$, where $\kappa$ is the (non-excess) kurtosis of the distribution.
THIS CLAIM IS FALSE.
The following example shows clearly that @whuber's result is false.
Consider my "Counterexample #1" from here, with $\theta = .001.$ In that counterexample, the kurtosis is $25.5,$ the range $1-1/\kappa$ to $1.0$ is from $0.96$ to $1.0,$ yet the probability within a standard deviation of the mean is $0.5$. These statements can be verified numerically in R.
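The exact construction of the linked counterexample is not reproduced here. As a hedged stand-in, the following sketch builds a hypothetical symmetric five-point family with the same qualitative behavior: an atom at $0$ (probability $.5$), atoms at $\pm b$ just outside one standard deviation, and far atoms at $\pm c$, with $c$ chosen so the variance is exactly $1$. The atom locations and the resulting kurtosis values are assumptions of this sketch, not those of the original counterexample.

```r
central_and_kurtosis <- function(theta, b = 1.1) {
  # atoms: 0 (prob .5), +/-b (prob (.5 - theta)/2 each), +/-c (prob theta/2 each)
  c2 <- (1 - b^2 * (0.5 - theta)) / theta     # c^2 that makes Var(X) = 1
  stopifnot(c2 > 1)                           # the far atoms must sit in the tail
  kurt <- b^4 * (0.5 - theta) + c2^2 * theta  # E[X^4], since sd = 1
  c(central_probability = 0.5, kurtosis = kurt)
}
central_and_kurtosis(1e-3)  # kurtosis ~158,   central probability 0.5
central_and_kurtosis(1e-5)  # kurtosis ~15600, central probability still 0.5
```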
Here is a graph of the counterexample distribution. The dashed vertical lines mark the $\mu \pm \sigma$ limits, between which there is only $0.50$ probability.
You can also illustrate the counterexample using a reproducible data set and summary statistics. The following R code generates $1{,}000{,}000$ samples from the counterexample distribution, a sample size large enough that the "bias corrections" in the sample statistics are negligible. The estimated kurtosis is $26.02$, so the range within which the central probability is supposed to lie is $(1 - 1/26.02, 1) = (.96, 1)$; yet the estimated central probability is $0.4999$.
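The original simulation code is not preserved here; as a sketch, the following runs the same kind of check against the hypothetical stand-in family above (so the estimated kurtosis comes out near $158$ rather than $26.02$), illustrating the verification rather than reproducing it:

```r
theta <- 1e-3; b <- 1.1
c_   <- sqrt((1 - b^2 * (0.5 - theta)) / theta)  # tail atom making Var = 1
vals <- c(0, -b, b, -c_, c_)
prob <- c(0.5, rep((0.5 - theta) / 2, 2), rep(theta / 2, 2))
x <- sample(vals, 1e6, replace = TRUE, prob = prob)

m <- mean(x); s <- sd(x)
mean((x - m)^4) / s^4    # estimated kurtosis (~158 for this stand-in)
mean(abs(x - m) <= s)    # estimated central probability (~0.50)
```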
It is amusing to see just how spectacularly @whuber's result fails. In my Counterexample #1 family of distributions, the kurtosis can tend to infinity, implying, according to @whuber's "result," that the central probability approaches $1.0$. But instead, the central probability stays constant at $0.5$!
One does not need to construct fancy counterexamples to illustrate such spectacular failure of @whuber's claim. Consider the common $T_\nu$ distribution, the Student t distribution with degrees-of-freedom parameter $\nu$. For $\nu > 4$, its mean is zero, its variance is $\sigma^2 = \nu/(\nu -2)$, and its (non-excess) kurtosis is $\kappa = 6/(\nu-4) +3$. In the range $4 < \nu \le 5$, the kurtosis ranges from $9$ to $\infty$, while the probability within $\pm \sigma$ can be calculated numerically, in R notation, as `2 * pt(sqrt(nu / (nu - 2)), df = nu) - 1`.
The following R code and resulting graph show the range claimed by @whuber (dashed black lines), along with the actual central probability (solid red line).
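The original code and figure are not preserved; this is a minimal reconstruction based on the description above (the grid of $\nu$ values and the axis limits are assumptions):

```r
nu      <- seq(4.01, 5, length.out = 400)
kappa   <- 6 / (nu - 4) + 3                          # non-excess kurtosis
lower   <- 1 - 1 / kappa                             # claimed lower bound
central <- 2 * pt(sqrt(nu / (nu - 2)), df = nu) - 1  # actual P(|T| <= sigma)

plot(nu, lower, type = "l", lty = 2, ylim = c(0.7, 1),
     xlab = expression(nu), ylab = "central probability")
abline(h = 1, lty = 2)            # claimed upper bound
lines(nu, central, col = "red")   # actual central probability
```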
Again, there is a spectacular failure of @whuber's claim, in that the claim implies the central probability must be essentially $1.0$ (for $\nu \approx 4$), when in fact it is far less (around $0.77$).
Thus, @whuber's claim is false: The central probability need not lie in @whuber's stated range. In fact, as my Counterexample #1 shows, the central probability need not increase at all with larger kurtosis.
Here are two results that shed additional light on the relation of kurtosis to the center.
Theorem 1. Consider a random variable $X$ (this includes data, via the empirical distribution) having, without loss of generality, mean $0$, variance $1$, and finite fourth moment. Now create a new random variable $X'$ by replacing the mass/density of $p_X$ within $0 \pm 1$ arbitrarily, while maintaining $E(X')=0$ and $Var(X')=1$. Then the difference between the maximum and minimum kurtosis over all such replacements is less than $0.25$.
Theorem 2. Consider a random variable $X$ as in Theorem 1. Now create a new random variable $X'$ by replacing the mass/density of $p_X$ outside of $0 \pm 1$ arbitrarily, while maintaining $E(X')=0$ and $Var(X')=1$. Then the difference between the maximum and minimum kurtosis over all such replacements is unbounded (i.e., infinite).
Thus, moving mass near the center has at most a very small effect on kurtosis, while moving mass in the tails has an unbounded effect.
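As a toy illustration of this asymmetry (not part of the theorems' proofs), consider a hypothetical symmetric three-atom variable with mass at $0$ and at $\pm c$, the tail probability chosen so the variance is $1$; its kurtosis is exactly $c^2$, so rearranging only the tail mass drives the kurtosis to infinity even as the central mass $1 - 1/c^2$ approaches $1$:

```r
tail_atom_summary <- function(c) {
  q <- 1 / (2 * c^2)   # per-atom tail probability giving Var(X) = 1
  c(central_mass = 1 - 2 * q,   # mass at 0, equal to 1 - 1/c^2
    kurtosis = 2 * q * c^4)     # E[X^4] = c^2, since sd = 1
}
tail_atom_summary(2)    # central 0.75,   kurtosis 4
tail_atom_summary(10)   # central 0.99,   kurtosis 100
tail_atom_summary(100)  # central 0.9999, kurtosis 10000
```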
When trying to prove a theorem relating the center of a distribution to its kurtosis, it is very helpful to know in advance what counterexamples to such a theorem may exist.
Good counterexamples are given here.
"Counterexample #1" shows a family of distributions in which the kurtosis increases to infinity, while the mass inside $\mu \pm \sigma$ stays a constant 0.5.
"Counterexample #2" shows a family of distributions where the mass within $\mu \pm \sigma$ increases to 1.0, yet the kurtosis decreases to its minimum.
So the often-made assertion that kurtosis measures “concentration of mass in the center” is obviously wrong.
Many people think that higher kurtosis implies "more probability in the tails." This is not true either: Counterexample #1 shows that you can have higher kurtosis with less total tail probability, provided the tails extend farther out.
Instead, kurtosis precisely measures tail leverage. See "How the kurtosis value can determine the unhealthy event" and "In comparison with a standard gaussian random variable, does a distribution with heavy tails have higher kurtosis?".