Let $X_1, X_2, \dots, X_k$ each be random digits. That is, they are independent random variables each uniformly distributed over the finite set $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$. Let $S = X_1 + X_2 + \dots + X_k$. Given some large integer $n$, what is the probability that $S = n$?
When $n$ or $k$ is small, the exact number can be computed as
$$[z^n]\left(\frac{1-z^{10}}{1-z}\right)^k = \frac{1}{10^k} \sum_{10r+s=n} (-1)^{r+s} \binom{k}{r} \binom{-k}{s} = \frac{1}{10^k} \sum_{r} (-1)^{r} \binom{k}{r} \binom{k+n-10r-1}{n-10r}$$
(see the long list of questions below to see derivations) but what I need is an asymptotic expression useful for large $n$ and $k$, and I don't know how to either derive one from this formula, or arrive at one independently.
(For now, I'm not worrying about the complication of insisting that $X_1$ be nonzero, though feel free to consider it if it actually helps.)
What I have tried, part 1: As the $X_i$s are IID random variables (with mean $\mu = 4.5$ and variance $\sigma^2 = 8.25$), the central limit theorem applies, so we expect $\Pr(S = n)$ to be highest for $n$ around $4.5k$, and the probability distribution of $S$ to be bell-curve-shaped around that value (and most of the probability will be distributed for $n$ about $O(\sqrt{k})$ from that value).
Trying to make this more precise, the central limit theorem gives us
$$\sqrt{k}(S/k – 4.5) \xrightarrow{d} N(0,8.25) \quad \text{i.e.} \quad \lim _{k\to \infty}\Pr \left[\sqrt{k}(S_{k}/k- 4.5)\le z\right]= \Phi\left(\frac {z}{\sqrt{8.25}}\right)$$
where $\Phi(x) = \frac12 \left[1+\operatorname{erf} \left(\frac{x}{\sqrt {2}}\right)\right]$ is the CDF of the standard normal distribution $N(0,1)$ (and erf is a special function) and therefore
$$\Pr(S \le x) = \Pr(S/\sqrt{k} – 4.5\sqrt{k} \le x/\sqrt{k} – 4.5\sqrt{k}) \to \Phi\left(\frac{x – 4.5k}{\sqrt{8.25k}}\right)$$
and so, applying a continuity correction,
$$\begin{align}\Pr(S = n) &\approx \Pr(n – 0.5 < S \le n + 0.5) \\
&\to \Phi\left(\frac{n + 0.5 – 4.5k}{\sqrt{8.25k}}\right) – \Phi\left(\frac{n – 0.5 – 4.5k}{\sqrt{8.25k}}\right) \\
&= \frac12\operatorname{erf}\left(\frac{2n+1-9k}{\sqrt{66k}}\right) – \frac12\operatorname{erf}\left(\frac{2n-1-9k}{\sqrt{66k}}\right) \\
&\overset{?}{\approx} \frac{2}{\sqrt{\pi}}\frac{1}{\sqrt{66k}}\exp\left(-\left(\frac{2n-9k}{\sqrt{66k}}\right)^2\right)
\end{align}$$
but I neither know whether this is correct and rigorous, nor what to do with this next.
What I have tried, part 2: There are many questions on this site about calculating this number exactly:
-
how many integers between one and 100000 have the sum equal to fifteen?
-
How many numbers between 100 and 900 have sum of their digits equal to 15?
-
Stars and bars to find "how many $x$ digit numbers are there with sum of digits $y$"?
-
How many natural numbers are there less than $90000$ that have the sum of digits equal to $8$?
-
How many numbers between $1$ and $9999$ have sum of their digits equal to $8$? $16$?
-
How many numbers between $0$ and $999,999$ are there whose digits sum to $r$
-
How many positive integers less than 1,000,000 have the sum of their digits equal to 19?
-
How many positive integers < 10^6 have sum of digits equal to 19
-
For how many integers from 1 through 99,999 is the sum of their digits equal to 10?
-
If I have an integer, how many numbers are there whose digits sum to the integer?
-
Find number of positive integers less than $10^8$ with digit sum of $24$
-
Find the number of positive integers whose digits add up to 42
The best that can be got from reading all of them is the exact formula mentioned at the top of the question (done with generating functions or inclusion-exclusion), not an asymptotic one. In particular I'd like to be able to get something which can be summed over $k$, to answer the following question:
Question 2: Let $X_{i,j}$ each be IID random digits as before, for $i = 1, 2, \dots$ and $j = 1, 2, \dots, i$. Let $S_k = X_{k,1} + X_{k,2} + \dots + X_{k,k}$ be the sum of $k$ random digits. So we have an infinite sequence of random variables (sums) $S_1, S_2, S_3, \dots$ each obtained by adding the digits of a random number of a different length. Given an integer $n$, what is the probability that some element of this sequence is exactly equal to $n$?
In other words, for each $k$ there is a probability distribution over $n$, and we want to know the total probability that falls on a single integer $n$. (Basically for each $n$ there will be some significant probability for $k$ around $n/4.5$ and the probability will fall off significantly for $k$ further away from this.)
(Again, feel free to add or remove the condition that $X_{k,1}$ is actually distributed over the set $\{1, 2, \dots, 9\}$, i.e. is nonzero.)
What I have tried, part 3: I tried to read Distribution of the sum-of-digits function of random integers: A survey (which I found by searching for some relevant terms), and many of the papers it references. But I got pretty lost trying to figure out what is true for base-$2$ versus base-$10$, and things like that. Perhaps the answer to my question is buried somewhere in there, but I'm not sure.
Best Answer
Here we show the main term $\frac{2}{\sqrt{\pi}}\frac{1}{\sqrt{66k}}$ in OPs asymptotic expansion is correct using the saddle-point method.
The coefficients in (2) form a sequence starting with $(1,10,670, 55\,252,4\,816\,030,\ldots)$. These numbers are called Lucky ticket numbers and are stored in OEIS. In fact they are the even central decanomial coefficients which are the largest coefficients in the expansion of (1).
We can prove the asymptotic expansion (3) using the saddle-point method. In order to do so we closely follow section VIII.8 Large Powers in Analytic Combinatorics by P. Flajolet and R. Sedgewick. We also give a little bit of surrounding information to ease readability.
In our situation (3) we have $A(z)=1$ and $B(z)=\left(1+z+z^2+\cdots+z^9\right)^2$ with $K=9k$. What follows are conditions which need to be fulfilled by $A(z)$ and $B(z)$ in order to apply the saddle-point method.
We observe conditions L1 to L3 are fulfilled for $A(z)=1$ and $B(z)=\left(1+z+z^2+\cdots+z^9\right)^2$ with radius of convergence $R=\infty$. In the following we need the quantity $T$ called spread which is defined as
\begin{align*} \color{blue}{T}&:=\lim_{z\to R^{-}}\frac{zB^{\prime}(z)}{B(z)}\\ &=\lim_{z\to \infty}\frac{2z\left(1+z+\cdots+z^9\right)\left(1+2z+\cdots+9z^8\right)}{\left(1+z+\cdots+z^9\right)^2}\\ &=\lim_{z\to\infty}\frac{2z\left(1+2z+\cdots+9z^8\right)}{1+z+\cdots+z^9}\\ &\,\,\color{blue}{=18} \end{align*}
The purpose is to analyse the coefficients \begin{align*} [z^K]A(z)\cdot B(z)^k \end{align*} when $K$ and $k$ are linearly related. In order to do so the condition $K<Tk$ will be imposed which is inherent in the nature of the problem. Note that in our case we have $K=9k$ and with $T=18$ the condition $K<Tk$ evaluates to $9k<18k$ which is valid.
We also need a quantity $\zeta$ which is introduced in Proposition VIII.7 and since this is a useful and easily derived upper bound for the coefficients we note
We start with calculating the roots of (4). We set $\lambda =9$ and obtain according to (4) \begin{align*} z\frac{B^{\prime}(z)}{B(z)}&=9\\ \end{align*} which gives with $B(z)=\left(1+z+\cdots+z^9\right)^2$: \begin{align*} \color{blue}{9z^9+7z^8+5z^7+3z^6+z^5-z^4-3z^3-5z^2-7z-9=0} \end{align*} from which we easily derive the positive root $\color{blue}{\zeta =1}$.
We find as upper bound according to (5) \begin{align*} [z^{9k}]\left(\frac{1-z^{10}}{1-z}\right)^{2k}\leq \left(\sum_{k=0}^91\right)^{2k}\cdot 1^{-9k}=10^{2k} \end{align*} This upper bound isn't really sharp but it may be useful whenever only rough order of magnitude estimates are sought.
Now we are well prepared for the main theorem.
Now it's time to harvest. At first we calculate $\xi$ according to (7). We obtain with $B(z)=\left(1+z+\cdots+z^9\right)^{2}$ and $\lambda=9$: \begin{align*} \color{blue}{\xi}&=\left.\left(\frac{d^2}{dz^2}\left(\log (B(z)-9\log z\right)\right)\right|_{z=1}\\ &=\left.\left(\frac{d}{dz}\left(\frac{B^{\prime}(z)}{B(z)}-\frac{9}{z}\right)\right)\right|_{z=1}\\ &=\frac{B^{\prime\prime}(1)}{B(1)}-\left(\frac{B^{\prime}(1)}{B(1)}\right)^2+9\\ &=\frac{177}{2}-81+9\\ &\,\,\color{blue}{=\frac{33}{2}}\tag{8} \end{align*}