Edit May 9 — high-level summary of the issue here. $R$ gives a good proxy for estimating collision time, with a slight undercount. Random matrices and graphs give distributions with longer time until collision than what you'd expect by looking at their values of $R$
Suppose I do IID draws from multinomial distribution with probabilities $p_1,\ldots,p_n$. Is there a nice approximation for the $x(p)$, the expected number of draws from $p$ until collision, ie, drawing some $i$ more than once?
In the case of $p_1=p_2=\dots=p_n$, this was shown to be the following
$$x(p)=\sqrt{\frac{\pi n}{2}}$$
From simulations, the following appears to be a lower bound
$$x(p)\approx \sqrt{\frac{\pi R}{2}}$$
where $R=\frac{1}{\|p\|^2}$ and $\|p\|^2=p_1^2+\ldots+p_n^2$ is the probability of observing a collision in two IID draws from $p$.
Distributions were taken to be of the form $p_i\propto i^{-c}$ with $c$ varying
Best Answer
An approximation
Let $X_i$ ($i=1,2 \cdots $) be the result of $i-$th draw, let $C_{i,j}\equiv X_i = X_j$ ($i<j)$ be the colission event for the pair $(i,j)$. Let the event $D_k = \cap_{i<j\le k} C_{i,j}$ correspond to having all first $k$ draws different (no collisions); let $d_k = P(D_k)$. Clearly, $d_1=1$ , $d_{n+1}=0$.
Let $F_k$ be the event of having our first collision at draw $k$, and let $f_k=P(F_k)$
Then $$\begin{align} f_k &=\sum_{i=1}^{k-1} P(X_k = X_i\cap D_{k-1}) \\ &= P(D_{k-1})\sum_{i=1}^{k-1} P(X_k = X_i | D_{k-1}) \\ &= d_{k-1} \, (k-1) P(X_k = X_1 | D_{k-1}) \\ \tag{1} & \approx d_{k-1} \, (k-1) P(X_k = X_1 ) \\ &= d_{k-1} \, (k-1) \, \alpha \\ \end{align}$$
where $\alpha = \sum_{m=1}^n p_m^2$. We have $f_1=0$ and $f_2 = \alpha$.
Also: $$\begin{align} d_k &= d_{k-1} -f_k \\ \tag{2} \end{align}$$
This system is solved by
$$ d_k = \prod_{j=1}^{k-1} (1- j \alpha) \\ \tag{3}$$
(Actually, we need to truncate the product so that $d_k \ge 0$)
Finally, the expected time for next colission would be $$x(p) = 2 + \sum_{k=2}^n d_k = 2 + \sum_{k=2}^n \prod_{j=1}^{\min(k-1,1/\alpha)} (1- j \alpha)$$
Update: This approach does not seem promising. Actually $P(X_k = X_1 | D_{k-1}) \approx \alpha$ is only valid near $k=2$ ; at the other extreme $P(X_{n+1} = X_1 | D_{n}) = 1/n$.
Alternatively, one might resort to Newton's identities, noticing that our $d_k$ correspond to $ k! e_k$ in that page. This allows to compute $d_k$ (and hence $x(p)$) numerically, from $\alpha$ and the other power sums (which correspond to $p_k$ there) - as in JimB's answer. But, again, this does not seem to provide much insight about the asymptotics or approximations.
Added: another way to get the numerical exact solution, probably useless (numerically unstable), using Poissonization and numerical depoissonization.
Matlab/Octave code:
This gives $x(p) = 4.6333757$