The expected value of the minimum distance is the sum over $d=0,1,2,\dots$ of the probability that the minimum distance is greater than $d$. Unfortunately, it's hard to determine these probabilities exactly: whether the probability at a given $d$ is greater than $0$ is precisely the question of whether a code of that size and distance exists.
There are some easy bounds on the probabilities which translate to bounds on the expected minimum Hamming distance. Take any ordering of the points, and check whether each point is consistent with the previous points. For the upper bound: if all pairwise distances exceed $d$, the balls of radius $\lfloor d/2 \rfloor$ about the previous points must be disjoint, and the current point must not be contained in any of them. For the lower bound: each previous point rules out at most the points in a ball of radius $d$ about it.
Let $B(L,d,q) = \sum_{i=0}^d {L \choose i}(q-1)^i$ be the size of the ball of radius $d$ with an alphabet of size $q$. Let $Pr(L,d,q,P)$ be the probability that $P$ random points are at distances strictly greater than $d$ from each other.
$$ \prod_{i=1}^{P-1} \max(0,1-\frac{iB(L,d,q)}{q^L}) \le Pr(L,d,q,P) \le \prod_{i=1}^{P-1} \max(0,1-\frac{iB(L,\lfloor d/2 \rfloor, q)}{q^L}).$$
The upper bound on the expected minimum Hamming distance from summing the probabilities is precisely twice the lower bound, since $\lfloor d/2 \rfloor$ takes each value $r$ exactly twice (at $d=2r$ and $d=2r+1$), so these bounds get you within a factor of $2$. Actually, that upper bound on the probabilities is so weak that a better bound in many situations is $Pr(L,d,q,P) \le (1-B(L,d,q)/q^L)^{P-1}$, which you get by requiring the second and subsequent points to avoid the ball of radius $d$ about the first point.
For example, suppose you take $q=2, L=20, P=10.$ The lower bound on the expected minimum Hamming distance is $5.10246$. If you use the second upper bound when it is better, the upper bound on the expected minimum distance is $6.71209.$
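Here is a short Python sketch (my addition, not from the original answer) that evaluates both bounds; with $q=2$, $L=20$, $P=10$ it reproduces the figures above.

```python
# Evaluate the lower and upper bounds on the expected minimum distance.
from math import comb

def ball(L, d, q):
    # Size of a Hamming ball of radius d in strings of length L over
    # an alphabet of size q.
    return sum(comb(L, i) * (q - 1) ** i for i in range(d + 1))

def lower_pr(L, d, q, P):
    # Lower bound on Pr(L, d, q, P): each previous point rules out
    # at most a ball of radius d.
    p = 1.0
    for i in range(1, P):
        p *= max(0.0, 1.0 - i * ball(L, d, q) / q**L)
    return p

def upper_pr(L, d, q, P):
    # Upper bound on Pr(L, d, q, P): the better of the disjoint-balls
    # bound and the avoid-the-first-ball bound.
    p = 1.0
    for i in range(1, P):
        p *= max(0.0, 1.0 - i * ball(L, d // 2, q) / q**L)
    return min(p, (1.0 - ball(L, d, q) / q**L) ** (P - 1))

q, L, P = 2, 20, 10
print(sum(lower_pr(L, d, q, P) for d in range(L + 1)))  # ~5.10246
print(sum(upper_pr(L, d, q, P) for d in range(L + 1)))  # ~6.71209
```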
One useful trick that sometimes comes in handy (I originally saw it in this paper of Talagrand, though it may go further back): we can view a random permutation in $S_n$ as being generated as follows. Start with the identity permutation, and successively perform transpositions of the form $(n, a_n)$, then $(n-1, a_{n-1})$, and so on down to $(2, a_2)$, where each $a_j$ is chosen uniformly from the numbers between $1$ and $j$ (with no transposition if $a_j=j$).
The point is that the $a_j$ are now independent. So if we have a situation where each individual $a_j$ has little impact on the sum of the $X_i$ or $Y_i$, then we can apply concentration bounds for independent variables. For example, changing an individual $a_j$ impacts at most $3$ positions in the final permutation (wherever $j$, the old $a_j$, and the new $a_j$ end up), so it can change $X_1+\dots+X_n$ by at most $3$ (or by at most $1$, if you're more careful), and McDiarmid's inequality immediately gives exponential concentration on the sum.
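A quick Python sketch of the trick (my illustration; the statistic counted at the end, the number of fixed points, is my own choice for demonstration, not something from the original discussion):

```python
# Build a uniform random permutation from independent choices a_j via the
# transpositions (n, a_n), ..., (2, a_2) -- i.e. a Fisher-Yates shuffle.
import random

def permutation_from_choices(a, n):
    # a[j] = a_j is uniform on {1, ..., j}; keys run over j = 2, ..., n.
    perm = list(range(1, n + 1))  # start with the identity
    for j in range(n, 1, -1):     # apply (j, a_j) for j = n, n-1, ..., 2
        aj = a[j]
        perm[j - 1], perm[aj - 1] = perm[aj - 1], perm[j - 1]
    return perm

n = 1000
a = {j: random.randint(1, j) for j in range(2, n + 1)}  # independent!
perm = permutation_from_choices(a, n)

# Illustrative statistic: the number of fixed points. Changing one a_j
# moves at most 3 positions, so McDiarmid's inequality applies.
print(sum(1 for i in range(1, n + 1) if perm[i - 1] == i))
```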
Best Answer
Let $u_k$ be the number of variables with value exactly $k$. If you pick a distribution on the $u_k$ such that $E[u_k]=1$, $E[u_k^2] = 2-1/n$, $E[u_ku_l] = 1-1/n$ for $k \neq l$, and $\sum_{k=1}^n u_k$ is always $n$, then by choosing a random $(u_1, \dots ,u_n)$ according to the distribution and then choosing a random ordering of the variables, you get a random set of variables with the desired pairwise independence property.
On the other hand, if you do the same thing for $n+1$ pairwise independent random variables, you get $E[u_k]=1+1/n$, $E[u_k^2] = 2+2/n$, $E[u_k u_l] = 1+1/n$, with $\sum_{k=1}^n u_k$ always equal to $n+1$. Here is a way to construct a distribution on the $u_k$ that satisfies these conditions; to get a distribution for $n$ pairwise independent random variables, we just delete the last variable. The formula is:
With probability $\frac{1+1/n}{k(k+1)}$, we have $u_k=k+1$, $u_l =0$ for $l <k$, and $u_l=1$ for $l>k$. To compute the three moments, we ignore a factor of $1+1/n$:
Mean:
$$ \frac{(k+1)}{k(k+1)} + \sum_{l < k} \frac{1}{l(l+1) } = \frac{1}{k} + 1- \frac{1}{k} = 1 $$
Mean of square:
$$ \frac{(k+1)^2}{k(k+1)} + \sum_{l < k} \frac{1}{l(l+1)} = \frac{k+1}{k} + 1 - \frac{1}{k} = 2 $$
The mean of the product $u_k u_l$ is the same as the mean, since if $l<k$ then $u_k=1$ whenever $u_l>0$, so $u_k u_l = u_l$.
Then put the factor of $1+1/n$ back in to get the required moments.
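As a sanity check (my addition, not part of the original answer), here is a Python sketch that samples the distribution above and estimates the three moments empirically:

```python
import random

def sample_u(n):
    # With probability (1 + 1/n) / (k(k+1)): u_k = k+1, u_l = 0 for l < k,
    # and u_l = 1 for l > k.
    r = random.random() * n / (n + 1)  # rescale away the (1 + 1/n) factor
    total = 0.0
    for k in range(1, n + 1):
        total += 1 / (k * (k + 1))
        if r < total or k == n:  # k == n guards against float round-off
            return [0] * (k - 1) + [k + 1] + [1] * (n - k)

n, trials = 10, 200_000
s = [sample_u(n) for _ in range(trials)]
print(sum(x[0] for x in s) / trials)         # E[u_1]     ~ 1 + 1/n = 1.1
print(sum(x[0] ** 2 for x in s) / trials)    # E[u_1^2]   ~ 2 + 2/n = 2.2
print(sum(x[0] * x[1] for x in s) / trials)  # E[u_1 u_2] ~ 1 + 1/n = 1.1
```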
The event indexed by $k$ makes the minimum value exactly $k$, so the expected value of the minimum of the $n+1$ variables is $\sum_{k=1}^n k \cdot \frac{1+1/n}{k(k+1)} = \sum_{k=1}^n \frac{1+1/n}{k+1} \approx \log n$. Removing a variable won't make the minimum any smaller, so the $n$-variable construction inherits the same lower bound.
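A one-line numerical check of this sum (again my addition):

```python
from math import log

n = 1000
expected_min = sum((1 + 1 / n) / (k + 1) for k in range(1, n + 1))
print(expected_min, log(n))  # ~6.49 vs. log(1000) ~ 6.91: log n up to
                             # an additive constant
```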