Sorry, disregard what is below. The LIL gives $\max_{i \le N} |x_i| \approx \sqrt{2 N \log \log N}$ for infinitely many $N$, but for any particular $N$,
$\max_{i \le N} |x_i|$ should be of the order $\sqrt N$.
If you only care about bounds up to a constant factor, then I think you're after the law of the iterated logarithm (LIL). As Cardinal indicated, it's enough to consider the 1-dimensional problem (if you don't care about losing a factor of 2). Moreover,
$$
\max_i|x_i| \le \max_{i,j} |x_i - x_j| \le 2\max_i |x_i|
$$
and so you may as well consider $\max |x_i|$ instead. By the LIL, $\max_{i \le N} |x_i| \sim \sqrt{2 N \log \log N}$ almost surely.
The same argument works if the steps are distributed on the unit circle, since the LIL doesn't require Gaussian variables.
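The correction at the top of this answer can be checked with a quick simulation. The following is only a sketch, assuming a simple $\pm 1$ walk in one dimension; the function names, the seed, and the values of `n_steps` and `n_trials` are illustrative choices, not part of the original argument. It compares the typical size of $\max_{i \le N} |x_i| / \sqrt N$ with the LIL scale $\sqrt{2 \log\log N}$.

```python
import math
import random


def max_abs_walk(n_steps, rng):
    """Return max_{i <= n_steps} |x_i| for a simple +/-1 random walk."""
    position = 0
    running_max = 0
    for _ in range(n_steps):
        position += 1 if rng.random() < 0.5 else -1
        running_max = max(running_max, abs(position))
    return running_max


def scaled_maxima(n_steps, n_trials, seed=0):
    """Sorted samples of max|x_i| / sqrt(n_steps) over independent walks."""
    rng = random.Random(seed)
    return sorted(
        max_abs_walk(n_steps, rng) / math.sqrt(n_steps) for _ in range(n_trials)
    )


n_steps = 2_000
ratios = scaled_maxima(n_steps=n_steps, n_trials=300)
median_ratio = ratios[len(ratios) // 2]
lil_scale = math.sqrt(2 * math.log(math.log(n_steps)))

print(f"median of max|x_i| / sqrt(N): {median_ratio:.3f}")
print(f"LIL scale sqrt(2 log log N):  {lil_scale:.3f}")
```

For a fixed $N$ the median ratio stays of order one, which is the point of the correction: the $\sqrt{2N\log\log N}$ envelope is only reached along a sparse sequence of $N$.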
If you want to try to get the sharp constant, there are also multi-dimensional versions of the LIL available. You can search for them on Google; I don't really know that area...
To answer your questions about median and mode, one can take Alexandre's answer a little further and compute the exact distribution function for the overtake-times.
Note that the overtake-time doesn't depend on $v_1,v_2$ directly, but only on their difference. Call the difference $v$. Now $v$ is the difference of two uniformly distributed random variables on $[0,1]$, so it is supported on $[-1,1]$ with probability density function $1-|v|$. Moreover, since $\theta$ is uniformly distributed we can without loss of generality identify the cases $(v,\theta)$ and $(-v,1-\theta)$ and reduce everything to the following set-up:
- $v$ is distributed on $[0,1]$ with density function $2(1-v)$.
- $\theta$ is uniformly distributed on $[0,1]$.
- The overtake-time is $t=\theta/v$.
Now we can compute the cumulative distribution function for the overtake-time. Indeed, we have $P(t<T) = P(\theta/v<T) = P(\theta < Tv)$, which we can get by the following integral:
$$
P(t<T) = \int_0^1 2(1-v) P(\theta < Tv | v) \,dv.
$$
Since $\theta$ is uniform on $[0,1]$, the probability $P(\theta < Tv | v)$ equals $\min(Tv,1)$. Thus for $T\leq 1$ it equals $Tv$ for all $v\in[0,1]$, so integrating gives $P(t<T) = T/3$, while for $T\geq 1$ we split the integral at $v=1/T$ and find
$$
P(t<T) = \int_0^{1/T} 2(1-v)Tv\,dv + \int_{1/T}^1 2(1-v)\,dv = 1-\frac 1T + \frac 1{3T^2}.
$$
So in the end the cumulative distribution function for the overtake-time is
$$
P(t<T) = \begin{cases}
\frac T3 & T\leq 1, \\
1 - \frac 1T + \frac 1{3T^2} & T \geq 1.
\end{cases}
$$
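The formula can be compared against a direct Monte Carlo estimate. This is a minimal sketch under the reduced set-up above; the names `overtake_cdf`, `sample_overtake_time`, and `empirical_cdf` are made up for illustration. It uses the fact that $|U_1 - U_2|$ for independent uniform $U_1, U_2$ has density $2(1-v)$ on $[0,1]$.

```python
import random


def overtake_cdf(T):
    """Closed-form P(t < T) from the piecewise formula above."""
    if T <= 0:
        return 0.0
    if T <= 1:
        return T / 3
    return 1 - 1 / T + 1 / (3 * T * T)


def sample_overtake_time(rng):
    """Draw one overtake-time t = theta / v in the reduced set-up."""
    v = abs(rng.random() - rng.random())  # density 2(1 - v) on [0, 1]
    theta = rng.random()                  # uniform on [0, 1]
    return theta / v if v > 0 else float("inf")


def empirical_cdf(samples, T):
    """Fraction of samples strictly below T."""
    return sum(1 for t in samples if t < T) / len(samples)


rng = random.Random(1)
samples = [sample_overtake_time(rng) for _ in range(200_000)]

for T in (0.5, 1.0, 2.0, 5.0):
    print(f"T={T}: formula {overtake_cdf(T):.4f}, simulated {empirical_cdf(samples, T):.4f}")
```

The two columns should agree up to sampling noise, on the order of $10^{-3}$ for this sample size.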
The $1/T$ term in the last expression is what makes the mean infinite: differentiating the CDF produces a $1/T^2$ term in the density, and multiplying by $T$ to compute the mean leaves the divergent integral of $1/T$ from $1$ to $\infty$.
As for the median, it looks as though any proximity to $\pi/2$ is just a red herring, because solving for $P(t<T) = 1/2$ yields $T=1 + \frac 1{\sqrt{3}} \approx 1.57735\dots$.
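For reference, the algebra behind this value can be written out as follows (a sketch using only the CDF derived above). Since $P(t<1) = 1/3 < 1/2$, the median lies in the branch $T \geq 1$:
$$
1 - \frac 1T + \frac 1{3T^2} = \frac 12 \iff 3T^2 - 6T + 2 = 0 \iff T = 1 \pm \frac 1{\sqrt 3},
$$
and only the root $T = 1 + 1/\sqrt 3$ satisfies $T \geq 1$.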
This is mainly for reference; I have not found a simple answer outside of the diffusion approximation.
Assume $\alpha\gg 1$. The random walk has unit step size, and the angle $\phi$ of a step with the $x$-axis is given by $\phi=\arctan(h/k)$, with $h,k$ two independent real numbers that are uniformly distributed in the range $(-\alpha,-1)\cup(1,\alpha)$. The corresponding probability distribution $P(\phi)$ in the first quadrant $(0<\phi<\pi/2)$ is given by
$$P(\phi)=\frac{1}{2(\alpha-1)^2}\times\begin{cases}\alpha^2/\sin^2\phi-1/\cos^2\phi&{\rm if}\;\;1<\tan\phi<\alpha\\ \alpha^2/\cos^2\phi-1/\sin^2\phi&{\rm if}\;\;1/\alpha<\tan\phi<1\end{cases}$$
Here is a plot of $P(\phi)$ versus $\phi$ in the first quadrant for $\alpha=10$. (It is repeated identically in the other three quadrants.)
As expected the steps have the largest probability at a 45 degree angle with the $x$ and $y$ axes.
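As a sanity check on the formula for $P(\phi)$, one can integrate it numerically over the first quadrant and evaluate it at $\phi=\pi/4$. The sketch below assumes the density is normalized to one within a single quadrant, which is how the prefactor $1/2(\alpha-1)^2$ works out; `step_angle_density` and `integrate_midpoint` are illustrative names.

```python
import math


def step_angle_density(phi, alpha):
    """First-quadrant density P(phi); zero outside 1/alpha < tan(phi) < alpha."""
    t = math.tan(phi)
    norm = 1.0 / (2.0 * (alpha - 1.0) ** 2)
    if 1.0 <= t < alpha:
        return norm * (alpha**2 / math.sin(phi) ** 2 - 1.0 / math.cos(phi) ** 2)
    if 1.0 / alpha < t < 1.0:
        return norm * (alpha**2 / math.cos(phi) ** 2 - 1.0 / math.sin(phi) ** 2)
    return 0.0


def integrate_midpoint(f, a, b, n=200_000):
    """Plain midpoint rule on [a, b] with n equal subintervals."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


alpha = 10.0
phi_min, phi_max = math.atan(1.0 / alpha), math.atan(alpha)
total = integrate_midpoint(lambda p: step_angle_density(p, alpha), phi_min, phi_max)
peak = step_angle_density(math.pi / 4, alpha)

print(f"integral over the first quadrant: {total:.6f}")
print(f"P(pi/4) = {peak:.6f}, (alpha+1)/(alpha-1) = {(alpha + 1) / (alpha - 1):.6f}")
```

Both branches of the formula agree at $\phi=\pi/4$, where they reduce to $(\alpha+1)/(\alpha-1)$, and the density vanishes at the endpoints $\tan\phi = 1/\alpha$ and $\tan\phi=\alpha$.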
So to find the probability distribution $P(x,y,t)$ after $N=t/\Delta t$ steps we seek the probability distribution of the complex number $z=x+iy$ given by
$z=\sum_{n=1}^{N}\exp(i\phi_n)$ with $\phi_n$, $n=1,2,\ldots N$ drawn independently from the distribution $P(\phi)$. Even for an isotropic distribution of $P(\phi)=$ constant, this has no simple answer (except for the large time limit, when the diffusion approximation holds and $P(x,y,t)$ is just a Gaussian).
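The large-$N$ behaviour can at least be probed by simulation. The following is a rough sketch, not an exact solution: it assumes that $h$ and $k$ receive independent random signs and that `atan2` is used to place $\phi$ in the correct quadrant, and the names `sample_step_angle` and `walk_endpoint` are hypothetical. Because the step distribution is symmetric under sign flips and under swapping $h$ and $k$, the mean step vanishes and one expects $E|z|^2 = N$ with $E\,x^2 = E\,y^2 = N/2$.

```python
import cmath
import math
import random


def sample_step_angle(alpha, rng):
    """Angle of a step with h, k uniform on (-alpha, -1) U (1, alpha)."""
    # Assumption: independent random signs for h and k; atan2 picks the quadrant.
    h = rng.uniform(1.0, alpha) * rng.choice((-1.0, 1.0))
    k = rng.uniform(1.0, alpha) * rng.choice((-1.0, 1.0))
    return math.atan2(h, k)


def walk_endpoint(n_steps, alpha, rng):
    """Endpoint z = sum_n exp(i phi_n) of a unit-step walk."""
    return sum(cmath.exp(1j * sample_step_angle(alpha, rng)) for _ in range(n_steps))


rng = random.Random(2)
n_steps, n_trials, alpha = 100, 2_000, 10.0
endpoints = [walk_endpoint(n_steps, alpha, rng) for _ in range(n_trials)]

mean_sq = sum(abs(z) ** 2 for z in endpoints) / n_trials
mean_x_sq = sum(z.real**2 for z in endpoints) / n_trials
mean_y_sq = sum(z.imag**2 for z in endpoints) / n_trials

print(f"E|z|^2 ~ {mean_sq:.1f} (N = {n_steps})")
print(f"E x^2 ~ {mean_x_sq:.1f}, E y^2 ~ {mean_y_sq:.1f} (N/2 = {n_steps / 2})")
```

The second moments only pin down the Gaussian of the diffusion approximation; any anisotropy of $P(\phi)$ would show up in higher-order statistics at finite $N$.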