Starting from
$$\hat{\beta} = \arg \min_\beta \|X\beta - y\|_2^2 \text{ s.t. } (1-\alpha)\|\beta\|_1 + \alpha\|\beta\|_2^2 \leq t,$$
we can write the Lagrangian of this optimization problem as
$$ \begin{array}{rcl}
L(\beta,\alpha,\lambda) & = & \|X\beta - y\|_2^2 + \lambda \left( (1-\alpha)\|\beta\|_1 + \alpha\|\beta\|_2^2 - t\right) \\
& = & \|X\beta - y\|_2^2 + \lambda (1-\alpha)\|\beta\|_1 + \lambda\alpha\|\beta\|_2^2 - \lambda t,
\end{array}
$$
and we see that this indeed looks like the first problem that you wrote, with parameters $\lambda_1=\lambda (1-\alpha)$ and $\lambda_2=\lambda \alpha$, which leads to the expression of the "elastic" parameter:
$$\alpha = \frac{\lambda_2}{\lambda_1+\lambda_2}.$$
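To spell out the substitution: with $\lambda_1=\lambda(1-\alpha)$ and $\lambda_2=\lambda\alpha$,
$$\frac{\lambda_2}{\lambda_1+\lambda_2} = \frac{\lambda\alpha}{\lambda(1-\alpha)+\lambda\alpha} = \frac{\lambda\alpha}{\lambda} = \alpha.$$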
That being said, to go from this point to Zou and Hastie's assertion that the two problems are equivalent, I admit that I am missing a step or two...
They are indeed equivalent since you can always rescale $\lambda$ (see also @whuber's comment). From a theoretical perspective, it is a matter of convenience but as far as I know it is not necessary. From a computational perspective, I actually find the $1/(2n)$ quite annoying, so I usually use the first formulation if I am designing an algorithm that uses regularization.
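To illustrate the rescaling point numerically, here is a small sketch (using numpy and a generic proximal-gradient/ISTA solver written for the occasion, not any particular package's implementation): minimizing $\frac{1}{2}\|y-X\beta\|_2^2+\lambda\|\beta\|_1$ and minimizing $\frac{1}{2n}\|y-X\beta\|_2^2+\frac{\lambda}{n}\|\beta\|_1$ give the same $\hat\beta$, because the second objective is just the first multiplied by $1/n$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 4
X = rng.normal(size=(n, p))  # made-up data for illustration
y = rng.normal(size=n)

def soft(v, thr):
    # soft-thresholding operator, the proximal map of the l1 norm
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0)

def ista(X, y, lam, scale, iters=3000):
    """Proximal gradient for  min_b  scale * ||y - Xb||_2^2 / 2 + lam * ||b||_1."""
    L = scale * np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = soft(b + scale * (X.T @ (y - X @ b)) / L, lam / L)
    return b

lam = 2.0
b_unnorm = ista(X, y, lam, scale=1.0)        # (1/2)||y - Xb||^2 + lam ||b||_1
b_norm = ista(X, y, lam / n, scale=1.0 / n)  # (1/(2n))||y - Xb||^2 + (lam/n)||b||_1
print(np.max(np.abs(b_unnorm - b_norm)))     # essentially zero: the minimizers coincide
```

The same cancellation happens symbolically: the step size and threshold both pick up the same $1/n$ factor, so the iterates are identical.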
A little backstory: When I first started learning about penalized methods, I got annoyed carrying the $1/(2n)$ around everywhere in my work so I preferred to ignore it -- it even simplified some of my calculations. At that time my work was mainly computational. More recently I have been doing theoretical work, and I have found the $1/(2n)$ indispensable (even vs., say, $1/n$).
More details: When you try to analyze the behaviour of the Lasso as a function of the sample size $n$, you frequently have to deal with sums of iid random variables, and in practice it is generally more convenient to analyze such sums after normalizing by $n$ -- think law of large numbers / central limit theorem (or, if you want to get fancy, concentration of measure and empirical process theory). If you don't have the $1/n$ term in front of the loss, you ultimately end up rescaling something at the end of the analysis, so it's generally nicer to have it there to start with. The $1/2$ is convenient because it cancels annoying factors of $2$ in the analysis (e.g. when you take the derivative of the squared loss term).
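For instance, here is a quick numpy check (a sketch with made-up data) that with the $1/(2n)$ normalization the gradient of the squared loss is simply $-\frac{1}{n}X^T(y-X\beta)$, with no stray factor of $2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)

def loss(b):
    # squared loss with the 1/(2n) normalization discussed above
    return np.sum((y - X @ b) ** 2) / (2 * n)

# the 1/2 cancels the 2 from differentiating the square:
grad = -(X.T @ (y - X @ beta)) / n

# sanity check against a central finite-difference approximation
eps = 1e-6
fd = np.array([(loss(beta + eps * e) - loss(beta - eps * e)) / (2 * eps)
               for e in np.eye(p)])
print(np.allclose(grad, fd, atol=1e-5))  # True
```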
Another way to think of this is that when doing theory, we are generally interested in the behaviour of solutions as $n$ increases -- that is, $n$ is not some fixed quantity. In practice, when we run the Lasso on some fixed dataset, $n$ is indeed fixed from the perspective of the algorithm / computations. So having the extra normalizing factor out front isn't all that helpful.
These may seem like annoying matters of convenience, but after spending enough time manipulating these kinds of inequalities, I've learned to love the $1/(2n)$.
Best Answer
The two formulations are equivalent in the sense that for every value of $t$ in the first formulation, there exists a value of $\lambda$ for the second formulation such that the two formulations have the same minimizer $\beta$.
Here's the justification:
Consider the lasso formulation: $$f(\beta)=\frac{1}{2}||Y - X\beta||_2^2 + \lambda ||\beta||_1$$ Let the minimizer be $\beta^*$ and let $b=||\beta^*||_1$. My claim is that if you set $t=b$ in the first formulation, then the solution of the first formulation will also be $\beta^*$. Here's the proof:
Consider the first formulation $$\min \frac{1}{2}||Y - X\beta||_2^2 \text{ s.t.} ||\beta||_1\leq b$$ Suppose, for contradiction, that this constrained formulation has a solution $\hat{\beta}$ such that $||\hat{\beta}||_1<||\beta^*||_1=b$ (note the strictly less than sign). Since $\beta^*$ is feasible for the constrained problem, $\frac{1}{2}||Y-X\hat{\beta}||_2^2 \leq \frac{1}{2}||Y-X\beta^*||_2^2$, and since $\lambda > 0$ the strict inequality $||\hat{\beta}||_1 < ||\beta^*||_1$ then gives $f(\hat{\beta})<f(\beta^*)$, contradicting the fact that $\beta^*$ is a solution for the lasso. Thus, the solution to the first formulation is also $\beta^*$.
Since $t=b$, the complementary slackness condition is satisfied at the solution point $\beta^*$.
So, given a lasso formulation with $\lambda$, you construct a constrained formulation using a $t$ equal to the value of the $l_1$ norm of the lasso solution. Conversely, given a constrained formulation with $t$, you find a $\lambda$ such that the solution to the lasso will be equal to the solution of the constrained formulation.
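Here is a small numerical illustration of the forward direction (a sketch in numpy on simulated data; the coordinate-descent and projected-gradient solvers below are generic textbook implementations, not from any particular package): solve the lasso at some $\lambda$, set $t=||\beta^*||_1$, solve the constrained problem, and check that the two minimizers coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 5
X = rng.normal(size=(n, p))  # made-up data
y = rng.normal(size=n)
lam = 1.0

def soft(v, thr):
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0)

def lasso_cd(X, y, lam, iters=2000):
    """Coordinate descent for  min_b (1/2)||y - Xb||^2 + lam ||b||_1."""
    b = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()
    for _ in range(iters):
        for j in range(X.shape[1]):
            r = r + X[:, j] * b[j]                  # partial residual without b_j
            b[j] = soft(X[:, j] @ r, lam) / col_sq[j]
            r = r - X[:, j] * b[j]
    return b

def project_l1(v, t):
    """Euclidean projection onto the l1 ball of radius t."""
    if t <= 0:
        return np.zeros_like(v)
    if np.abs(v).sum() <= t:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = idx[u - (css - t) / idx > 0][-1]
    theta = (css[rho - 1] - t) / rho
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0)

def constrained_ls(X, y, t, iters=5000):
    """Projected gradient for  min_b (1/2)||y - Xb||^2  s.t.  ||b||_1 <= t."""
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = project_l1(b + X.T @ (y - X @ b) / L, t)
    return b

beta_pen = lasso_cd(X, y, lam)             # lasso (penalized) solution
t = np.abs(beta_pen).sum()                 # set t = ||beta*||_1, as in the proof
beta_con = constrained_ls(X, y, t)         # constrained solution
print(np.max(np.abs(beta_pen - beta_con))) # tiny: the minimizers coincide
```

(With $n > p$ and a generic $X$ the squared loss is strongly convex, so both problems have unique minimizers and the comparison is well defined.)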
(If you know about subgradients, you can find this $\lambda$ by solving the equation $X^T(y-X\beta^*)=\lambda z^*$, where $z^* \in \partial ||\beta^*||_1$.)
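For an orthonormal design ($X^TX=I$), where the lasso solution is just soft-thresholding of $X^Ty$, this subgradient equation is easy to verify numerically (a sketch with simulated data): $X^T(y-X\beta^*)$ equals $\lambda\,\mathrm{sign}(\beta^*_j)$ on the active coordinates and has magnitude at most $\lambda$ on the inactive ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 4
X, _ = np.linalg.qr(rng.normal(size=(n, p)))  # orthonormal columns: X.T @ X = I
y = rng.normal(size=n)
lam = 0.1

# with orthonormal X, the lasso solution is soft-thresholding of z = X.T @ y
z = X.T @ y
beta_star = np.sign(z) * np.maximum(np.abs(z) - lam, 0)

# subgradient condition: X^T (y - X beta*) = lam * z*, with z* in d||beta*||_1
g = X.T @ (y - X @ beta_star)
active = beta_star != 0
print(np.allclose(g[active], lam * np.sign(beta_star[active])))  # True
print(np.all(np.abs(g) <= lam + 1e-9))                           # True
```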