Yes, you can use a Cholesky factorization here, by attacking the so-called normal equations (that you offer above) directly. First, form the $n\times n$ matrix $Z=X^TWX$, then compute its Cholesky factorization $R^TR=Z$, where $R$ is upper triangular. Then $\beta=R^{-1}R^{-T}X^TWY$.
This is generally frowned upon, because forming $X^TWX$ squares the condition number of $W^{1/2}X$, making it less numerically stable than the approaches below. But it is often the cheapest, and if the normal equations are well-conditioned, it's fine. Indeed, many convex optimization methods solve normal equations within their iterative algorithms.
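To make this concrete, here is a minimal NumPy/SciPy sketch of the normal-equations approach; the matrix `X`, the weight vector `w` (the diagonal of $W$), and the right-hand side `Y` are made-up placeholder data.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Placeholder problem data: X is m x n, w holds the diagonal of W, Y has length m.
rng = np.random.default_rng(0)
m, n = 100, 5
X = rng.standard_normal((m, n))
w = rng.uniform(0.5, 2.0, size=m)
Y = rng.standard_normal(m)

# Form Z = X^T W X and the right-hand side X^T W Y without ever building the m x m W.
Z = X.T @ (w[:, None] * X)
rhs = X.T @ (w * Y)

# Cholesky-factor Z and solve Z beta = X^T W Y.
c_and_lower = cho_factor(Z)
beta = cho_solve(c_and_lower, rhs)
```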
The most commonly recommended "stable" method for least squares uses the QR factorization $W^{1/2}X=QR$, where $W^{1/2}$ satisfies $W^{1/2}W^{1/2}=W$. Since $W$ is diagonal, $W^{1/2}$ is easy to form (in fact, you never need $W$ again if you stick with $W^{1/2}$). Note that $R$ here is the same as the Cholesky factor above, but this is a more stable, albeit more expensive, way to compute it. So again we have $\beta=R^{-1}R^{-T}X^TWY$. But since $X^T=R^TQ^TW^{-1/2}$, this simplifies to $\beta=R^{-1}Q^TW^{1/2}Y$.
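A sketch of the same solve via the thin QR factorization, on the same placeholder data as above:

```python
import numpy as np
from scipy.linalg import solve_triangular

# Same placeholder data as in the previous sketch.
rng = np.random.default_rng(0)
m, n = 100, 5
X = rng.standard_normal((m, n))
w = rng.uniform(0.5, 2.0, size=m)
Y = rng.standard_normal(m)

sqrt_w = np.sqrt(w)                        # diagonal of W^{1/2}
Q, R = np.linalg.qr(sqrt_w[:, None] * X)   # thin QR of W^{1/2} X
beta = solve_triangular(R, Q.T @ (sqrt_w * Y))  # beta = R^{-1} Q^T W^{1/2} Y
```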
For even more numerical stability, particularly if $W^{1/2}X$ has a high condition number, you may want to use column pivoting; that is, $QRP^T=W^{1/2}X$, where $P$ is a permutation matrix chosen by the standard column pivoting process. Then $\beta=PR^{-1}Q^TW^{1/2}Y$.
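SciPy exposes the column-pivoted factorization directly; a sketch, again on placeholder data (note that `scipy.linalg.qr` returns the pivot as an index array rather than as the matrix $P$):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

rng = np.random.default_rng(0)
m, n = 100, 5
X = rng.standard_normal((m, n))
w = rng.uniform(0.5, 2.0, size=m)
Y = rng.standard_normal(m)

sqrt_w = np.sqrt(w)
# Column-pivoted thin QR: (W^{1/2} X)[:, piv] = Q R.
Q, R, piv = qr(sqrt_w[:, None] * X, mode="economic", pivoting=True)

# Solve for the permuted coefficients, then undo the permutation (apply P).
z = solve_triangular(R, Q.T @ (sqrt_w * Y))
beta = np.empty(n)
beta[piv] = z
```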
One way to improve the accuracy of the results is with iterative refinement (look it up!). This is particularly useful for the normal equations approach, but it can be employed in all three options above. It is inexpensive compared to the initial solve step, since it can re-use the Cholesky or QR factorization.
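A sketch of what refinement looks like for the normal-equations variant (one correction step per iteration, reusing the Cholesky factorization; in practice the residual is best accumulated in higher precision):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
m, n = 100, 5
X = rng.standard_normal((m, n))
w = rng.uniform(0.5, 2.0, size=m)
Y = rng.standard_normal(m)

Z = X.T @ (w[:, None] * X)
rhs = X.T @ (w * Y)
c_and_lower = cho_factor(Z)           # factor once...
beta = cho_solve(c_and_lower, rhs)

for _ in range(2):                    # ...then reuse it for each refinement step
    r = rhs - Z @ beta                # residual of the normal equations
    beta += cho_solve(c_and_lower, r) # solve for the correction and update
```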
I do not believe there is an easy way to reuse a factorization of $X$ or $Y$ to speed things up.
The definition of the SVD wants $U$ to be square (and unitary); if the $U$ you get from the first part of the procedure is not square, you need to append orthonormal columns (completing its columns to an orthonormal basis) to make it square and unitary.
There's nothing inherently wrong with writing $A=U\Sigma V^*$ with a non-square $U$; this is usually called the thin (or reduced) SVD. It is just not what the full SVD promises to give as a result.
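In NumPy this is exactly the difference between the `full_matrices` settings of `np.linalg.svd`, illustrated here on an arbitrary $5\times 3$ matrix:

```python
import numpy as np

A = np.random.default_rng(0).standard_normal((5, 3))

# Full SVD: U is 5x5 and unitary, as the definition demands.
U_full, s, Vt = np.linalg.svd(A, full_matrices=True)

# Thin SVD: U is 5x3 -- orthonormal columns, but not square and not unitary.
U_thin, s_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)

# A is reconstructed exactly either way.
assert np.allclose(U_thin * s_thin @ Vt_thin, A)
```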
Yes. Choose orthogonal matrices $U$ and $V$, and a diagonal matrix $\Sigma$ with positive and distinct diagonal entries. Explicitly form the product $A = U \Sigma V^T$ and run this matrix through your solver. Note that the factors need not be square as long as the product is defined. Verify that you recover the singular values and the directions of the singular vectors (each singular vector is determined only up to sign). Repeat with a different choice of $U$, $V$, and $\Sigma$. By changing the diagonal entries of $\Sigma$ you can generate matrices which are either well-conditioned or ill-conditioned. Singular values which are not distinct are a subject that can wait until another day.
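A minimal sketch of such a test in NumPy/SciPy, assuming `scipy.stats.ortho_group` for the random orthogonal factors and `np.linalg.svd` as the solver under test; since singular vectors are only determined up to sign, the check allows a simultaneous flip of each left/right pair:

```python
import numpy as np
from scipy.stats import ortho_group

m, n = 6, 4
U = ortho_group.rvs(m, random_state=0)      # random m x m orthogonal matrix
V = ortho_group.rvs(n, random_state=1)      # random n x n orthogonal matrix
sigma = np.array([10.0, 5.0, 1.0, 0.1])     # distinct, positive, descending

S = np.zeros((m, n))
S[:n, :n] = np.diag(sigma)
A = U @ S @ V.T                             # its SVD is known by construction

U2, s2, Vt2 = np.linalg.svd(A)
assert np.allclose(s2, sigma)               # singular values recovered
for k in range(n):                          # vectors recovered up to sign
    assert (np.allclose(U2[:, k],  U[:, k]) and np.allclose(Vt2[k],  V[:, k])) \
        or (np.allclose(U2[:, k], -U[:, k]) and np.allclose(Vt2[k], -V[:, k]))
```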