Short answer: the minimal polynomial of $C$ is the monic lcm of the minimal polynomials of $A$ and $B$. And a square matrix is diagonalizable if and only if its minimal polynomial splits over the field under consideration (which is automatic over $\mathbb{C}$, of course) with only simple roots. In other words, as pointed out by DonAntonio: if and only if its minimal polynomial is a product of pairwise distinct monic linear factors.
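To see the lcm statement on a toy example (my own, not from the question): with $A=\left(\matrix{1&0\\0&2}\right)$ and $B=\left(\matrix{2&0\\0&3}\right)$, the minimal polynomials are $(X-1)(X-2)$ and $(X-2)(X-3)$, and the minimal polynomial of $C$ is their monic lcm $(X-1)(X-2)(X-3)$, not their product $(X-1)(X-2)^2(X-3)$, which has a double root.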
Now I'll give a detailed argument without explicit use of minimal polynomials.
Fact: a square matrix $M$ with coefficients in a field $K$ is diagonalizable if and only if there exists a nonzero polynomial $p(X)\in K[X]$ which splits over $K$ with simple roots and such that $p(M)=0$.
Proof: if $M$ is diagonalizable and if $\{\lambda_1,\ldots,\lambda_k\}$ is the set of its distinct eigenvalues, then $p(X)=(X-\lambda_1)\cdots(X-\lambda_k)$ annihilates $M$ (check it on a basis of eigenvectors). Conversely, if such a polynomial $p(X)$ with the $\lambda_j$ pairwise distinct annihilates $M$, we have (by Bézout, essentially): $K^n=\mbox{Ker } p(M)=\bigoplus_{j=1}^k\mbox{Ker } (M-\lambda_j I_n)$. Diagonalizability follows by concatenating bases of these eigenspaces. QED.
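If you want to see the Fact numerically, here is a small sketch (entirely my own illustration, with an arbitrary matrix $M$; it is not part of the argument):

```python
import numpy as np

# M = S diag(1, 2, 2) S^{-1} is diagonalizable with distinct eigenvalues {1, 2}.
# (S is an arbitrary invertible matrix chosen for illustration.)
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
M = S @ np.diag([1.0, 2.0, 2.0]) @ np.linalg.inv(S)

# p(X) = (X - 1)(X - 2) splits with simple roots and annihilates M.
I = np.eye(3)
print(np.allclose((M - 1.0 * I) @ (M - 2.0 * I), 0.0))  # True
```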
Now for every polynomial $p(X)$, you have
$$
p(C)=\left(\matrix{p(A)&0\\0&p(B)}\right)
$$
This gives you the annoying direction, namely that $C$ diagonalizable implies $A$ and $B$ diagonalizable: if $p(X)$ splits with simple roots and $p(C)=0$, then $p(A)=0$ and $p(B)=0$ as well, so the Fact applies to $A$ and $B$.
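A quick numerical illustration of the blockwise identity (my own sketch; the matrices and the polynomial are arbitrary example data):

```python
import numpy as np
from scipy.linalg import block_diag

# Arbitrary illustrative blocks (made-up example data, not from the question).
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[4.0, 0.0],
              [1.0, 5.0]])
C = block_diag(A, B)

def p(M):
    """Evaluate the (arbitrary) polynomial p(X) = X^2 - 3X + 2 at a square matrix."""
    return M @ M - 3.0 * M + 2.0 * np.eye(M.shape[0])

# p(C) is the block-diagonal matrix with diagonal blocks p(A) and p(B).
print(np.allclose(p(C), block_diag(p(A), p(B))))  # True
```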
The converse is easier. Take $P$ and $Q$ invertible such that $PAP^{-1}$ and $QBQ^{-1}$ are diagonal. Then
$$
R:=\left(\matrix{P&0\\0&Q}\right)
$$
is invertible with
$$
R^{-1}=\left(\matrix{P^{-1}&0\\0&Q^{-1}}\right)\qquad \mbox{and}\qquad RCR^{-1}=\left(\matrix{PAP^{-1}&0\\0&QBQ^{-1}}\right)
$$
is diagonal.
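Here is this construction carried out numerically (a sketch with made-up blocks; numpy's `eig` returns the eigenvector matrix, so $P$ is taken to be its inverse):

```python
import numpy as np
from scipy.linalg import block_diag

# Arbitrary diagonalizable blocks (example data is mine).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # eigenvalues 2, 3
B = np.array([[0.0, 1.0],
              [-2.0, 3.0]])  # eigenvalues 1, 2
C = block_diag(A, B)

# numpy gives A = V_A diag(w_A) V_A^{-1}, so P = V_A^{-1} satisfies P A P^{-1} diagonal.
wA, VA = np.linalg.eig(A)
wB, VB = np.linalg.eig(B)
P, Q = np.linalg.inv(VA), np.linalg.inv(VB)

R = block_diag(P, Q)
D = R @ C @ np.linalg.inv(R)
print(np.allclose(D, np.diag(np.concatenate([wA, wB]))))  # True
```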
Note: you can also do the converse with the fact above. Just take the lcm of the minimal polynomials of $A$ and $B$.
Your claim is true. It suffices to modify the proof in your question slightly. If the algebraic and geometric multiplicities of some eigenvalue $\lambda$ coincide, then every Jordan block for $\lambda$ is $1\times1$, so $A=P\pmatrix{\lambda I&0\\0&M}P^{-1}$ for some change-of-basis matrix $P$ and some submatrix $M$ that is the direct sum of the Jordan blocks for the other eigenvalues. Therefore $u^T=e_1^TP^{-1}$ and $v=Pe_1$ are respectively a left eigenvector and a right eigenvector corresponding to $\lambda$ (check directly that $u^TA=\lambda u^T$ and $Av=\lambda v$), with $u^Tv=e_1^TP^{-1}Pe_1=1\ne0$.
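To make this concrete, here is a small numerical sketch of exactly this construction (the matrices are my own made-up example):

```python
import numpy as np

# A = P (lambda I (+) M) P^{-1}, where lambda = 2 has a 2x2 identity block.
lam = 2.0
J = np.diag([lam, lam, 3.0])           # lambda I direct-summed with M = (3)
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])        # arbitrary invertible change of basis
A = P @ J @ np.linalg.inv(P)

u = np.linalg.inv(P)[0, :]             # u^T = e_1^T P^{-1}
v = P[:, 0]                            # v   = P e_1

print(np.allclose(u @ A, lam * u))     # True: u is a left eigenvector
print(np.allclose(A @ v, lam * v))     # True: v is a right eigenvector
print(u @ v)                           # 1.0, in particular nonzero
```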
$A$ and $A^T$ have identical eigenvalues basically because a matrix $B$ (take $B=\lambda I-A$ in your case) is singular if and only if its transpose $B^T$ is singular. And the geometric reason for the latter is that every linear map on a finite-dimensional vector space can be decomposed into a product of shears, transpositions and a scaling. That is, $B=E_1E_2\cdots E_kDF_\ell\cdots F_2F_1$ where each $E_i$ or $F_j$ is either a shear matrix (an elementary row/column operation for row/column addition) or a transposition matrix (an elementary operation for exchanging two rows/columns) and $D$ is a scaling (diagonal) matrix. As shear and transposition matrices are nonsingular, $B$ is singular iff $D$ is singular; and since transposing gives $B^T=F_1^TF_2^T\cdots F_\ell^TDE_k^T\cdots E_1^T$ with the same diagonal $D$ (transposes of shears and transpositions are again shears and transpositions), $B$ is singular iff $B^T$ is singular.
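A quick numerical spot check of the first sentence (random example matrix, my own sketch):

```python
import numpy as np

# Verify that A and A^T share eigenvalues, up to reordering.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
same = np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(A.T)))
print(same)  # True
```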
Since $A$ has distinct eigenvalues, it can be diagonalized as $A=X\Lambda X^{-1}$, where the $x_i$ are the columns of $X$ and the $\lambda_i$ are the diagonal entries of the diagonal matrix $\Lambda$. Similarly $A^H = Y\Lambda^H Y^{-1}$; taking the conjugate transpose of this gives $A=(Y^H)^{-1}\Lambda Y^H$. Thus, $X\Lambda X^{-1} = (Y^H)^{-1} \Lambda Y^H$, which can be rearranged to $$X^{-1} = \Lambda^{-1} X^{-1} (Y^H)^{-1} \Lambda Y^H = \Lambda^{-1} (Y^H X)^{-1} \Lambda Y^H.$$
Note that $Y^H X$ is a diagonal matrix with diagonal entries $y_i^H x_i$. (It is diagonal because $y_i^HAx_j$ equals both $\lambda_j\,y_i^Hx_j$ and $\lambda_i\,y_i^Hx_j$, so $(\lambda_i-\lambda_j)\,y_i^Hx_j=0$, and the eigenvalues are distinct.)
To conclude, use the above expression to show that the entries of the vector $X^{-1} v$ are $\frac{y_i^H v}{y_i^H x_i}$. Do you see why this implies the result in your question?
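If it helps, here is a numerical sanity check of that formula (my own sketch with random data; the reordering step just matches each left eigenvector to its eigenvalue):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))              # generically has distinct eigenvalues

lam, X = np.linalg.eig(A)                    # A   = X diag(lam) X^{-1}
mu, Y = np.linalg.eig(A.conj().T)            # A^H = Y diag(mu)  Y^{-1}

# Reorder Y's columns so that column i belongs to the eigenvalue conj(lam[i]).
order = [int(np.argmin(np.abs(mu - l.conj()))) for l in lam]
Y = Y[:, order]

v = rng.standard_normal(n)
lhs = np.linalg.inv(X) @ v
rhs = np.array([(Y[:, i].conj() @ v) / (Y[:, i].conj() @ X[:, i]) for i in range(n)])
print(np.allclose(lhs, rhs))                 # True
```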