First of all, the Peter-Weyl Theorem is about unitary representations; it is of no use if you want to study representations on topological vector spaces that are not isomorphic to Hilbert spaces (or pre-Hilbert spaces). Thus, formally speaking, the answer to your question (at the end of the first paragraph) is negative. Nevertheless, one has:
Theorem. Let $V$ be a Hausdorff locally convex quasicomplete topological vector space, $G$ a compact (and Hausdorff) topological group and $\rho: G\to \operatorname{Aut}(V)$ a continuous irreducible representation (meaning that $V$ contains no closed invariant subspaces other than $0$ and $V$). Then $V$ is finite-dimensional.
You can find a proof in
R. A. Johnson, Representations of compact groups on topological vector spaces: some remarks, Proceedings of the AMS, Vol. 61, 1976.
The point of considering this class of topological vector spaces (which includes, for instance, all Banach spaces) is that one has a satisfactory theory of integration for maps to such spaces. As for more general vector spaces, I have no idea; I suppose that the finite-dimensionality claim is simply false. For instance, if you drop the assumption that $V$ is Hausdorff (which one usually assumes) and take a vector space with the trivial topology, then the only closed subspace is $V$ itself, so every representation of any group on such a vector space is irreducible, whatever its dimension.
One last thing: in the context of Hilbert spaces, there is a very short and direct proof of finite-dimensionality, avoiding the Peter-Weyl Theorem, given in this MathOverflow post.
The proof is fine for finite groups, assuming you know that each $\rho(g)$ is diagonalizable in the first place.
For infinite $G$, each $\rho(g)$ induces a decomposition of $V$ into eigenspaces, and any two of these decompositions have a common refinement. If $V$ is finite-dimensional, among these there must be a maximal refinement (you can't keep refining forever) with respect to which all the operators are diagonal. Hope this works.
Best Answer
Okay, we're going to have to use some heavy artillery to start off, but I can't think of another way to begin.
Suppose $\rho: G\to \text{GL}_{2} (\mathbb{C})$ is nontrivial. Observe that since $G$ is simple and the representation is nontrivial, we must have $\text{ker} \, \rho =\text{ker}\, \chi = (e)$ (where $\chi$ is the character of this representation). The Feit-Thompson Theorem (!!!) says that groups of odd order are solvable; a non-abelian simple group is not solvable, so $|G|$ is even. By Cauchy's Theorem, $G$ must have an element $x$ of order $2$.
Now, define $$\hat{\rho}: G \to \text{GL}_{1} (\mathbb{C}) \cong \mathbb{C}^{\times}$$ by $\hat{\rho}(g) = \text{det} (\rho(g))$. Evidently, $\hat{\rho}$ is a homomorphism, hence it gives a degree 1 representation of $G$. This representation must be trivial: its kernel is a normal subgroup of the simple group $G$, and it cannot be $(e)$, since then $G$ would embed in the abelian group $\mathbb{C}^{\times}$. In other words, $\text{det} (\rho(g)) = 1$ for all $g\in G$. We also know that $\rho(x)^2 = \text{Id}$, so every eigenvalue of $\rho(x)$ is $\pm 1$, and the multiset of eigenvalues of $\rho(x)$ is either $\{1, 1\}$, $\{1,-1\}$, or $\{-1,-1\}$. The first possibility is out of the question: it would give $\chi(x) = 2 = \chi(e)$, putting $x \neq e$ in $\text{ker}\, \chi = (e)$. The second possibility cannot occur, since then $\text{det} (\rho(x)) = -1$. Thus, the eigenvalues of $\rho(x)$ are $\{-1, -1\}$. The characteristic polynomial of $\rho(x)$ is $(X+1)^2$, and $\rho(x)$ also satisfies $X^2 - 1$. Since the minimal polynomial of $\rho(x)$ must divide both of these, it divides their greatest common divisor $X+1$, so $\rho(x)$ satisfies $X+1$, i.e. $\rho(x) = -\text{Id}$.
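The case analysis on the eigenvalues of $\rho(x)$ can be checked mechanically. The following is only a minimal sketch in Python, assuming (as in the argument) that the eigenvalues lie in $\{1,-1\}$, that the determinant is $1$, and that the spectrum $\{1,1\}$ is excluded because it would put $x$ in the kernel. The name `admissible_spectra` is made up for illustration.

```python
from itertools import combinations_with_replacement

def admissible_spectra():
    """Eigenvalue multisets of rho(x) that survive the two constraints."""
    survivors = []
    # Enumerate the multisets {a, b} with a, b in {1, -1}.
    for a, b in combinations_with_replacement((1, -1), 2):
        trace_is_two = (a + b == 2)   # chi(x) = 2 would put x in ker(chi)
        has_det_one = (a * b == 1)    # det is the product of the eigenvalues
        if has_det_one and not trace_is_two:
            survivors.append((a, b))
    return survivors

print(admissible_spectra())
```

Only the multiset $\{-1,-1\}$ passes both filters, matching the conclusion above.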
Lastly, since $\rho(x)$ is a scalar multiple of the identity, it commutes with every matrix. In particular, for any $g\in G$, we have $$\rho(g) \rho(x) = \rho(x) \rho(g) \implies \rho(gxg^{-1} x^{-1}) = \text{Id}.$$ Triviality of $\text{ker} \, \rho$ implies $gxg^{-1} x^{-1} = e$ for all $g\in G$, hence $x\in Z(G)$. Accordingly, $Z(G)$ is a nontrivial normal subgroup of $G$, so it must equal $G$. But $G$ is non-abelian by assumption, a contradiction, so no such $\rho$ exists.
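To see the mechanism in a concrete case, here is a hedged illustration on the quaternion group $Q_8$, which is my choice of example and not part of the argument above. $Q_8$ is not simple, so there is no contradiction; the point is only that in a faithful degree 2 representation with determinant $1$, the element of order $2$ is sent to $-\text{Id}$ and is central. The matrices and the helper names (`mul`, `det`, `generate`) are assumptions made for this sketch.

```python
ID = ((1, 0), (0, 1))
NEG_ID = ((-1, 0), (0, -1))

def mul(a, b):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def generate(gens):
    """Closure of the generators under right multiplication (finite case)."""
    group, frontier = [ID], [ID]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in group:
                group.append(h)
                frontier.append(h)
    return group

# A standard faithful 2-dimensional representation of Q8 (illustrative choice).
I_MAT = ((1j, 0), (0, -1j))
J_MAT = ((0, 1), (-1, 0))
Q8 = generate([I_MAT, J_MAT])

# Elements of order 2: g != Id with g^2 = Id.
involutions = [g for g in Q8 if g != ID and mul(g, g) == ID]
all_det_one = all(det(g) == 1 for g in Q8)
involution_is_central = all(
    mul(g, z) == mul(z, g) for g in Q8 for z in involutions
)

print(len(Q8), involutions == [NEG_ID], all_det_one, involution_is_central)
```

For a non-abelian simple group the same computation would place an element of order $2$ in the center, which is exactly the contradiction the proof reaches.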