I believe your proof is valid. The only correction I'd make is that they are spanning not because we kept the components connected, but because we only removed edges (and not vertices) in order to obtain them (which is exactly the definition of "spanning").
They are trees because we destroyed every cycle while keeping the components connected.
Finally, the subgraph we obtain is maximally acyclic, because any other subgraph that contains it has to have a cycle.
We have to prove two things: (1) the algorithm gives us a valid spanning tree. (2) the resulting sum of edgecosts is minimal over all possible spanning trees.
Let $T$ be the collection of edges returned by the algorithm:
For (1): It is clear that the algorithm returns exactly $n-1$ edges where $n$ is the number of nodes. Now let $T_u$ be the undirected set of edges arising from $T$. Then $(V, T_u)$ cannot contain any cycles because of the following argument: Assume there is a cycle in $(V, T_u)$. Then there exist nodes $u, v$ such that there are two $u$-$v$-paths in $T$. But this would mean that $deg_T(v) > 1$ which is not possible given the definition of the algorithm. Since we have $n-1$ edges on $n$ nodes with no cycles it follows that $T_u$ is a spanning tree of $G$.
For (2): Note that this does not hold. I give a counterexample: Let $$V = \{A, B, C, D\}$$ $$E = \{ (A, B), (A, C), (B, D), (C, D)\}$$ $$c((A, B)) = 10, c((A, C)) = 10, c((B, D)) = 1, c((C, D)) = 1$$
The minimum spanning tree is given by either $T = \{ (A, B), (B, D), (C, D)\}$ or $T = \{ (A, C), (B, D), (C, D)\}$. But neither of those can be returned by your algorithm because the in-degree of $D$ is $2$ in both cases.
Best Answer
One way to make the graph acyclic is to first pick an arbitrary ordering of the vertices (imagine them being lined up left to right). For each pair of vertices $v,w$ that had an edge between them in the original graph, you're really thinking of that as a pair of directed edges: $v \to w$ and $w \to v$. Of these two edges, keep only the one that goes left to right.
This will be acyclic, because any directed path keeps going left, and therefore can't return to where you started. It's also the directed acyclic subgraph with the most arcs: you can't keep both arcs $v \to w$ and $w \to v$, because that would be a cycle of length $2$, so you can keep at most one of the arcs - and this algorithm keeps exactly one. Whichever node you put first will be the root.
Regarding connectivity, there are three options:
For option 3, note that if you have two nodes that are consecutive in the ordering, we can't hope for a path from the right node to the left node, and the only way we can hope for a left-to-right path is if there's an arc between the nodes. So we can only get an ordering of this type if the original undirected graph has a Hamiltonian path: a path that visits every vertex exactly once. (Then, we just take that path as our ordering.)
Unfortunately, finding a Hamiltonian path (or checking that there is one) is a well-known hard computational problem. But it's one you have to solve if you want the connectivity.