Here is another, more matrix-heavy way of proving it:
Outline:
Step 0: Every matrix $T$ associated with a linear operator on a finite-dimensional vector space over an algebraically closed field has at least one eigenpair $(\lambda, v)$.
Step 1: If $AB=BA$, then there exists a common eigenvector.
Step 2: By induction over the dimension $n$, we can conclude the claim using the previous steps.
Proof of Steps:
Step 0:
$\det(T-\lambda I)$ is a polynomial with coefficients in the algebraically closed field and thus has a root, which means there is at least one eigenvalue $\lambda$ with an associated eigenvector $v$ for $T$.
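Over $\mathbb{C}$, which is algebraically closed, Step 0 is easy to watch numerically. Here is a minimal Python sketch, assuming `numpy` and an example matrix of my own choosing (a rotation, which has no real eigenvalues, yet still yields a complex eigenpair):

```python
import numpy as np

# Step 0 over the algebraically closed field C: np.linalg.eig finds the roots
# of det(T - lambda*I), so every square complex matrix has at least one eigenpair.
T = np.array([[0.0, -1.0],
              [1.0,  0.0]])            # rotation by 90 degrees: no real eigenvalues
eigvals, eigvecs = np.linalg.eig(T)    # over C we get lambda = +i and -i
lam, v = eigvals[0], eigvecs[:, 0]
assert np.allclose(T @ v, lam * v)     # (lambda, v) is an eigenpair of T
```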
Step 1:
Consider an eigenpair $(\lambda, v)$ of $A$ (which exists by Step 0), so $v \in \text{null}(A-\lambda I)$. For any such $v$ it holds that $(A-\lambda I) Bv = B (A-\lambda I) v = 0$, so $\text{null}(A-\lambda I)$ is invariant under $B$. Hence the restriction of the linear operator associated with $B$ to $\text{null}(A-\lambda I)$ is a linear operator on $\text{null}(A-\lambda I)$, and by applying Step 0 again we know there exists an eigenvector $\tilde{v}$ of $B$ in $\text{null}(A-\lambda I)$. Therefore $\tilde{v}$ is a common eigenvector of $A$ and $B$.
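As a sanity check, Step 1 can be carried out numerically. The sketch below mirrors the argument line by line; the function and variable names are my own, and it assumes exact commutation over $\mathbb{C}$:

```python
import numpy as np
from scipy.linalg import null_space

def common_eigenvector(A, B):
    """Sketch of Step 1: find a common eigenvector, assuming A @ B == B @ A."""
    lam = np.linalg.eigvals(A)[0]                 # some eigenvalue of A (Step 0)
    N = null_space(A - lam * np.eye(len(A)))      # orthonormal basis of null(A - lam*I)
    # By the argument above this subspace is B-invariant,
    # so N^H B N represents the restriction of B to it.
    _, W = np.linalg.eig(N.conj().T @ B @ N)      # Step 0 applied to the restriction
    return N @ W[:, 0]                            # common eigenvector of A and B

A = np.array([[1.0, 2.0], [2.0, 1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])            # AB == BA
v = common_eigenvector(A, B)
rayleigh = lambda M, x: (x.conj() @ M @ x) / (x.conj() @ x)
assert np.allclose(A @ v, rayleigh(A, v) * v)     # v is an eigenvector of A ...
assert np.allclose(B @ v, rayleigh(B, v) * v)     # ... and of B
```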
Step 2:
Induction basis: for $1\times 1$ matrices the claim is trivial, since every $1\times 1$ matrix is already upper triangular.
Induction step:
Let $A$ and $B$ be commuting $n\times n$ matrices over the algebraically closed field, i.e. $AB=BA$. By Step 1 they have a common eigenvector $v$. Extend it to a basis $\{v,w_1,w_2, \ldots,w_{n-1}\}$ and transform both matrices with respect to this basis to get $\tilde{A}$, $\tilde{B}$, which look like this:
\begin{align}
\tilde{A} = \begin{bmatrix} \lambda_1& a^T_1\\0&\tilde{A}_1\end{bmatrix}
\end{align}
\begin{align}
\tilde{B} = \begin{bmatrix} \lambda_2& b^T_1\\0&\tilde{B}_1\end{bmatrix}
\end{align}
Now since $AB=BA$, equivalently $\tilde{A}\tilde{B}=\tilde{B}\tilde{A}$, which by the block structure above gives us the following:
\begin{align}
\begin{bmatrix} \lambda_2 \lambda_1& \lambda_2 a^T_1+b^T_1\tilde{A}_1\\0&\tilde{B}_1\tilde{A}_1\end{bmatrix} = \begin{bmatrix} \lambda_1 \lambda_2& \lambda_1 b^T_1+a^T_1\tilde{B}_1\\0&\tilde{A}_1\tilde{B}_1\end{bmatrix}
\end{align}
which implies $\tilde{A}_1\tilde{B}_1=\tilde{B}_1\tilde{A}_1$ for the lower-dimensional $(n-1)\times(n-1)$ matrices. By the induction hypothesis, we can find a new set of vectors $\{\tilde{w}_1,\tilde{w}_2, \ldots,\tilde{w}_{n-1}\}$ in the span of $\{w_1,w_2, \ldots,w_{n-1}\}$ that makes $\tilde{A}_1,\tilde{B}_1$ simultaneously triangularizable, and thus $\{v,\tilde{w}_1,\tilde{w}_2, \ldots,\tilde{w}_{n-1}\}$ makes the original $\tilde{A},\tilde{B}$ simultaneously triangularizable.
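The whole induction can likewise be turned into a recursive procedure. A sketch under my own conventions: it reuses `common_eigenvector` from the snippet above, and at each step it completes $v$ to an orthonormal basis, which is one concrete choice of the $w_i$, not forced by the proof:

```python
import numpy as np
from scipy.linalg import null_space, block_diag

def simultaneous_triangularize(A, B):
    """Recursive sketch of Step 2 for commuting complex matrices A, B.
    Returns a unitary P with P^{-1} A P and P^{-1} B P both upper triangular."""
    n = len(A)
    if n == 1:
        return np.eye(1, dtype=complex)          # induction basis: 1x1 is triangular
    v = common_eigenvector(A, B)                 # Step 1 (function from the sketch above)
    v = v / np.linalg.norm(v)
    W = null_space(v.conj()[None, :])            # orthonormal w_1, ..., w_{n-1} spanning v-perp
    P1 = np.column_stack([v, W])                 # basis change to {v, w_1, ..., w_{n-1}}
    At, Bt = P1.conj().T @ A @ P1, P1.conj().T @ B @ P1      # block forms as displayed above
    P2 = simultaneous_triangularize(At[1:, 1:], Bt[1:, 1:])  # induction hypothesis
    return P1 @ block_diag(np.eye(1), P2)        # combine the two changes of basis

A = np.array([[1.0, 2.0], [2.0, 1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])           # AB == BA
P = simultaneous_triangularize(A, B)
for M in (A, B):                                 # both conjugates are upper triangular
    assert np.allclose(np.tril(P.conj().T @ M @ P, -1), 0)
```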
First, not all functions from the reals to the reals are "polynomial, trigonometric, exponential". These form in fact a very tiny subset of all functions. A function is not a formula: it's just that for every $x \in \mathbb{R}$ there exists a unique value $f(x) \in \mathbb{R}$. The uniqueness is what makes it a function. The assignment of $f(x)$ to $x$ is completely arbitrary; e.g. I could take the 7-th digit in the decimal (infinite) representation of $x$ for all $x > \sqrt{17}$, the 5-th digit of that representation for all $x < \sqrt{17}$, and $f(\sqrt{17}) = \pi$. And this is even relatively nice, because I can write a program for it (see the sketch below), but this need not be the case in general. A truly "random" function would just be an infinite list of values, one for each real, all assigned independently of each other. So there is no hope for Taylor expansions etc. Realise that $\mathbb{R}^\mathbb{R}$ is a truly huge set.
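To underline that this particular example really is programmable, here is one possible reading of it in Python. Interpreting "$k$-th digit" as the $k$-th digit after the decimal point is my own choice, and the comparison with $\sqrt{17}$ is only as exact as floating point allows:

```python
import math

def f(x: float) -> float:
    """The 7th/5th-digit example from the text, under my reading of 'digit'."""
    s17 = math.sqrt(17)
    if x == s17:
        return math.pi
    k = 7 if x > s17 else 5                 # which decimal digit to read off
    frac = abs(x) % 1.0                     # fractional part of x
    return math.floor(frac * 10**k) % 10    # the k-th digit after the decimal point

print(f(10.123456789))   # above sqrt(17): 7th digit after the point, here 7
print(f(-2.5))           # below sqrt(17): 5th digit, here 0
```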
The operations on $\mathbb{R}^\mathbb{R}$ are pointwise, so if we have two such functions $f,g$ then we define $f+g$ as a new function, by telling what its value on an arbitrary $x \in \mathbb{R}$ is: $(f+g)(x) = f(x) + g(x)$, i.e. we just add the values of $f$ and $g$ at $x$. Similarly for a scalar $c \in \mathbb{R}$, we define $(c\cdot f)(x) = cf(x)$ for all $x \in \mathbb{R}$, where the latter is just standard multiplication in $\mathbb{R}$. The zero vector is just the function all of whose values are equal to $0$.
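In code, with functions represented as callables, the pointwise operations are one-liners. A minimal Python sketch; the names `add`, `scale`, `zero` are mine:

```python
import math

def add(f, g):
    return lambda x: f(x) + g(x)       # (f+g)(x) = f(x) + g(x)

def scale(c, f):
    return lambda x: c * f(x)          # (c*f)(x) = c*f(x)

zero = lambda x: 0.0                   # the zero vector of R^R

h = add(math.sin, scale(2.0, math.exp))
print(h(1.0))                          # sin(1) + 2*exp(1)
print(add(math.cos, zero)(0.0))        # cos(0) + 0 = 1.0
```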
Both spaces contain infinite sets of linearly independent elements (or vectors, as elements of a vector space are called, even though they are not "vectors" in the old-fashioned sense, like the functions in $\mathbb{R}^\mathbb{R}$): take the functions $f_p$, defined for a fixed $p \in \mathbb{R}$ by $f_p(x) = 1$ if $x = p$, and $f_p(x) = 0$ if $x \neq p$. So all values are $0$ except for a spike at $p$.
Why are the $f_p$ linearly independent? By definition, we need to consider a finite linear combination of distinct $f_{p_i}$: $c_{p_1} \cdot f_{p_1} + \ldots + c_{p_n} \cdot f_{p_n} = 0$ (equality as functions, which means just that they have the same values for all $x$), and we need to show that then all $c_{p_i}$ are $0 \in \mathbb{R}$. Because the equality holds for all $x$, we can use $x = p_1$ in particular. Then $$(c_{p_1} \cdot f_{p_1} + \ldots + c_{p_n} \cdot f_{p_n})(p_1) = c_{p_1}f_{p_1}(p_1) + \ldots + c_{p_n} f_{p_n}(p_1) = 0$$ Since $f_{p_i}(p_1) = 0$ for all $i \geq 2$ (as $p_i \neq p_1$) while $f_{p_1}(p_1) = 1$, this comes down to $c_{p_1} = 0$. The same idea works for all other coefficients as well. So the set of $f_p$ is linearly independent.
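The evaluation trick in this proof is easy to watch in action: evaluating a finite combination at each $p_i$ kills every term except the $i$-th and exposes $c_{p_i}$. A small Python sketch (names are mine, and it illustrates rather than proves independence):

```python
def spike(p):
    return lambda x: 1.0 if x == p else 0.0     # f_p: all 0 except a spike at p

ps = [1.0, 2.5, -3.0]                           # distinct points p_1, p_2, p_3
cs = [4.0, -1.0, 7.0]                           # coefficients c_{p_1}, c_{p_2}, c_{p_3}
combo = lambda x: sum(c * spike(p)(x) for c, p in zip(cs, ps))

# Evaluating at p_i recovers c_{p_i}, exactly as in the proof above:
print([combo(p) for p in ps])                   # [4.0, -1.0, 7.0]
```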
The same idea works in $\mathbb{R}^\mathbb{N}$, the set of sequences, which look more like normal vectors, but of infinite length. We just see this as the set of functions from $\mathbb{N}$ to $\mathbb{R}$ and the same operations and independent functions apply. So both spaces are infinite-dimensional.
If we see $\mathbb{R}^\mathbb{N}$ as a set of functions (as we should), then $T(f)$ is just the restriction of $f$ to $\mathbb{N}$. To see that this is linear, take $f,g \in \mathbb{R}^\mathbb{R}$. Then $T(f+g)$ is defined for all $n$ as $(f+g)(n)$, which is in turn defined as $f(n) + g(n)$, and this equals $T(f)(n) + T(g)(n)$, as $T$ does "nothing": it's just the restriction of a function to a smaller domain. The latter sum is by definition $(T(f) + T(g))(n)$, and as this holds for all $n$ we have equality of functions (or sequences, because we index by $\mathbb{N}$) and $T(f+g) = T(f) + T(g)$. The same can be done for scalars as well.
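Here is the restriction map as a small Python sketch. The representation choices are mine, and the check samples only finitely many $n$, so it illustrates linearity rather than proving it:

```python
import math

def T(f):
    """Restrict a function on R to N, giving the sequence (f(0), f(1), f(2), ...)."""
    return lambda n: f(float(n))

add = lambda u, v: (lambda x: u(x) + v(x))      # pointwise sum, as before

f, g = math.sin, math.cos
lhs, rhs = T(add(f, g)), add(T(f), T(g))
assert all(math.isclose(lhs(n), rhs(n)) for n in range(10))   # T(f+g) = T(f) + T(g)
```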
A matrix for a linear map $T$ is formed by choosing bases for both spaces and computing the basis expansion of $T(b)$ for each basis element $b$ of the domain (these expansions form the columns). But here a basis for $\mathbb{R}^\mathbb{R}$ is uncountable, so we cannot write it down as a matrix: these can be at most countably infinite in both dimensions.
Best Answer
Hint: $k^{\oplus\mathbb N}$ has a countable basis, which is not the case for $k^\mathbb N$.