The general statement "If $S$ is a spanning set, and $L$ is linearly independent then $|L|\leq|S|$" is unfamiliar to me, and I'm not sure it was investigated enough to merit an answer either. It does imply that every two bases have the same cardinality, so it does require some weak form of the axiom of choice. My guess would be that it requires a bit more than just every two bases have the same cardinality.
However, it seems that you are interested in the particular case of a countable basis. That is, if there is a countable basis, does every linearly independent set inject into a spanning set? Well, we can do better. We can prove that every linearly independent is countable.
Claim. If $V$ is a vector space over $K$, and $B$ is a countable basis for $V$, then every linearly independent set have size $\leq\aleph_0$.
If $B$ Is finite, then we know it to be true, and it is not the interesting case anyway.
Fix an enumeration of $B=\{b_n\mid n\in\Bbb N\}$. And let $L$ be a linearly independent set, and assume by contradiction that $|L|\nleq\aleph_0$. For every $\ell\in L$ take $\langle\ell_n\mid n\in\Bbb N\rangle$ to be the sequence of coefficients for the unique sum of $\ell$ in term of $b_n$'s. Note that all but finitely many are zero.
Clearly $\ell\mapsto\langle\ell_n\mid n\in\Bbb N\rangle$ is injective. Since $L$ is not countable, there is an uncountable subset of elements which have exactly the same non-zero indices, so without loss of generality it will be all of $L$, and without loss of generality those indices are $0,\ldots,n-1$.
Therefore $L$ itself is linearly independent as a subset of $W=\operatorname{span}\{b_i\mid i<n\}$. But this is a contradiction since $W$ has a finite dimension, and therefore every linearly independent set is finite, but $L$ is uncountable. $\quad\square$
First, not all functions from the reals to the reals are "polynomial, triginometric, exponential". These form in fact a very tiny subset of all functions. A function is not a formula. It's just that for every $x \in \mathbb{R}$ there exists a unique value $f(x) \in \mathbb{R}$. The unicity is what makes it a function. The asignment of $f(x)$ to $x$ is completely arbitrary, e.g. I could take the 7-th digit in the decimal (infinite) representation of $x$ for all $x > \sqrt{17}$, the 5-th digit of that representation for all $x < \sqrt{17}$ and $f(\sqrt{17}) = \pi$. And this is even relatively nice, because I can write a program for it, but this need not be the case in general. A truly "random" function would just be an infinite list of values, one for each real, assigned all independently of each other. So there is no hope for Taylor expansions etc. Realise that $\mathbb{R}^\mathbb{R}$ is truly a huge set.
The operations on $\mathbb{R}^\mathbb{R}$ are pointwise, so if we have two such functions $f,g$ then we define $f+g$ as a new function, by telling what its value on an arbitrary $x \in \mathbb{R}$ is: $(f+g)(x) = f(x) + g(x)$, i.e. we just add the values of $f$ and $g$ at $x$. Similarly for a scalar $c \in \mathbb{R}$, we define $(c\cdot f)(x) = cf(x)$ for all $x \in \mathbb{R}$, where the latter is just standard multiplication in $\mathbb{R}$. The $0$ is just the function where all values are equal to $0$.
Both have infinite sets of linearly independent elements (or vectors, as elements of a vector space are called, even though they are not "vectors" in the old fashioned sense, like the functions in $\mathbb{R}^\mathbb{R}$): take the functions $f_p$, defined for a fixed $p \in \mathbb{R}$: $f_p(x) = 1$ if $x = p$, and $f_p(x) = 0$ if $x \neq p$. So all $0$ except for a spike at $p$.
Why are the $f_p$ linearly independent? By definition, we need to consider a finite linear combination of distinct $f_{p_i}$: $c_{p_1} \cdot f_{p_1} + \ldots + c_{p_n} \cdot f_{p_n} = 0$ (equality as functions, which means just that they have the same values for all $x$), and we need to show that then all $c_{p_i}$ are $0 \in \mathbb{R}$. Because the equality holds for all $x$, we can use $x = p_1$ in particular. Then $$(c_{p_1} \cdot f_{p_1} + \ldots + c_{p_n} \cdot f_{p_n})(p_1) = c_{p_1}f_{p_1}(p_1) + \ldots + c_{p_n} f_{p_n}(p_1) = 0$$ As $f_{p_2}(p_1) = 0$, as $p_2 \neq p_1$, and so on, but $f_{p_1}(p_1) = 1$ this comes down to $c_{p_1} = 0$. The same idea works for all other coefficients as well. So the set of $f_p$ is linearly independent.
The same idea works in $\mathbb{R}^\mathbb{N}$, the set of sequences, which look more like normal vectors, but of infinite length. We just see this as the set of functions from $\mathbb{N}$ to $\mathbb{R}$ and the same operations and independent functions apply. So both spaces are infinite-dimensional.
If we see $\mathbb{R}^\mathbb{N}$ as a set of functions (as we should) then the function $T$ is just the restriction of $f$ to $\mathbb{N}$. To see that this is linear, take $f,g \in \mathbb{R}^\mathbb{R}$. Then $T(f+g)$ is defined for all $n$ as $(f+g)(n)$, which is in turn defined as $f(n) + g(n)$, and this equals $T(f)(n) + T(g)(n)$, as $T$ does "nothing", it's just the restriction of a function to a smaller domain. The latter sum is just by definition $(T(f) + T(g))(n)$, and as this holds for all $n$ we have equality of functions (or sequences, because we index by $\mathbb{N}$) and $T(f+g) = T(f) + T(g)$. The same can be done for scalars as well.
A matrix for a linear map $T$ is formed by choosing bases for both spaces and computing the base expansion for every $T(b)$ for all basis elements in the domain (as the columns). But here a base for $\mathbb{R}^\mathbb{R}$ is uncountable, so we cannot write it down as a matrix: these can be at most countably infinite in both dimensions.
Best Answer
Suppose that $V$ is a vector space over $F$ and $V$ has a basis $B$.
From the definition of a basis every $v\in V$ can be written as a unique sum of basis elements and scalars. That is, there is a finite subset of $B\times (F\setminus\{0\})$ whose sum is $v$, and if we require that this set is a function on its domain, then this set is unique.
This gives a well-defined injection from $V$ into finite subsets of $B\times(F\setminus\{0\})$. Assuming the axiom of choice we have that, $$|V|\leq\left|[B\times(F\setminus\{0\})]^{<\omega}\right|=|B\times F|=\max\{|B|,|F|\}\leq|V|\implies|V|=\max\{|B|,|F|\}.$$