Well, let's try to avoid the hat.
Consider the dual (and obviously equivalent) problem: find the polynomial $p(x):[-1,1]\rightarrow [-1,1]$ of degree $n$ with the greatest possible leading coefficient. We know something about the values of $p$ and want information about its leading coefficient, so let's try Lagrange interpolation. Take any $n+1$ points $t_1 < t_2 < \dots < t_{n+1}$ in $[-1,1]$, set $u(x)=(x-t_1)\dots(x-t_{n+1})$, and write down the formula
$$
p(x)=\sum p(t_i) \frac{u(x)/(x-t_i)}{u'(t_i)}.
$$
Now look at the coefficient of $x^n$. It equals
$$
\sum \frac{p(t_i)}{u'(t_i)}.
$$
We know that $|p(t_i)|\leq 1$, so the leading coefficient does not exceed
$$
\sum \frac{1}{|u'(t_i)|}.
$$
OK, when does equality occur? Exactly when each $p(t_i)$ has absolute value $1$ and the same sign as $u'(t_i)$, that is, when $p$ takes the value $(-1)^{n-i+1}$ at $t_i$. So we have to find a polynomial of degree $n$ with $n+1$ extremal values $\pm 1$, of alternating sign, on $[-1,1]$. This is possible only if $t_1=-1$, $t_{n+1}=1$, and $t_2$, $\dots$, $t_n$ are roots of $p'$. So $1-p^2(x)$ should be divisible by $(1-x^2)p'(x)$. From here the trigonometric substitution $x=\cos t$, $p=\cos f$ is very natural, since we know that $1-f^2$ is divisible by $f'$ for $f=\cos$. This is how one invents the Chebyshev polynomials.
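To spell out that last step (a sketch; the constant $c$ is introduced here only for illustration): $1-p^2$ has degree $2n$, with double roots at $t_2,\dots,t_n$ and roots at $\pm 1$, so it must equal $c\,(1-x^2)\,p'(x)^2$. Comparing leading coefficients gives $c=1/n^2$, that is,
$$
n^2\bigl(1-p(x)^2\bigr)=(1-x^2)\,p'(x)^2.
$$
Now put $x=\cos t$ and $p(\cos t)=\cos f(t)$. Differentiating in $t$ gives $p'(\cos t)\sin t=f'(t)\sin f(t)$, so the equation becomes
$$
n^2\sin^2 f(t)=f'(t)^2\sin^2 f(t),
$$
hence $f'(t)=\pm n$. The choice $f(t)=nt$, which is consistent with $p(1)=1$, gives $p(\cos t)=\cos nt$.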
The Lagrange formula also shows that they are extremal in many other problems with the restriction $|p(x)|\leq 1$ on $[-1,1]$. For example, the value at any given point $x_0>1$ is also maximized by the Chebyshev polynomial; this is proved in exactly the same way.
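As a numerical sanity check of the argument above, here is a minimal Python sketch. The function names are made up for this illustration, and it assumes the nodes are the points $\cos(k\pi/n)$, where $\cos nt=\pm 1$. It evaluates the Lagrange bound $\sum 1/|u'(t_i)|$ on the leading coefficient and the analogous bound on $|p(x_0)|$.

```python
import math

def chebyshev_extrema(n):
    """The n+1 points cos(k*pi/n), k = n, ..., 0, in increasing order."""
    return [math.cos(k * math.pi / n) for k in range(n, -1, -1)]

def u_prime(nodes, i):
    """u'(t_i) = product over j != i of (t_i - t_j)."""
    prod = 1.0
    for j, tj in enumerate(nodes):
        if j != i:
            prod *= nodes[i] - tj
    return prod

def leading_coeff_bound(nodes):
    """Upper bound on the leading coefficient: sum_i 1/|u'(t_i)|."""
    return sum(1.0 / abs(u_prime(nodes, i)) for i in range(len(nodes)))

def value_bound(nodes, x0):
    """Upper bound on |p(x0)|: sum_i |u(x0)| / (|x0 - t_i| * |u'(t_i)|)."""
    total = 0.0
    for i in range(len(nodes)):
        num = 1.0  # u(x0) / (x0 - t_i), computed as a product over j != i
        for j, tj in enumerate(nodes):
            if j != i:
                num *= x0 - tj
        total += abs(num / u_prime(nodes, i))
    return total

if __name__ == "__main__":
    n = 5
    nodes = chebyshev_extrema(n)
    print(leading_coeff_bound(nodes))   # compare with 2**(n-1)
    print(value_bound(nodes, 2.0))      # compare with T_5(2)
```

For $n=5$ the two printed values should be approximately $16=2^{4}$ (the leading coefficient of $T_5$) and $362=T_5(2)$. Any other choice of nodes, such as equally spaced ones, gives a strictly larger and therefore weaker bound.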
Regarding which point of view is preferable, I don't see that there's much difference between them. The Chebyshev polynomials map $[-1,1]$ to $[-1,1]$, the spread polynomials map $[0,1]$ to $[0,1]$, and they are conjugate under a linear map between $[-1,1]$ and $[0,1]$, so all their properties translate easily between the two frameworks.
However, I'd vote for Chebyshev polynomials as being somewhat more fundamental, due to orthogonality. The spread polynomials aren't orthogonal with respect to any measure, because they are nonnegative everywhere. To get orthogonality, one must subtract $1/2$, after which they become orthogonal with respect to $dx/\sqrt{x(1-x)}$ on the interval $[0,1]$. By contrast, the Chebyshev polynomials are already orthogonal with respect to $dx/\sqrt{1-x^2}$ on $[-1,1]$, with no subtraction needed. This isn't a big deal, since it just amounts to subtracting $1/2$, but it's nice not to have to do the subtraction.
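The orthogonality claims are easy to check numerically. The sketch below assumes the usual definition $S_n(x)=\bigl(1-T_n(1-2x)\bigr)/2$ of the spread polynomials and uses Gauss-Chebyshev quadrature, which is exact for polynomial integrands of degree below $2N$. The helper names are invented for this example.

```python
import math

def cheb_T(n, x):
    """Evaluate T_n(x) by the recurrence T_{k+1} = 2x*T_k - T_{k-1}."""
    a, b = 1.0, x
    if n == 0:
        return a
    for _ in range(n - 1):
        a, b = b, 2.0 * x * b - a
    return b

def spread_S(n, x):
    """Spread polynomial, assuming S_n(x) = (1 - T_n(1 - 2x)) / 2."""
    return (1.0 - cheb_T(n, 1.0 - 2.0 * x)) / 2.0

def cheb_integral(f, N=64):
    """Integral of f(y) dy / sqrt(1 - y^2) over [-1, 1] (Gauss-Chebyshev)."""
    nodes = (math.cos((2 * k - 1) * math.pi / (2 * N)) for k in range(1, N + 1))
    return math.pi / N * sum(f(y) for y in nodes)

def spread_integral(g, N=64):
    """Integral of g(x) dx / sqrt(x(1 - x)) over [0, 1], via x = (1 - y)/2."""
    return cheb_integral(lambda y: g((1.0 - y) / 2.0), N)

if __name__ == "__main__":
    print(cheb_integral(lambda y: cheb_T(2, y) * cheb_T(3, y)))
    print(spread_integral(lambda x: spread_S(2, x) * spread_S(3, x)))
    print(spread_integral(lambda x: (spread_S(2, x) - 0.5) * (spread_S(3, x) - 0.5)))
```

The first and third integrals should vanish up to rounding, while the second should come out near $\pi/4$, which is the failure of orthogonality that subtracting $1/2$ repairs.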
Overall, there's nothing sacred about using the domain $[-1,1]$ for Chebyshev polynomials. Of course it aligns beautifully with trigonometry, but Chebyshev polynomials are important in many other settings (such as approximation theory) in which $[-1,1]$ plays no special role, and they are simply rescaled to fit the interval of interest. From that perspective, $[0,1]$ is just as good a domain. On the other hand, I see no gain from making the range $[0,1]$ as well, and one has to undo it to recover orthogonality.
Comments added in edit:
As for the factorizations, this amounts to factoring $T_n(x)$ (for Chebyshev polynomials) or $T_n(x)+1$ (for spread polynomials - not quite, see comments below). Both are interesting, since both the roots and the extrema of the Chebyshev polynomials are important.
In fact, $T_{2n}(x)= T_2(T_n(x))$ and hence $T_{2n}(x)+1 = 2T_n(x)^2$, so factoring spread polynomials includes factoring Chebyshev polynomials as the even-index case. (In the odd-index case, $T_{2n+1}(x)+1 = (T_{n+1}(x)+T_n(x))^2/(x+1)$, but I'm not certain how to interpret this.)
So I'd say factoring spread polynomials is more general but slightly more obscure. Definitely both are interesting, though.
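Both identities above can be verified with exact integer arithmetic. This is a small sketch with polynomials stored as coefficient lists, lowest degree first; the function names are chosen for this example. The odd-index identity is checked in the denominator-free form $(x+1)\bigl(T_{2n+1}(x)+1\bigr)=\bigl(T_{n+1}(x)+T_n(x)\bigr)^2$.

```python
def cheb_coeffs(n):
    """Integer coefficients of T_n, lowest degree first."""
    a, b = [1], [0, 1]
    if n == 0:
        return a
    for _ in range(n - 1):
        # T_{k+1} = 2x*T_k - T_{k-1}
        nxt = [0] + [2 * c for c in b]
        for i, c in enumerate(a):
            nxt[i] -= c
        a, b = b, nxt
    return b

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def even_identity_holds(n):
    """Check T_{2n}(x) + 1 == 2*T_n(x)^2."""
    lhs = poly_add(cheb_coeffs(2 * n), [1])
    rhs = [2 * c for c in poly_mul(cheb_coeffs(n), cheb_coeffs(n))]
    return lhs == rhs

def odd_identity_holds(n):
    """Check (x + 1)*(T_{2n+1}(x) + 1) == (T_{n+1}(x) + T_n(x))^2."""
    lhs = poly_mul([1, 1], poly_add(cheb_coeffs(2 * n + 1), [1]))
    s = poly_add(cheb_coeffs(n + 1), cheb_coeffs(n))
    return lhs == poly_mul(s, s)

if __name__ == "__main__":
    print(all(even_identity_holds(n) for n in range(10)))
    print(all(odd_identity_holds(n) for n in range(10)))
```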
There are combinatorial interpretations of the Chebyshev polynomials involving weighted monomer-dimer configurations (although the conditions are a little odd: see http://www.math.hmc.edu/~benjamin/papers/CombTrig.pdf). The analogous idea doesn't work out as nicely for spread polynomials, but maybe some other approach is more appropriate. It's worth noting that the Chebyshev polynomials have somewhat simpler coefficients. For example, $T_8(x)=128x^8-256x^6+160x^4-32x^2+1$ while $S_8(x)=-16384x^8 + 65536x^7 - 106496x^6 + 90112x^5 - 42240x^4 + 10752x^3 - 1344x^2 + 64x$.
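The two coefficient lists can be reproduced from the recurrence. Here is a sketch assuming the usual definition $S_n(x)=\bigl(1-T_n(1-2x)\bigr)/2$; the function names are illustrative, and coefficients are listed lowest degree first.

```python
def cheb_coeffs(n):
    """Integer coefficients of T_n, lowest degree first."""
    a, b = [1], [0, 1]
    if n == 0:
        return a
    for _ in range(n - 1):
        # T_{k+1} = 2x*T_k - T_{k-1}
        nxt = [0] + [2 * c for c in b]
        for i, c in enumerate(a):
            nxt[i] -= c
        a, b = b, nxt
    return b

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def spread_coeffs(n):
    """Coefficients of S_n(x) = (1 - T_n(1 - 2x)) / 2, lowest degree first."""
    t = cheb_coeffs(n)
    y = [1, -2]                      # the substitution y = 1 - 2x
    composed = [t[-1]]
    for c in reversed(t[:-1]):       # Horner's scheme in the polynomial y
        composed = poly_mul(composed, y)
        composed[0] += c
    s = [-c for c in composed]
    s[0] += 1                        # now s holds 1 - T_n(1 - 2x)
    return [c // 2 for c in s]       # every coefficient is even, so this is exact

if __name__ == "__main__":
    print(cheb_coeffs(8))
    print(spread_coeffs(8))
```

Read from the highest degree down, the two printed lists should match the expansions of $T_8$ and $S_8$ quoted above.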
The Chebyshev polynomials first appeared in Chebyshev's paper Théorie des mécanismes connus sous le nom de parallélogrammes (1854). The remarkable "mechanisms" described in this work can be seen in action here (click on each picture to activate it).
The context is described in MacTutor:
For a more extensive account of the history of this discovery, see The theory of best approximation of functions.