I think the answer depends on what kind of accuracy you're looking for
and exactly what function you're dealing with.
The sine function happens to be a relatively "nice" function to do a Maclaurin series for.
Recall that if we include terms up to $x^n$ in the Maclaurin series
of a function $f(x)$, the error term is
$$
\frac{1}{(n+1)!} f^{(n+1)}(\xi)\, x^{n+1}
\quad\text{for some $\xi$ between $0$ and $x$.}
$$
If $f(x) = \sin(x)$,
we know $\left\lvert f^{(n+1)}(\xi) \right\rvert \leq 1$, so
once $n > \lvert x\rvert$, the error bound $\lvert x\rvert^{n+1}/(n+1)!$ shrinks with
every further increase in $n$ and tends to zero.
Even if $x$ is very large, it just means you need very large values
of $n$.
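To make this concrete, here is a minimal Python sketch (my illustration, not part of the original answer) that sums the series at $x=10$ and prints the error as $n$ grows: the error is enormous while $n < x$ and collapses once $n$ passes $x$.

```python
import math

# Partial sums of the Maclaurin series of sin(x) at a largish x.
x = 10.0
partial = 0.0
for k in range(30):
    n = 2 * k + 1  # sin(x) has only odd-power terms
    partial += (-1) ** k * x ** n / math.factorial(n)
    if n in (9, 19, 29, 39, 59):
        print(f"terms up to x^{n}: error = {abs(partial - math.sin(x)):.3e}")
```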
Even for a function like $f(x) = \sin(2x)$, where the higher derivatives can get quite large, we still have
$\left\lvert f^{(n+1)}(\xi) \right\rvert \leq 2^{n+1}$,
and the factorial $(n+1)!$ still grows faster than the exponential $2^{n+1}$.
We might need to go to $n > 2\lvert x\rvert$ before the approximation is really any good, but that's OK.
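Spelling out the bound makes this explicit:
$$
\left\lvert \frac{1}{(n+1)!} f^{(n+1)}(\xi)\, x^{n+1} \right\rvert
\leq \frac{2^{n+1}\,\lvert x\rvert^{n+1}}{(n+1)!}
= \frac{(2\lvert x\rvert)^{n+1}}{(n+1)!}
\longrightarrow 0
\quad\text{as } n\to\infty,
$$
the same factorial-beats-exponential race as before, just with $2\lvert x\rvert$ in place of $\lvert x\rvert$.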
But not every function is as "nice" as the sine.
Obvious problems occur with functions that have asymptotes,
such as $f(x) = 1/(x+1)$.
The Maclaurin series for this function, $1-x+x^2-x^3+x^4-x^5+O(x^6)$,
has a radius of convergence equal to $1$, so the series converges only for $-1 < x < 1$ (even at $x = 1$ the partial sums just alternate between $1$ and $0$). Adding more terms will get you a better approximation within that interval, but never outside it.
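A small Python sketch (illustrative only) shows the dichotomy: inside the radius of convergence the partial sums home in on $1/(1+x)$, outside they blow up no matter how many terms we take.

```python
def partial_sum(x, n_terms):
    """Sum 1 - x + x^2 - ... up to n_terms terms."""
    return sum((-x) ** k for k in range(n_terms))

for x in (0.5, 0.9, 1.5):
    exact = 1 / (1 + x)
    errors = [abs(partial_sum(x, n) - exact) for n in (5, 20, 80)]
    print(f"x = {x}: errors at n = 5, 20, 80:",
          ", ".join(f"{e:.3e}" for e in errors))
```

For $x = 0.5$ and $x = 0.9$ the errors shrink toward zero; for $x = 1.5$ they explode.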
Another limitation of the Maclaurin series is that it is based entirely on the derivatives of the desired function at zero.
A function does not need to have a singularity (like $1/(x+1)$ does at $x=-1$)
or an undefined $n$th derivative
in order to be "not nice" for the Maclaurin series;
it can do other unpleasantly surprising things.
Such a function is described on page 86 of
an online textbook by John K. Hunter. That function is
$$
\phi(x) = \begin{cases} e^{-1/x} & \text{if $x > 0$,}\\
0 & \text{if $x \leq 0$.} \end{cases}
$$
All the derivatives of this function are defined everywhere, but every derivative at $x=0$ is zero (and for $x<0$ the derivatives are all zero as well). So the Maclaurin series of $\phi(x)$ is just the constant zero, which matches the function perfectly for $x \leq 0$ but not for $x > 0$, and the approximation never gets better as we add more terms.
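To see why, note that for $x>0$ every derivative of $\phi$ is $e^{-1/x}$ times a polynomial in $1/x$, and the exponential decay wins as $x\to 0^+$; for instance,
$$
\phi'(x) = \frac{e^{-1/x}}{x^2} \longrightarrow 0 \quad\text{as } x\to 0^+.
$$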
It's as if the function $\phi(x)$ somehow "fools" the Maclaurin series by "sneakily" approaching $1$ at large values of $x$ without giving any clue (in its derivatives at zero) that it was going to do anything but remain a constant zero as $x$ increased.
Now consider the Maclaurin series for
$$
g(x) = \sin(x) + \phi(x-1)
= \begin{cases} \sin(x) + e^{-1/(x-1)} & \text{if $x > 1$,}\\
\sin(x) & \text{if $x \leq 1$.} \end{cases}
$$
That's easy: it's the same as the Maclaurin series for $\sin(x)$, which we already know. But look what happens at $x=2\pi$, for example:
the more terms we add to the series, the better it approximates
$\sin(2\pi)=0$, but the actual value of $g(2\pi)$ is about $0.828$.
So the more terms we add, the better we approximate $\sin(x)$, but for $x>1$ we never get any closer to $g(x)$ itself.
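A short Python sketch (again my illustration) makes the mismatch concrete:

```python
import math

x = 2 * math.pi
g = math.sin(x) + math.exp(-1 / (x - 1))  # g(x) for x > 1

partial = 0.0
for k in range(15):  # terms up to x^29
    n = 2 * k + 1
    partial += (-1) ** k * x ** n / math.factorial(n)

print(f"Maclaurin partial sum at 2*pi: {partial:.6f}")  # about 0
print(f"g(2*pi):                       {g:.6f}")        # about 0.828
```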
As some people on this site might be aware, I don't always take downvotes well. So here's my attempt to provide more context to my answer, for whoever decided to downvote.
Note that I will confine my discussion to functions $f: D\subseteq \Bbb R \to \Bbb R$ and to ideas that should be simple enough for anyone who's taken a course in scalar calculus to understand. Let me know if I haven't succeeded in some way.
First, it'll be convenient for us to define a new notation. It's called "little oh" notation.
Definition: A function $f$ is called little oh of $g$ as $x\to a$, denoted $f\in o(g)$ as $x\to a$, if
$$\lim_{x\to a}\frac {f(x)}{g(x)}=0$$
Intuitively this means that $f$ becomes negligible compared to $g$ as $x\to a$; when both tend to zero, $f$ does so "faster".
Here are some examples:
- $x\in o(1)$ as $x\to 0$
- $x^2 \in o(x)$ as $x\to 0$
- $x\in o(x^2)$ as $x\to \infty$
- $x-\sin(x)\in o(x)$ as $x\to 0$
- $x-\sin(x)\in o(x^2)$ as $x\to 0$
- $x-\sin(x)\not\in o(x^3)$ as $x\to 0$
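These are easy to check numerically. Here is a minimal Python sketch for the last three examples: the ratio $(x-\sin x)/x^n$ tends to $0$ for $n=1,2$, but for $n=3$ it settles at $1/6\approx 0.1667$ instead.

```python
import math

for x in (0.1, 0.01, 0.001):
    num = x - math.sin(x)  # behaves like x^3/6 near 0
    print(f"x = {x}: {num / x:.2e}  {num / x ** 2:.2e}  {num / x ** 3:.4f}")
```

(For much smaller $x$, floating-point cancellation in $x - \sin x$ spoils the last ratio.)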
Now what is an affine approximation? (Note: I prefer to call it affine rather than linear -- if you've taken linear algebra then you'll know why.) It is simply a function $T(x) = A + Bx$ that approximates the function in question.
Intuitively it should be clear which affine function should best approximate the function $f$ very near $a$. It should be $$L(x) = f(a) + f'(a)(x-a).$$ Why? Well consider that any affine function really only carries two pieces of information: slope and some point on the line. The function $L$ as I've defined it has the properties $L(a)=f(a)$ and $L'(a)=f'(a)$. Thus $L$ is the unique line which passes through the point $(a,f(a))$ and has the slope $f'(a)$.
But we can be a little more rigorous. Below I give a lemma and a theorem that tell us that $L(x) = f(a) + f'(a)(x-a)$ is the best affine approximation of the function $f$ at $a$.
Lemma: If a differentiable function $f$ can be written, for all $x$ in some neighborhood of $a$, as $$f(x) = A + B\cdot(x-a) + R(x-a)$$ where $A, B$ are constants and $R\in o(x-a)$, then $A=f(a)$ and $B=f'(a)$.
Proof: Because $R\in o(x-a)$, we have $R(x-a) = (x-a)\cdot \frac{R(x-a)}{x-a} \to 0$ as $x\to a$. Since $f$ is continuous at $a$ (being differentiable there), letting $x\to a$ in the equation gives $f(a)=A$.
Then, rearranging the equation we get (for all $x\ne a$)
$$\frac{f(x)-f(a)}{x-a} = \frac{f(x)-A}{x-a} = \frac{B\cdot (x-a)+R(x-a)}{x-a} = B + \frac{R(x-a)}{x-a}$$
Then taking the limit as $x\to a$ we see that $B=f'(a)$. $\ \ \ \square$
Theorem: A function $f$ is differentiable at $a$ iff, for all $x$ in some neighborhood of $a$, $f(x)$ can be written as
$$f(x) = f(a) + B\cdot(x-a) + R(x-a)$$ where $B \in \Bbb R$ and $R\in o(x-a)$.
Proof: "$\implies$": If $f$ is differentiable then $f'(a) = \lim_{x\to a} \frac{f(x)-f(a)}{x-a}$ exists. This can alternatively be written $$f'(a) = \frac{f(x)-f(a)}{x-a} + r(x-a)$$ where the "remainder function" $r$ has the property $\lim_{x \to a} r(x-a)=0$. Rearranging this equation we get $$f(x) = f(a) + f'(a)(x-a) -r(x-a)(x-a).$$ Let $R(x-a):= -r(x-a)(x-a)$. Then clearly $R\in o(x-a)$ (confirm this for yourself). So $$f(x) = f(a) + f'(a)(x-a) + R(x-a)$$ as required.
"$\impliedby$": Simple rearrangement of this equation yields
$$B + \frac{R(x-a)}{x-a}= \frac{f(x)-f(a)}{x-a}.$$ The limit as $x\to a$ of the LHS exists and thus the limit also exists for the RHS. This implies $f$ is differentiable by the standard definition of differentiability. $\ \ \ \square$
Taken together, the above lemma and theorem tell us not only that $L(x) = f(a) + f'(a)(x-a)$ is the only affine function whose remainder tends to $0$ as $x\to a$ faster than $x-a$ itself (this is the sense in which this approximation is the best), but also that we can even define the concept of differentiability by the existence of this best affine approximation.
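Here is a small Python sketch of the lemma at work, with the (hypothetical, chosen just for illustration) function $f(x)=e^x$ and $a=0$: the tangent line's remainder divided by $x-a$ tends to $0$, while a line with any other slope leaves a residue.

```python
import math

a = 0.0
for h in (0.1, 0.01, 0.001):
    # Remainders f(a+h) - T(a+h) for two affine approximations T.
    r_tangent = math.exp(a + h) - (math.exp(a) + 1.0 * h)  # slope f'(0) = 1
    r_wrong   = math.exp(a + h) - (math.exp(a) + 1.1 * h)  # slope 1.1 instead
    print(f"h = {h}: R_tangent/h = {r_tangent / h:+.4f}, "
          f"R_wrong/h = {r_wrong / h:+.4f}")
```

The first column shrinks like $h/2$; the second tends to $-0.1$, the slope error.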
Best Answer
Given a function $f$, polynomials $p_1$ and $p_2$, and some $x_0$, we can define "better" as meaning that there is a neighborhood in which it is a better approximation. That is, if there exists $\epsilon>0$ such that $(0<|x-x_0|<\epsilon) \rightarrow (|p_1(x)-f(x)|<|p_2(x)-f(x)|)$, then near $x_0$, $p_1$ is a better approximation to $f$ than $p_2$ is.
So Taylor polynomials can, with this definition, be said to be better than any other polynomial of the same order. That is, if $f$ is analytic and $T_n$ is the $n$th order Taylor polynomial of $f$ at $x_0$, then for any other $n$th order polynomial $g \neq T_n$, there exists a neighborhood of $x_0$ in which $T_n$ is better than $g$.
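As a concrete illustration (a Python sketch; the particular $f$, $g$, and $x_0$ are my own choices), near $x_0=0$ the 3rd-order Taylor polynomial of $\sin$ beats the competing cubic $g(x)=x-0.2x^3$:

```python
import math

def T3(x):
    """3rd-order Taylor polynomial of sin at 0."""
    return x - x ** 3 / 6

def g(x):
    """Some other 3rd-order polynomial."""
    return x - 0.2 * x ** 3

for x in (0.5, 0.1, 0.01):
    print(f"x = {x}: |T3 - sin| = {abs(T3(x) - math.sin(x)):.2e}, "
          f"|g - sin| = {abs(g(x) - math.sin(x)):.2e}")
```

The Taylor error shrinks like $x^5$ while the competitor's shrinks only like $x^3$, so $T_3$ wins in every sufficiently small neighborhood.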