Based on Alex Ravsky's hint, I have found the solution. I will type it up for the sake of reference.
We claim that the function
$$ f(x)=\frac{1}{x+1} $$
is a continuous function on $(-1, 1)$ that cannot be approximated by a polynomial. Assume not. Then, for $\epsilon=1$ in the definition of uniform convergence, there exists a polynomial $p(x)$ such that
$$
|f(x)-p(x)|\le 1
$$
for $all$ $x\in (-1, 1)$. Since the polynomial $p(x)$ is bounded on $(-1, 1)$, it follows that, there exists a constant $M$ such that $|p(x)|\le M$ for all $x\in (-1, 1)$. But then,
$$
|f(x)|\le |p(x)| + |f(x)-p(x)| \le M+1
$$
for all $x\in (-1,1)$ which contradicts the fact that $f(x)$ is unbounded on $(-1, 1)$.
In order to see why we use Bernstein polynomials, first note that what we essentially want to do is approximate the identity with polynomials, this will be easier to understand if you are familiar with convolutions, and approximations of the Dirac delta $\displaystyle\delta(x)$. We want to do a sort of convolution, or weighted average, something like $(p_{n,k}\ast f)(x)=\sum_{j=0}^np_{n,j}(x-y_j)f(y_j)$ of a sequence of polynomials $p_{n,k}$ of degree $n$ with a continuous function $f$, let's say on $[0,1]$, such that $p_{n,k}\ast f\to f$ as $n\to\infty$.
Being a weighted average, we should have $(p_{n,k}\ast 1)(x)=\sum_{j=0}^n p_{n,k}(y)=1$. So we want to write $1$ as a sum of polynomials of degree $n$.
The easiest way to do this is to notice that $1=1-x+x$, hence $\displaystyle 1^n=(1-x+x)^n=\sum_{k=0}^n {n\choose k}(1-x)^{n-k}x^k$. Now we want to insert $f(x)$ in here somehow so that when $n$ is large, $(p_{n,k}\ast f)(x)$ will be approximately equal to $f(x)$. Now if you plot $\displaystyle{n\choose k}(1-x)^{n-k}x^k$, you will see that it is largest when $\frac{k}{n}\approx x$, and for large $n$ $\displaystyle{n\choose k}(1-x)^{n-k}x^k\approx 0$ when $\frac{k}{n}$ is far from $x$. So if we put $\displaystyle(p_{n,k}\ast f)(x)=\sum_{k=0}^n {n\choose k}(1-x)^{n-k}x^k f\Big(\frac{k}{n}\Big)$, then for large $n$ the summands are $\approx 0$ when $\frac{k}{n}$ is far from $x$, so they hardly contribute to the sum, hence for large $n$ $\displaystyle(p_{n,k}\ast f)(x)=\sum_{k=0}^n {n\choose k}(1-x)^{n-k}x^k f\Big(\frac{k}{n}\Big)\approx f(x)\sum_{k=0}^n {n\choose k}(1-x)^{n-k}x^k=f(x)$.
To elaborate on why, for large $n$, $\displaystyle{n\choose k}(1-x)^{n-k}x^k$ is largest when $\frac{k}{n}\approx x$, you may want to read about the random walk if you don't already know it, but I will explain. Let's say you are dizzy, or in some confused state (it's usually said that you are drunk) and you are taking steps left and right, with a probability $x$ of going to the right, $1-x$ of going to the left, with each step. If you take a total of $n$ steps, then the probability of $k$ of those being to the right is $\displaystyle{n\choose k}(1-x)^{n-k}x^k$. Now $\frac{k}{n}$ is just the ratio of steps taken to the right to the number of steps taken in total. Now the probability of taking $k$ steps to the right is going to be highest when the ratio of steps taken to the right to the total number of steps taken is just equal to the probability of taking a step to the right; this is because you should expect that the number of steps taken to the right compared to the total number of steps taken is just the probability of going to the right, ie. when $\frac{k}{n}\approx x$. When the ratio of steps taken to the right to the total number of steps is far from the probability of going right, then $\displaystyle{n\choose k}(1-x)^{n-k}x^k$ will be $\approx 0$. And when you increase $n$ to a very large number, it becomes increasingly unlikely that you deviate from what should happen, this is the content of the law of large numbers, so these approximations get better and better.
Best Answer
No bounded continuous function from $\mathbb R$ into itself (other than a constant) can be approximated uniformly on $\mathbb R$ by polynomials. This is because any non-constant polynomial $p$ has the property $|p(x)| \to \infty$ as $x \to \infty$. In particular $\sin\, x$ can be approximated by polynomials uniformly on any interval of the type $[-N,N]$ but not on the whole real line.