Kelley (General Topology, p. 244/245 in exercise R of the chapter on function spaces) formulates it as follows:
If $X$ is a topological space and the family $C(X)$ of all continuous real-valued functions on $X$ is given the topology of uniform convergence on compacta (which is the compact-open topology according to Thm. 11 in the same chapter), then each subalgebra of $C(X)$ that has the two-point property is dense in $C(X)$.
(The two-point property is that for each $x \neq y$ in $X$ and any two reals $a,b$ there is an $f$ in the subalgebra with $f(x) =a \land f(y)=b$. According to the exercise's preamble, it's implied by all constant functions being in the subalgebra plus the subalgebra separating points, and that is indeed easy to see. Baby Rudin shows that nowhere vanishing plus separating points together also imply it, so you don't always need all constant functions to be in the subalgebra. Having constants + nowhere vanishing together does not imply it, though: constants already make the subalgebra nowhere vanishing, and the algebra of constant functions alone lacks the two-point property as soon as $X$ has two points.)
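For the record, the "easy to see" implication can be written out in one line. This is only a sketch: $A$ is my name for the subalgebra, and $h \in A$ is any function with $h(x) \neq h(y)$, which exists because $A$ separates points. Put

$$ f = a + \frac{b-a}{h(y)-h(x)}\,\bigl(h - h(x)\bigr). $$

Then $f \in A$ because $A$ is a linear space containing the constants, and substituting gives $f(x) = a$ and $f(y) = a + (b-a) = b$.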
He refers to
M.H. Stone The generalized Weierstrass approximation theorem, Math. Mag. 21 (1948) 167-184, 254-273
for further discussion (so I presume there's a proof or reference to it there).
I couldn't find a similarly broad theorem in Engelking (usually my go-to book), though he refers to the same paper, as Willard does too BTW (!)
The Encyclopedia of General Topology's chapter on function spaces mentions (p. 151, section 7, almost as an aside) that the theorem holds for general spaces and subalgebras with constant functions and separating points (so implying the two-point property again) in the compact-open topology, but gives no explicit reference for it. The three general references it does give might help, though.
Maybe the proof reduces to the compact case (for which Engelking has a full proof, and Kelley extensive hints) in some way? I'd look for the M.H. Stone paper and see what it has. It might be the original reference for all of this. (It isn't called the Stone-Weierstrass theorem for nothing: Weierstrass did the special case of polynomials on a compact real interval, nothing as general as Stone's contribution.)
Here's what I would do. Let $f\in\mathcal{A}$ be a nowhere vanishing function. Replacing $f$ with $f^2$, we may assume $f>0$ everywhere, and scaling $f$, we may assume $f\geq 1$ everywhere (this uses compactness of $X$: a positive continuous function on a compact space attains a positive minimum). Let $g:\mathbb{R}\to\mathbb{R}$ be a continuous function such that $g(0)=0$ and $g(x)=1$ for all $x\geq 1$. By the Weierstrass approximation theorem, we can uniformly approximate $g$ with polynomials $p_n$ on the interval $[0,\|f\|]$. Subtracting the constant terms from these $p_n$, we get polynomials $q_n$ which still uniformly approximate $g$, since the constant terms are the values $p_n(0)$, which converge to $g(0)=0$. Since $q_n$ does not have a constant term, $q_n(f)\in\mathcal{A}$ for each $n$. Since $f$ takes values in $[1,\|f\|]$ everywhere, $q_n(f)$ converges uniformly to $g(f)$, which is just the constant function $1$. So the uniform closure of $\mathcal{A}$ contains the constants.
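The convergence claim at the end can be made quantitative. As a sketch, write $\varepsilon_n = \sup_{t\in[0,\|f\|]}|p_n(t)-g(t)|$ (my notation, not part of the argument above), so that $|p_n(0)| = |p_n(0)-g(0)| \le \varepsilon_n$ and $q_n = p_n - p_n(0)$. For every $x \in X$ we have $f(x)\in[1,\|f\|]$ and hence $g(f(x)) = 1$, so

$$ |q_n(f(x)) - 1| \le |p_n(f(x)) - g(f(x))| + |p_n(0)| \le 2\varepsilon_n . $$

The bound does not depend on $x$, and $\varepsilon_n \to 0$, which gives $q_n(f) \to 1$ uniformly on $X$.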
Best Answer
The Stone-Weierstrass theorem can be generalized in various ways, as discussed below (mostly based on General Topology by Willard, section 44). But presenting things in the greatest possible generality usually goes counter to the purpose of writing a textbook. Textbook authors are more concerned with presenting an insightful, digestible proof, and with proving the results that are actually used later.
Functions vanishing at infinity
The easiest generalization is to consider the algebra $C_0(X)$ of functions "vanishing at infinity" on a locally compact space $X$. This is the form that Wikipedia presents. It's not really much of a generalization, since one can consider the one-point compactification of $X$, denoted $\widehat {X}=X\cup\{\infty\}$, and extend functions to $\infty$ by zero. By adjoining the constant functions to the subalgebra $A$, one gets to the point where the compact case of Stone-Weierstrass can be used. Then one observes that to approximate a function that vanishes at $\infty$, one does not need the constant functions after all.
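Here is a sketch of that last observation. I am assuming the usual hypotheses for this version, namely that $A\subset C_0(X)$ separates points and vanishes nowhere. Then $A+\mathbb{R} = \{f + c : f\in A,\ c\in\mathbb{R}\}$ is a subalgebra of $C(\widehat X)$ that contains the constants and separates points, including $\infty$ from any $x\in X$, since some $f\in A$ has $f(x)\ne 0 = f(\infty)$. Given $h\in C_0(X)$ and $\varepsilon>0$, the compact case yields $f\in A$ and $c\in\mathbb{R}$ with $\|h-f-c\|_\infty<\varepsilon$. Evaluating at $\infty$, where $h$ and $f$ both vanish,

$$ |c| = |h(\infty)-f(\infty)-c| \le \|h-f-c\|_\infty < \varepsilon, \qquad \|h-f\|_\infty \le \|h-f-c\|_\infty + |c| < 2\varepsilon . $$

So $h$ is already approximated by elements of $A$ itself.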
Bounded functions
The space of all bounded real-valued continuous functions $C_b(X)$ still has the uniform norm, so one can hope to generalize the Stone-Weierstrass theorem verbatim. This does not actually work, though: let $X=[0,\infty)$ and denote by $A$ the algebra of all continuous functions $f\colon X\to\mathbb R$ such that $\lim_{x\to\infty} f(x)$ exists. This is a closed subalgebra which vanishes nowhere and separates points, but it does not coincide with $C_b(X)$; for example $\sin x\notin A$.
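In fact $\sin x$ is not even in the uniform closure of $A$. A quick check, with $L=\lim_{x\to\infty}f(x)$ for an arbitrary $f\in A$ (the symbol $L$ is my notation): since $f(x)\to L$, the quantities $|\sin x - f(x)|$ and $|\sin x - L|$ have the same $\limsup$, and $\sin x$ keeps returning to both $1$ and $-1$, so

$$ \|\sin - f\|_\infty \ \ge\ \limsup_{x\to\infty}|\sin x - f(x)| \ =\ \limsup_{x\to\infty}|\sin x - L| \ =\ \max\bigl(|1-L|,\,|1+L|\bigr) \ =\ 1+|L| \ \ge\ 1 . $$

Every element of $A$ is therefore at uniform distance at least $1$ from $\sin x$.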
To state a version of the Stone-Weierstrass theorem for this case, let's say that $Z\subset X$ is a zero set if there exists a continuous function $f:X\to\mathbb{R}$ such that $Z=f^{-1}(0)$. An algebra $A$ separates zero sets if for any two disjoint zero sets $Z_1$ and $Z_2$ there is $f\in A$ such that $\overline{f(Z_1)}\cap \overline{f(Z_2)}$ is empty.
Theorem. Suppose $X$ is a Tychonoff space (also known as $T_{3\frac12}$ space). If an algebra $A\subset C_b(X)$ separates zero sets, contains constant functions, and is closed in the uniform norm, then $A=C_b(X)$.
The proof still involves compactification of $X$, but this time one needs the Stone–Čech compactification.
Compact-open topology
If one considers $C(X)$, the set of all continuous functions on a non-compact space $X$, then that's no longer a normed space. While uniform convergence still makes sense, one can't hope to have uniform approximation here: for example, $f(x)=e^x$ cannot be uniformly approximated by polynomials on $\mathbb{R}$. A natural decision is to equip $C(X)$ with the compact-open topology; i.e., it's a locally convex space with topology generated by seminorms $\|f\|_K = \sup_K|f|$, for all compact subsets $K\subset X$.
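The claim about $e^x$ can be checked directly. Suppose, for contradiction, that some polynomial $p$ of degree $d$ satisfied $\sup_{x\in\mathbb{R}}|e^x-p(x)|\le 1$ (the bound $1$ is an arbitrary choice; any finite bound works the same way). Then for $x>0$,

$$ \frac{e^x}{x^{d+1}} \ \le\ \frac{|p(x)|+1}{x^{d+1}} \ \xrightarrow[x\to\infty]{}\ 0, $$

while the left-hand side tends to $\infty$. So no polynomial stays within a bounded distance of $e^x$ on all of $\mathbb{R}$, let alone within $\varepsilon$.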
Theorem. Suppose $X$ is a Tychonoff space. If $A\subset C(X)$ is an algebra that is closed in the compact-open topology, separates the points of $X$, and contains the constant functions, then $A=C(X)$.
For example, this implies that for every continuous function $f:\mathbb{R}\to\mathbb{R}$ there is a sequence of polynomials $p_n$ such that $$ \forall M\ \lim_{n\to\infty}\sup_{|x|\le M}|f(x)-p_n(x)| =0. $$
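One way to extract a single sequence from the density statement, sketched under the theorem above: the polynomials form an algebra in $C(\mathbb{R})$ that separates points and contains the constants, so their closure in the compact-open topology is all of $C(\mathbb{R})$. Applying this with the compact set $[-n,n]$ and tolerance $1/n$ gives, for each $n$, a polynomial $p_n$ with $\sup_{|x|\le n}|f(x)-p_n(x)|<1/n$. Then for any fixed $M$ and all $n\ge M$,

$$ \sup_{|x|\le M}|f(x)-p_n(x)| \ \le\ \sup_{|x|\le n}|f(x)-p_n(x)| \ <\ \frac1n \ \xrightarrow[n\to\infty]{}\ 0 . $$

The same diagonal choice works whenever $X$ is a countable union of compact sets $K_1\subset K_2\subset\cdots$ such that every compact subset of $X$ lies in some $K_n$.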
Topology of uniform convergence
Willard presents two more forms of the Stone-Weierstrass theorem, using the topology of uniform convergence on $C(X)$ (which makes it a very disconnected topological space). They impose strong additional assumptions on the algebras of functions and are too involved to be stated here.