Polynomial regression is in effect multiple linear regression: set $X_1=X$ and $X_2=X^2$; then $E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2$ is the same as $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$.
As such, methods for constructing confidence intervals for parameters (and for the conditional mean) in multiple regression carry over directly to the polynomial case. Most regression packages will compute these for you. Yes, the interval can be computed with the formula you suggest (provided the assumptions needed for the $t$-interval hold), using the residual d.f. for the $t$ critical value (in R, these are available from the summary output).
The R function `confint` can be used to construct confidence intervals for the parameters of a regression model; see `?confint`.
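For instance, a minimal sketch with simulated data (the data and variable names here are illustrative, not from the question):

```r
# Simulated quadratic relationship
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x - 0.1 * x^2 + rnorm(100, sd = 0.5)

# Fit a quadratic polynomial; raw = TRUE gives coefficients of x and x^2 directly
fit <- lm(y ~ poly(x, 2, raw = TRUE))

# 95% confidence intervals for the regression coefficients
confint(fit, level = 0.95)
```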
In the case of a confidence interval for the conditional mean, let $X$ be the model matrix (including the column of ones for the intercept), whether for polynomial regression or any other multiple regression model; let $x_i=(1, x_{1i},x_{2i},\ldots,x_{pi})$ be its $i$-th row; let the estimated variance of the fitted mean at $x_i$ be $v_i=\hat{\sigma}^2\, x_i(X'X)^{-1}x_i'$, and let $s_i=\sqrt{v_i}$ be the corresponding standard error.
Let $t$ be the upper $\alpha/2$ critical value of the $t$ distribution on $n-p-1$ d.f. (the residual d.f.). Then the pointwise $1-\alpha$ confidence interval for the mean at $x_i$ is $\hat{y}_i\pm t\cdot s_i$.
Also, the R function `predict` can be used to construct CIs for $E(Y\mid X)$; see `?predict.lm`.
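Continuing the illustrative sketch above, `predict` with `interval = "confidence"` reproduces the interval from the formula just described:

```r
# Confidence intervals for the conditional mean via predict()
new_x <- data.frame(x = c(2, 5, 8))
predict(fit, newdata = new_x, interval = "confidence", level = 0.95)

# The same intervals computed directly from the formula above
Xmat   <- model.matrix(fit)                                    # model matrix (1, x, x^2)
x0     <- model.matrix(~ poly(x, 2, raw = TRUE), data = new_x) # rows for the new points
sigma2 <- summary(fit)$sigma^2                                 # estimated residual variance
v      <- sigma2 * diag(x0 %*% solve(crossprod(Xmat)) %*% t(x0))
s      <- sqrt(v)                                              # standard errors of the fitted mean
tcrit  <- qt(0.975, df = df.residual(fit))                     # residual d.f.
yhat   <- drop(x0 %*% coef(fit))
cbind(fit = yhat, lwr = yhat - tcrit * s, upr = yhat + tcrit * s)
```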
[At least when doing polynomial regression with an intercept, it makes sense to use orthogonal polynomials, but if the spread of $X$ is large compared to its mean and the degree is low (such as quadratic), it won't be so critical. I tend to use them anyway, because the linear and quadratic terms are then easier to interpret.]
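In R, `poly(x, 2)` uses orthogonal polynomials by default; a brief sketch (reusing the simulated data above) showing that the choice of basis does not change the fit:

```r
# Orthogonal polynomials (the default) versus raw powers of x
fit_orth <- lm(y ~ poly(x, 2))              # orthogonal basis
fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE))  # raw x and x^2

# The coefficients differ, but the fitted values (and mean CIs) are identical
all.equal(fitted(fit_orth), fitted(fit_raw))
```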
The underlying model is
$$E[\log Y] = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$
or, in terms of error terms $\varepsilon_i,$
$$\log Y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + \varepsilon_i.\tag{*}$$
When we assume the conditional distribution of $\log Y$ is Normal, the Ordinary Least Squares (OLS) estimate of $E[\log Y]$ is also Normal, because that estimate is an affine combination of the errors. Suppose $\sigma^2$ is the true (but unknown) variance of the errors $\varepsilon_i$. Then
$$E[Y] = e^{\sigma^2/2} e^{E[\log Y]}.$$
(This is a readily-calculated property of Lognormal distributions: see Wikipedia, for instance.)
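Concretely, if $\log Y \sim \text{Normal}(\mu, \sigma^2),$ then by the Normal moment generating function
$$E[Y] = E\left[e^{\log Y}\right] = e^{\mu + \sigma^2/2} = e^{\sigma^2/2}\, e^{E[\log Y]}.$$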
Wooldridge plugs the estimates of $\sigma^2$ and $E[\log Y]$ into this formula. As such, it can be viewed as a method of moments estimate of $E[Y].$
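A minimal sketch of this plug-in estimate in R, assuming a hypothetical data frame `dat` with a positive response `y` and predictors `x1`, `x2` (for `lm` fits, `sigma(fit)^2` estimates $\sigma^2$ from the residuals):

```r
# Fit the log-linear model (*) by OLS
fit_log <- lm(log(y) ~ x1 + x2, data = dat)

# Plug-in ("method of moments") estimate of E[Y | x]
sigma2_hat <- sigma(fit_log)^2      # estimate of the error variance
E_logY_hat <- predict(fit_log)      # estimate of E[log Y | x]
E_Y_hat    <- exp(sigma2_hat / 2) * exp(E_logY_hat)
```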
Although intuitively reasonable, this estimator is not necessarily the best or even a good one. For instance, it is biased: see https://stats.stackexchange.com/a/105734/919 for a discussion and a derivation of an unbiased version. Its main flaw is extreme sensitivity to the precision of the estimate $\hat \sigma^2:$ to use it reliably, you want either a great deal of data or for $\sigma^2$ to be very small.
In light of this, you may indeed consider using the estimate
$$\widehat Y = \exp\left(\widehat {E[\log Y]}\right).$$
This estimates the geometric mean of the conditional response (essentially by definition of geometric mean). In some applications it might be a better choice. After all, when you fit the logarithms of your data using OLS you were downweighting underestimates of $Y$ compared to overestimates, demonstrating you really don't want accurate estimates of $E[Y]$ itself. If you did, you would have fit the nonlinear least-squares model
$$E[Y] = \exp\left(\alpha_0+ \alpha_1 x_1 + \cdots + \alpha_k x_k\right) .$$
If you want to express the error terms $\delta_i$ explicitly, this is equivalent to
$$Y_i = e^{\alpha_0}\, \left(e^{x_{1i}}\right)^{\alpha_1}\,\cdots\,\left(e^{x_{ki}}\right)^{\alpha_k} + \delta_i.\tag{**}$$
It is instructive to compare this to the exponential of $(*)$ which asserts
$$Y_i = e^{\beta_0}\, \left(e^{x_{1i}}\right)^{\beta_1}\,\cdots\,\left(e^{x_{ki}}\right)^{\beta_k} \, e^{\varepsilon_i}.$$
Where $(*)$ posits multiplicative errors $\cdot e^{\varepsilon_i},$ $(**)$ posits additive errors $+\delta_i.$ That's the basic difference between the two models. (And, as a result, the values of the $\alpha_j$ will not equal the corresponding $\beta_j$ and their estimates will often differ, too.)
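To make the contrast concrete, here is a rough sketch of fitting both models in R, again assuming a hypothetical data frame `dat` with positive response `y` and predictors `x1`, `x2`; the log-scale coefficients are used only as starting values for the nonlinear fit:

```r
# Multiplicative errors, as in (*): OLS on the log scale
fit_mult <- lm(log(y) ~ x1 + x2, data = dat)

# Additive errors, as in (**): nonlinear least squares for E[Y] = exp(a0 + a1*x1 + a2*x2)
# (glm(y ~ x1 + x2, family = gaussian(link = "log"), data = dat) is an alternative)
start_vals <- unname(coef(fit_mult))
fit_add <- nls(y ~ exp(a0 + a1 * x1 + a2 * x2),
               data  = dat,
               start = list(a0 = start_vals[1], a1 = start_vals[2], a2 = start_vals[3]))

# The alpha's generally differ from the corresponding beta's
coef(fit_mult)
coef(fit_add)
```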
Actually, the interval carries over just fine. Because the transformation is monotonic, the probability statement that applies on the log scale transfers directly to the original scale, so as long as the assumptions under which the log-scale interval was computed hold, the back-transformed endpoints give a valid interval for the corresponding parameter on the original scale.
It's the estimate that may be problematic (but may be okay, depending on what you want). Note that $E[\exp(X)]\neq \exp(E[X])$ when $\sigma_X^2>0$ (Jensen's inequality); if the log-scale estimate is unbiased, the back-transformed estimate is biased.
If you're happy to have an estimate that's median-unbiased, then the back-transformed estimate is also okay, for the same reason that the interval works.
If you seek mean-unbiasedness, there are several choices. For example, if you're prepared to assume a normal distribution for $\hat\beta$, you can remove the bias by using the properties of the lognormal distribution. Alternatively, you can use a Taylor expansion to get an approximate adjustment (details are in a number of posts on this site). If the standard error of the estimate is small, it won't matter much. Other approaches exist as well.
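As a rough sketch of both points, assuming a simple log-scale fit to a hypothetical data frame `dat` with positive response `y` and predictor `x`: the interval endpoints can simply be exponentiated, while the point estimate can be adjusted with the lognormal correction if (approximate) mean-unbiasedness is wanted.

```r
fit <- lm(log(y) ~ x, data = dat)

# Log-scale confidence interval for the conditional mean at new x values
new_x  <- data.frame(x = c(1, 2, 3))
ci_log <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)

# Back-transformed interval: valid because exp() is monotonic
ci_orig <- exp(ci_log[, c("lwr", "upr")])

# Point estimates: naive back-transform (estimates the conditional median /
# geometric mean) versus the lognormal-corrected estimate of the conditional mean
est_median <- exp(ci_log[, "fit"])
est_mean   <- exp(ci_log[, "fit"] + sigma(fit)^2 / 2)
```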