The distributions for which this is possible are quite limited. A theorem of Dynkin, reported in Ferguson (1962) as Theorem 1, states that the only one-dimensional exponential families with a location parameter are those with densities of the form
$$f(x)=\exp\left\{ \sum_{i=1}^m e^{\alpha_ix} p_i(x)\right\}$$
with respect to Lebesgue measure, where the $p_i$'s are polynomials. A second result, due to Lindley and reported in Ferguson (1962) as Theorem 2, is that the only one-parameter exponential family with a location parameter is the family of rescaled log-Gamma distributions. A third result of Dynkin, also in Ferguson (1962), is that the only location-scale exponential families with $k$ parameters are those for which $k$ is even and the log-density is a polynomial in $x$ of degree $k$.
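As a concrete illustration of Lindley's result (my own example, not taken from Ferguson's paper): the standard Gumbel distribution, i.e. the law of $-\log Y$ for $Y \sim \text{Gamma}(1,1)$, shifted by a location parameter $\mu$, is simultaneously a location family and a one-parameter exponential family,
$$ f(x\mid\mu) = \exp\left\{-(x-\mu) - e^{-(x-\mu)}\right\} = \underbrace{e^{-x}}_{h(x)}\,\underbrace{e^{\mu}}_{g(\mu)}\,\exp\left\{\underbrace{\left(-e^{\mu}\right)}_{\eta(\mu)}\,\underbrace{e^{-x}}_{u(x)}\right\}, $$
and it also matches Dynkin's form above with $m=2$, $\alpha_1 = 0$, $p_1(x) = \mu - x$, $\alpha_2 = -1$, $p_2(x) = -e^{\mu}$.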
I have answered my own question. It turned out to be a rather obvious application of Bayes' rule, but only after making a somewhat arbitrary assumption. My question was not very clear, mostly due to my own tenuous understanding at the time.
However, this result is used quite a lot in the machine learning literature on integrating out missing variables. I am including the proof in case others find it helpful when they come across the result.
Suppose the joint distribution of $(x, y)$ belongs to an exponential family,
$$ P(x, y \mid \boldsymbol\theta) = h(x) \exp\left(\eta(\boldsymbol\theta) \cdot T(x, y) - A(\boldsymbol\theta)\right). $$
By Bayes' rule,
$$
P(y \mid x, \theta) = \frac{P(x \mid y, \theta)\, P(y \mid \theta)}{\int_{y'} P(x \mid y', \theta)\, P(y' \mid \theta)\, dy'}
= \frac{P(x, y \mid \theta)}{\int_{y'} P(x, y' \mid \theta)\, dy'}
= \frac{h(x) \exp\left(\eta(\theta) \cdot T(x, y) - A(\theta)\right)}{\int_{y'} h(x) \exp\left(\eta(\theta) \cdot T(x, y') - A(\theta)\right) dy'}.
$$
We assumed the base measure $h(x)$ to be a function of $x$ only, so that it cancels (along with the factor $e^{-A(\theta)}$) from the numerator and denominator in the last expression above, giving
$$
\frac{\exp\left(\eta(\theta) \cdot T(x, y)\right)}{\int_{y'} \exp\left(\eta(\theta) \cdot T(x, y')\right) dy'}
= \exp\left(\eta(\theta) \cdot T(x, y) - \log \int_{y'} \exp\left(\eta(\theta) \cdot T(x, y')\right) dy'\right)
= \exp\left(\eta(\theta) \cdot T(x, y) - A(\theta \mid x)\right).
$$
So the conditional $P(y \mid x, \theta)$ is itself in exponential family form in $y$, with the same natural parameter $\eta(\theta)$ and sufficient statistic $T(x, y)$, but with a log-partition function $A(\theta \mid x)$ that now depends on $x$.
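As a quick numerical sanity check (my own addition, not part of the original answer), the sketch below takes a standard bivariate normal with correlation $\rho$ as the joint exponential family, computes the conditional density from $\exp\left(\eta(\theta) \cdot T(x, y) - A(\theta \mid x)\right)$ with $A(\theta \mid x)$ obtained by numerical integration, and compares it with the known conditional $y \mid x \sim N(\rho x,\, 1 - \rho^2)$. The parameter values and variable names are arbitrary choices of mine.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Joint: standard bivariate normal with correlation rho, which is an
# exponential family with h(x, y) = 1, sufficient statistic
# T(x, y) = (x^2, y^2, x*y) and the natural parameters eta below.
rho = 0.6
eta = np.array([-1.0 / (2 * (1 - rho**2)),
                -1.0 / (2 * (1 - rho**2)),
                rho / (1 - rho**2)])

def eta_dot_T(x, y):
    return eta @ np.array([x**2, y**2, x * y])

x, y = 0.8, -0.3

# Log-partition of the conditional: A(theta | x) = log of the integral of exp(eta . T(x, y')) dy'
A_given_x = np.log(quad(lambda yp: np.exp(eta_dot_T(x, yp)), -np.inf, np.inf)[0])

# Conditional density from the exponential family formula derived above
cond_from_formula = np.exp(eta_dot_T(x, y) - A_given_x)

# Known conditional of a standard bivariate normal: y | x ~ N(rho*x, 1 - rho^2)
cond_exact = stats.norm.pdf(y, loc=rho * x, scale=np.sqrt(1 - rho**2))

print(cond_from_formula, cond_exact)  # agree to numerical precision
```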
Best Answer
First, note there is a terminology problem in your title: "the exponential family" seems to imply that there is only one exponential family. You should say "an exponential family"; there are many exponential families.
Well, one consequence of your definition $$p(\mathbf x|\boldsymbol \eta) = h(\mathbf x) g(\boldsymbol \eta) \exp \{\boldsymbol \eta^\mathrm T \mathbf u(\mathbf x)\}$$ is that the support of the distribution indexed by the parameter $\eta$ does not depend on $\eta$. (The support of a probability distribution is the closure of the smallest set with probability one, or in other words, where the distribution lives.) So it is enough to give a counterexample of a distribution family whose support depends on the parameter; the easiest example is the following family of uniform distributions: $\text{U}(0, \eta), \quad \eta > 0$. (The other answer by @Chaconne gives a more sophisticated counterexample.)
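To spell the support argument out (my own elaboration, not part of the answer above): if $\text{U}(0, \eta)$ could be written in the stated form, then since $g(\eta) \exp\{\eta\, u(x)\} > 0$ for all $x$ and $\eta$, we would have
$$ \{x : p(x \mid \eta) > 0\} = \{x : h(x) > 0\} \quad \text{for every } \eta > 0, $$
a support that does not vary with $\eta$, contradicting the fact that $p(x \mid \eta) = \eta^{-1}\mathbf 1_{(0,\eta)}(x)$ has support $(0, \eta)$.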
Another, unrelated reason that not all distributions belong to an exponential family is that an exponential family distribution always has a moment generating function (at least for its natural sufficient statistic, in a neighbourhood of zero). Not all distributions have an mgf; the Cauchy distribution is a standard example.
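As a small numerical illustration (my own addition; the Cauchy example and the values below are arbitrary), the integrand of the would-be mgf $E[e^{tX}]$ for a standard Cauchy variable grows without bound as $x \to \infty$ for any $t > 0$, so the expectation is infinite and no mgf exists.

```python
import numpy as np

# Integrand of the (would-be) mgf of a standard Cauchy variable:
# e^{t x} times the Cauchy density 1 / (pi * (1 + x^2)).
t = 0.1
x = np.array([10.0, 100.0, 1000.0])
integrand = np.exp(t * x) / (np.pi * (1.0 + x**2))

# The integrand explodes instead of decaying, so E[exp(t X)] = +inf
# for every t > 0 (and, by symmetry of the argument, for t < 0 too).
print(integrand)  # ~8.6e-03, ~7.0e-01, ~8.6e+36
```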