Most of this is standard theory of path integrals known to mathematical physicists so I will try to address all of your questions.
First let me say that the hypotheses you list for the action $S$ to make the path integral well defined, i.e. that $S=Q+V$ where $Q$ is quadratic and non-degenerate and $V$ is bounded, are extremely restrictive. One should think of $V$ as defining the potential energy for interactions of the physical system, and while it is certainly true that one expects this to be bounded below, there are very few physical systems where it is also bounded above (this is also true for interesting mathematical applications...). Essentially, requiring that the potential be bounded implies that the asymptotic behavior of $S$ in the configuration space is totally controlled by the quadratic piece. Since path integrals with quadratic actions are trivial to define and evaluate, it is not really that surprising or interesting that by bounding the potential one can make the integral well behaved.
Next you ask if anyone has studied the question of when an action $S$ gives rise to a well defined path integral: $ \int \mathcal{D}f \ e^{-S[f(x)]}$
The answer of course is yes. The people who come to mind first are Glimm and Jaffe, who have made whole careers studying this issue. In all cases of interest $S$ is an integral $S=\int L$ where the integral is over your spacetime manifold $M$ (in the simplest case $\mathbb{R}^{n}$) and the problem is to constrain $L$. The general problem remains unsolved, but nevertheless there are some existence proofs. The basic example is a scalar field theory, i.e. we are trying to integrate over a space of maps $\phi: M \rightarrow \mathbb{R}$. We take an $L$ of the form:
$ L = -\phi\Delta \phi +P(\phi)$
where $\Delta$ is the Laplacian and $P$ is a polynomial. The main nontrivial result is then that if $M$ is three dimensional, and $P$ is bounded below with degree less than seven, then the functional integral exists rigorously. Extending this analysis to the case where $M$ has dimension four is a major unsolved problem.
Moving on to your next point, you ask about another approach to path integrals called perturbation theory. The typical example here is when the action is of the form $S= Q+\lambda V$ where $Q$ is quadratic, $V$ is not, and $\lambda$ is a parameter. We attempt a series expansion in $\lambda$. The first thing to say here, and this is very important, is that in doing this expansion I am not attempting to define the functional integral by its series expansion, rather I am attempting to approximate it by a series. Let me give an example of the difference. Consider the following function $f(\lambda)$:
$f(\lambda)=\int_{-\infty}^{\infty}dx \ e^{-x^{2}-\lambda x^{4}}$
The function $f$ is manifestly non-analytic in $\lambda$ at $\lambda=0$. Indeed if $\lambda<0$ the integral diverges, while if $\lambda \geq 0$ the integral converges. Nevertheless we can still be rash and attempt to define a series expansion of $f$ in powers of $\lambda$ by expanding the exponential and then interchanging the order of summation and integration (illegal to be sure!). We arrive at a formal series:
$s(\lambda)=\sum_{n=0}^{\infty}\frac{\lambda^{n}}{n!}\int_{-\infty}^{\infty}dx \ e^{-x^{2}}(-x^{4})^{n}$
Of course this series diverges. However this expansion was not in vain. $s(\lambda)$ is a basic example of an asymptotic series. For small $\lambda$, truncating the series at a finite order less than about $\frac{1}{\lambda}$ gives an excellent approximation to the function $f(\lambda)$.
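Here is a minimal numerical sketch of that behaviour, using only the Python standard library. It relies on the Gaussian moment $\int_{-\infty}^{\infty}dx\ e^{-x^{2}}x^{4n}=\Gamma(2n+\tfrac{1}{2})$, so the $n$-th term of $s(\lambda)$ is $(-\lambda)^{n}\Gamma(2n+\tfrac{1}{2})/n!$, and successive terms have ratio roughly $4n\lambda$: they shrink until $n$ is of order $1/(4\lambda)$ and grow afterwards. The function names, the integration window and the step count are illustrative choices of mine.

```python
import math

def series_term(lam, n):
    """n-th term of the formal series s(lambda): (-lam)^n / n! * Gamma(2n + 1/2)."""
    return (-lam) ** n / math.factorial(n) * math.gamma(2 * n + 0.5)

def partial_sum(lam, order):
    """Sum of the terms n = 0, ..., order."""
    return sum(series_term(lam, n) for n in range(order + 1))

def f_numeric(lam, half_width=10.0, steps=4000):
    """Trapezoid-rule estimate of f(lambda) on [-half_width, half_width].

    The integrand is negligible outside this window for lam >= 0.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * x - lam * x ** 4)
    return h * total

def smallest_term_order(lam, max_order=30):
    """Order at which the term of the series is smallest in absolute value."""
    return min(range(max_order + 1), key=lambda n: abs(series_term(lam, n)))

lam = 0.1
reference = f_numeric(lam)
for order in range(9):
    # The error first decreases with the order and then grows again.
    print(order, abs(reference - partial_sum(lam, order)))
print("smallest term at order", smallest_term_order(lam))
```

For $\lambda=0.1$ the smallest term sits at a low order, consistent with the estimate $1/(4\lambda)$, and adding further terms only makes the approximation worse.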
Returning to the example of Feynman integrals, the first point is that the perturbation expansion in $\lambda$ is an asymptotic series not a Taylor series. Thus just as for $s(\lambda)$ it is misguided to ask if the series converges...we already know that it does not! A better question is to ask for which actions $S$ this approximation scheme of perturbation theory itself exists. On this issue there is a complete and rigorous answer worked out by mathematical physicists in the late 70s and 80s called renormalization theory. A good reference is the book by Collins "Renormalization." Connes and Kreimer have not added new results here; rather they have given modern proofs of these results using Hopf algebras etc.
Finally I will hopefully answer some of your questions about Chern-Simons theory. The basic point is that Chern-Simons theory is a topological field theory. This means that it suffers from none of the difficulties of usual path integrals. In particular all quantities we want to compute can be reduced to finite dimensional integrals which are of course well defined. Of course since we lack an independent definition of the Feynman integral over the space of connections, the argument demonstrating that it reduces to a finite dimensional integral is purely formal. However we can simply take the finite dimensional integrals as the definition of the theory. A good expository account of this work can be found in the recent paper of Beasley "Localization for Wilson Loops in Chern-Simons Theory."
Overall I would say that by far the most developed approach to studying path integrals rigorously is that of discretization. One approximates spacetime by a lattice of points and the path integral by a regular integral at each lattice site. The hard step is to prove that the limit as the lattice spacing $a$ goes to zero, the so-called continuum limit, exists. This is a very hard analysis problem. Glimm and Jaffe succeeded in using this method to construct the examples I mentioned above, but their arguments appear limited. Schematically, when we take the limit of zero lattice spacing we also need to take a limit of our action; in other words the action should be a function of $a$. We now write $S(a)=Q+\lambda V+H(a,\lambda)$, where as usual $Q$ is quadratic, $V$ is not, and $\lambda$ is a parameter. Our original action is $S=Q +\lambda V$.
The question is then: can we find an $H(a,\lambda)$ such that a suitable $a\rightarrow 0$ limit exists? A priori one could try any $H$; however, the arguments of Glimm and Jaffe are limited to the case where $H$ is polynomial in $\lambda$. Physically this means that the theory is very insensitive to short distance effects; in other words, one could modify the interactions slightly at short distances and one would find essentially the same long distance physics. It seems that new methods are needed to generalize to a larger class of continuum limits.
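To make the continuum limit concrete in the one case where everything is explicit, here is a small sketch for a purely quadratic action ($V=0$, so no $H$ is needed) on a periodic one-dimensional lattice. It assumes the lattice action $S=a\sum_i\left[\tfrac{1}{2}\left(\tfrac{\phi_{i+1}-\phi_i}{a}\right)^{2}+\tfrac{1}{2}m^{2}\phi_i^{2}\right]$ and compares the lattice value of $\langle\phi(0)^{2}\rangle$, obtained by diagonalizing the quadratic form in Fourier space, with the continuum value $\coth(mL/2)/(2m)$ on a circle of length $L$. The parameter names and values are hypothetical, and this says nothing about the interacting case, which is the hard part.

```python
import math

def lattice_phi_squared(num_sites, length=4.0, mass=1.0):
    """<phi(0)^2> for the free lattice action on a periodic lattice.

    The quadratic form has eigenvalues a * ((4/a^2) sin^2(pi k / N) + m^2),
    so its inverse at coincident points is a sum over the N lattice momenta.
    """
    a = length / num_sites  # lattice spacing
    total = 0.0
    for k in range(num_sites):
        lattice_momentum_sq = (4.0 / a ** 2) * math.sin(math.pi * k / num_sites) ** 2
        total += 1.0 / (lattice_momentum_sq + mass ** 2)
    return total / (num_sites * a)

def continuum_phi_squared(length=4.0, mass=1.0):
    """Continuum value on a circle of the same length: coth(m L / 2) / (2 m)."""
    return 1.0 / (2.0 * mass * math.tanh(0.5 * mass * length))

target = continuum_phi_squared()
for n in (10, 50, 400):
    # The difference shrinks as the lattice spacing a = L / n goes to zero.
    print(n, lattice_phi_squared(n), abs(lattice_phi_squared(n) - target))
```

In this free example the error goes to zero like $a^{2}$; for an interacting theory the analogous statement is exactly what has to be proved.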
The most direct way to get a small-time expansion is to use a gradient expansion. This can be worked out in the following way. Rescale your time variable as $\tau=\lambda t$, where $\lambda$ is a parameter taken to be arbitrarily large and introduced to keep track of the order of the expansion. Then take $\phi$ proportional to $\lambda$. You will get
$$\lambda u_\tau=\alpha(x,\frac{\tau}{\lambda},u)u_{xx}+\beta(x,\frac{\tau}{\lambda},u)(u_x)^2+\gamma(x,\frac{\tau}{\lambda},u)u_x+\lambda\phi(x,\frac{\tau}{\lambda},u).$$
Then, expand $u$ as
$$u(x,\tau)=\sum_{n=0}^\infty\frac{1}{\lambda^n}u_n(x,\tau).$$
At the end of the computation just put $\lambda=1$. Your solution will be given as polynomials in $t$.
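As a quick sanity check of the procedure, here is a toy case worked by hand. It assumes $\alpha=1$, $\beta=\gamma=0$, a time-independent source $\phi=\phi(x)$ and initial data $u(x,0)=f(x)$; none of these choices come from the question, they are only for illustration. The rescaled equation is $\lambda u_\tau=u_{xx}+\lambda\phi$, and collecting powers of $\lambda$ gives

$$u_{0\tau}=\phi,\qquad u_{1\tau}=u_{0xx},\qquad u_{2\tau}=u_{1xx},\ \ldots$$

with $u_0(x,0)=f$ and $u_n(x,0)=0$ for $n\geq 1$, so that

$$u_0=f+\tau\phi,\qquad u_1=\tau f''+\frac{\tau^2}{2}\phi'',\qquad u_2=\frac{\tau^2}{2}f''''+\frac{\tau^3}{6}\phi''''.$$

Putting $\lambda=1$ (so that $\tau=t$) and summing reproduces, order by order in $t$, the Taylor series of the solution of $u_t=u_{xx}+\phi$:

$$u=f+t\,(f''+\phi)+\frac{t^2}{2}(f''''+\phi'')+O(t^3).$$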
Now, let us consider your equation
$$tu_t=Tr[\frac{u}{2}AD(\frac{x}{u})\otimes D(\frac{x}{u})]-t^2Tr[\frac{u^3}{8}ADu\otimes Du]+t<\alpha,Du>$$
$$+t Tr[\frac{1}{2}AD^2u]-\frac{u}{2}.$$
We first note that $D(\frac{x}{u})=\frac{e_x}{u}-x\frac{Du}{u^2}$ and so we can write down this equation in the form
$$tu_t=\frac{1}{2u}[A_{xx}+x^2Tr[A Du\otimes Du]]-t^2Tr[\frac{u^3}{8}ADu\otimes Du]+t<\alpha,Du>$$
$$+t Tr[\frac{1}{2}AD^2u]-\frac{u}{2}.$$
I hope I have interpreted your notation correctly. Now, multiply both sides by $u$ and you will get
$$tuu_t=\frac{1}{2}[A_{xx}+x^2Tr[A Du\otimes Du]]-t^2uTr[\frac{u^3}{8}ADu\otimes Du]+t<\alpha,Du>$$
$$+t uTr[\frac{1}{2}AD^2u]-\frac{u^2}{2}.$$
Now, change the variable as $\tau=\lambda t$. You will get the new equation
$$\tau uu_\tau=\frac{1}{2}[A_{xx}+x^2Tr[A Du\otimes Du]]-\frac{1}{\lambda^2}\tau^2uTr[\frac{u^3}{8}ADu\otimes Du]+\frac{1}{\lambda}\tau <\alpha,Du>$$
$$+\frac{1}{\lambda}\tau uTr[\frac{1}{2}AD^2u]-\frac{u^2}{2}.$$
From this you can read off immediately the leading order being
$$\tau u_0u_{0\tau}=\frac{1}{2}[A_{xx}+x^2Tr[A Du_0\otimes Du_0]]-\frac{u_0^2}{2}.$$
This is a Hamilton-Jacobi equation that can be solved by the characteristic method. Higher orders can be obtained straightforwardly as an expansion in $\frac{1}{\lambda}$.
A few words about the next-to-leading order: substitute into the scaled equation
$$u=u_0+\frac{1}{\lambda}u_1+O\left(\frac{1}{\lambda^2}\right)$$
Then, the equation becomes
$$\tau (u_0+\frac{1}{\lambda}u_1)(u_{0\tau}+\frac{1}{\lambda}u_{1\tau})=\frac{1}{2}[A_{xx}+x^2Tr[A D(u_0+\frac{1}{\lambda}u_1)\otimes D(u_0+\frac{1}{\lambda}u_1)]]-$$
$$\frac{1}{\lambda^2}\tau^2(u_0+\frac{1}{\lambda}u_1)Tr[\frac{(u_0+\frac{1}{\lambda}u_1)^3}{8}AD(u_0+\frac{1}{\lambda}u_1)\otimes D(u_0+\frac{1}{\lambda}u_1)]$$
$$+\frac{1}{\lambda}\tau <\alpha,D(u_0+\frac{1}{\lambda}u_1)>$$
$$+\frac{1}{\lambda}\tau (u_0+\frac{1}{\lambda}u_1)Tr[\frac{1}{2}AD^2(u_0+\frac{1}{\lambda}u_1)]-\frac{1}{2}\left(u_0+\frac{1}{\lambda}u_1\right)^2+O\left(\frac{1}{\lambda^2}\right).$$
So, the equation to compute $u_1$ is
$$\tau (u_0u_{1\tau}+u_1u_{0\tau})=x^2Tr[A Du_0\otimes Du_1]$$
$$+\tau <\alpha,Du_0>+\tau u_0Tr[\frac{1}{2}AD^2u_0]-u_0u_1.$$
One can repeat this procedure at any order and, finally, get the solution in the form given at the beginning of this post.
Best Answer
As a physicist "in nature" perhaps I can give a few examples that illustrate how non-analytic functions can appear in physics and counter the idea that physicists do not worry about the justification of these procedures.
Example 1 involves one of the most precise comparisons between experiment and theory known to physics, namely the g factor of the electron. The quantity g is a proportionality factor between the spin of the electron and its magnetic moment. Perturbation theory in QED gives a formula $$g-2= c_1 \alpha + c_2 \alpha^2 + c_3 \alpha^3 + \cdots $$ where the coefficients $c_i$ can be computed from i-loop Feynman diagrams and $\alpha=e^2/\hbar c \simeq 1/137$ is the fine structure constant. Including up to four-loop diagrams gives an expression for $g$ which agrees to one part in $10^{8}$ with experiment. Yet it is known that this perturbative series has zero radius of convergence. This is true quite generally in quantum field theory. Physicists do not ignore this; rather they regard it as evidence that QFTs are not defined by their perturbation series but must also include non-perturbative effects, generally of the form $e^{-c/g^2}$ with $g$ a dimensionless coupling constant (not to be confused with the g factor above). Much effort has gone into understanding these non-perturbative effects in a variety of quantum field theories. Instanton effects in non-Abelian gauge theory are an important example of non-perturbative phenomena.
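To make the point about non-perturbative terms concrete: for $c>0$ the function $e^{-c/g^2}$ of the coupling $g$ vanishes faster than any power as $g\to 0$,

$$\lim_{g\to 0}\frac{e^{-c/g^2}}{g^{n}}=0\quad\text{for every } n\geq 0,$$

so all of its Taylor coefficients at $g=0$ are zero and no order of perturbation theory can see it.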
Example 2 involves the Hydrogen atom in an electric field of magnitude $E$, aka the Stark effect. One can compute the shift in the energy eigenvalues of the Hydrogen atom Hamiltonian due to the applied electric field as a power series in $E$ using perturbation theory and again one finds excellent agreement with experiment. One can also prove that this series has zero radius of convergence. In fact, the Hamiltonian is not bounded from below and does not have any normalizable energy eigenstates. The physics of this situation explains what is going on. The electron can tunnel through the potential barrier and escape from being bound to the nucleus of the Hydrogen atom, but for reasonable size electric fields the lifetime of these states exceeds the age of the universe. The perturbation theory does not converge because there are no energy eigenstates to converge to, but it still provides an excellent approximation to the energy eigenstates measured experimentally because the experiments are done on a time scale which is very short compared to the lifetime of the metastable state.
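For a rough sense of the numbers, here is a sketch that estimates the ground-state lifetime. It assumes the standard weak-field semiclassical ionization rate for the hydrogen ground state, $w=\frac{4}{F}e^{-2/(3F)}$ in atomic units with $F$ the field in units of about $5.14\times10^{11}$ V/m; this formula is not part of the argument above and is brought in only for illustration. It works with logarithms because the lifetime itself overflows floating point, and the constants are rounded.

```python
import math

ATOMIC_FIELD_V_PER_M = 5.142e11  # atomic unit of electric field (rounded)
ATOMIC_TIME_S = 2.419e-17        # atomic unit of time (rounded)
AGE_OF_UNIVERSE_S = 4.35e17      # roughly 13.8 billion years

def log10_lifetime_seconds(field_v_per_m):
    """log10 of the ground-state lifetime, assuming w = (4/F) exp(-2/(3F)).

    F is the field in atomic units; the formula is only meant for F << 1.
    """
    f = field_v_per_m / ATOMIC_FIELD_V_PER_M
    log10_rate_au = math.log10(4.0 / f) - (2.0 / (3.0 * f)) * math.log10(math.e)
    # lifetime = (1 / rate) in atomic time units, converted to seconds
    return -log10_rate_au + math.log10(ATOMIC_TIME_S)

for field in (1e7, 1e9, 1e10):
    print(field, "V/m -> log10(lifetime in s) =", log10_lifetime_seconds(field))
print("log10(age of universe in s) =", math.log10(AGE_OF_UNIVERSE_S))
```

Under this assumption, a strong laboratory field of $10^{7}$ V/m gives a lifetime whose logarithm is in the thousands, while the lifetime only becomes short once the field approaches the atomic scale.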
So I would say that at least in these examples there is a very nice interplay between the physics and the mathematics. The lack of analyticity has a clear physical interpretation and this is something that is understood by physicists. Of course I'm sure there are other examples where such approximations are made without a clear physical justification, but this just means that one should understand the physics better.