Note: All integrals are double integrals from negative infinity to positive infinity. Also, I'm going to use $h(x,t)$ instead of $f(x,t)$ because I want to use $f$ for frequency when I use the Fourier transform. First, write
$$
h(x,t) = \int \delta(x-x')\,\delta(t-t')\,h(x',t')\,dx'dt'
$$
[the integration with the deltas simply "picks" the value of $h(x',t')$ at $(x,t)$]
Now suppose we could find a function $G(x,t)$ such that
$$
T(x,t) = \int G(x-x',t-t')\,h(x',t')\,dx'dt'
$$
solves the heat equation. Now, from the equation we find
$$
\frac{\partial T}{\partial t} - a\,\frac{\partial^2 T}{\partial x^2} =
\int \left\{
\frac{\partial G}{\partial t}(x-x',t-t') - a\,\frac{\partial^2 G}{\partial x^2}(x-x',t-t')
\right\}
\,h(x',t')\,dx'dt'
$$
Equating that to $h(x,t)$, we get
$$
\int \left\{
\frac{\partial G}{\partial t}(x-x',t-t') - a\,\frac{\partial^2 G}{\partial x^2}(x-x',t-t')
- \delta(x-x')\,\delta(t-t')
\right\}
\,h(x',t')\,dx'dt' = 0
$$
from which it follows
$$
\frac{\partial G}{\partial t}(x-x',t-t') = a\,\frac{\partial^2 G}{\partial x^2}(x-x',t-t')
+ \delta(x-x')\,\delta(t-t')
\qquad (1)
$$
This is the equation you have for $G(x,t)$ except that I'm evaluating it at $(x-x',t-t')$. Now, how do we solve for $G(x,t)$ ? Start by writing
$$
\delta(t) = \int e^{+2\pi i\,ft}\,df
$$
$$
\delta(x) = \int e^{-2\pi i\,ux}\,du
$$
$$
G(x,t) = \int e^{-2\pi i\,(ux - ft)}\,\hat{G}(u,f)\,du\,df
$$
In the above, $f$ is the variable associated with $t$ by the FT. It's generally interpreted as a frequency. Note that $ft$ must be dimensionless. Similarly, $u$ is the variable associated with $x$ by the FT and it has to have the dimension of inverse-length. Generally it's interpreted as the inverse of the wavelength ($u = 1/\lambda$). Also, the factors of $2\pi$ can be absorbed into $f$ and $u$ by using $\omega = 2\pi f$ and $k = 2\pi u = 2\pi/\lambda$ but they end up showing up elsewhere and it's hard to keep track of them. Writing the FT this way is simpler in that sense.
Right, so now that we have the above, we can plug $G(x,t)$ into its PDE, and get
$$
\int e^{-2\pi i\,(ux - ft)}
\left\{\left[
2\pi if + a\,(2\pi u)^2
\right]\hat{G}(u,f) - 1
\right\}
du\,df = 0
$$
from which
$$
\hat{G}(u,f) = -\frac{i}{2\pi}\frac{1}{(f - 2\pi i\,au^2)}
$$
And now we can write $G(x-x',t-t')$:
$$
G(x-x',t-t') =
-\frac{i}{2\pi}\int \frac{e^{-2\pi i\,[u(x-x') - f(t-t')]}}{(f - 2\pi i\,au^2)}\,du\,df
$$
Written as it is above, it's too general. In practice, you'd have to apply boundary and initial conditions and compute the actual integral (it's also called a kernel). Then, once you have that, you have the solution to the heat equation:
$$
T(x,t) = \int G(x-x',t-t')\,h(x',t')\,dx'dt'
$$
The beauty of this method is that you now have a solution for any source function $h(x,t)$, so long as the boundary and initial conditions remain the same. The hard work is all built into the kernel. It knows about the PDE and its boundary conditions.
Edit
Based on the comment exchange below, it appears that a different approach is what was expected so I'm going to do that now. The solutions are ultimately equivalent, of course.
I'll pick up from equation (1) above, re-written as
$$
\frac{\partial G}{\partial t}(x,t) = a\,\frac{\partial^2 G}{\partial x^2}(x,t)
+ \delta(x)\,\delta(t)
\qquad (2)
$$
All I'm doing is evaluating (1) at the point $(x,t)$ instead of $(x-x', t-t')$. This is to avoid carrying $(x-x', t-t')$ all over the place. Now, how do we solve for $G(x,t)$ ? Start by writing
$$
\delta(x) = \int e^{-2\pi i\,ux}\,du
$$
$$
G(x,t) = \int e^{-2\pi i\,ux}\,\hat{g}(u,t)\,du
\qquad (3)
$$
This is still the same as before, except that $\hat{g}(u,t)$ and $\delta(t)$ haven't been expanded in a FT in time, which is what I did the first time around. Now we plug these into (2) to get
$$
\int e^{-2\pi i\,ux}\,\frac{\partial\hat{g}}{\partial t}(u,t)\,du =
a\int e^{-2\pi i\,ux}\,(-2\pi i\,u)^2\,\hat{g}(u,t)\,du
+ \delta(t)\int e^{-2\pi i\,ux}\,du
$$
or
$$
\int e^{-2\pi i\,ux} \left\{
\frac{\partial\hat{g}}{\partial t}(u,t) + 4\pi^2a\,u^2\,\hat{g}(u,t)
- \delta(t) \right\}du = 0
$$
or, since the result above has to be valid for all values of $x$,
$$
\frac{\partial \hat{g}}{\partial t}(u,t) + 4\pi^2a\,u^2\,\hat{g}(u,t) = \delta(t)
\qquad (4)
$$
Now, at $t>0$, (4) is just
$$
\frac{\partial \hat{g}}{\partial t}(u,t) + 4\pi^2a\,u^2\,\hat{g}(u,t) = 0
$$
whose solution is
$$
\hat{g}(u,t) = A(u)\exp\left(-4\pi^2a\,u^2t\right)
\qquad (5)
$$
where $A(u)$ is an arbitrary function of $u$ but not a function of $t$. How do we find $A(u)$, though? This is where the $\delta(t)$ comes in. Suppose we integrate (4) in a neighborhood of $t=0$, like so:
$$
\int_{-\varepsilon}^{+\varepsilon} \left(
\frac{\partial \hat{g}}{\partial t}(u,t) + 4\pi^2a\,u^2\,\hat{g}(u,t)
\right)dt = \int_{-\varepsilon}^{+\varepsilon} \delta(t) dt
$$
This gives us
$$
\hat{g}(u,+\varepsilon) - \hat{g}(u,-\varepsilon) + 4\pi^2a\,u^2\int_{-\varepsilon}^{+\varepsilon}\hat{g}(u,t)\,dt = 1
$$
Now, we know that the solution for $t < 0$ should vanish identically because, presumably, the source wasn't active before $t=0$, became active only at $t=0$ (hence the $\delta(t)$), then became inactive again, for $t>0$. Thus,
$$
\hat{g}(u,\varepsilon) + 4\pi^2a\,u^2\int_{-\varepsilon}^{+\varepsilon}\hat{g}(u,t)\,dt = 1
$$
Now we take the limit $\varepsilon \to 0^{+}$, to get
$$
\lim_{\varepsilon \to 0^{+}} \hat{g}(u,\varepsilon) = 1
\qquad (6)
$$
Why does the integral vanish and what does this equation mean? It means that $\hat{g}(u,t)$ has a discontinuity at $t=0$. Why? How can we see that? On physical grounds alone, it's reasonable to expect that the integral appearing above is a continuous function of $\varepsilon$ even if the integrand itself may not be a continuous function of time. Here's an analogous example. A ball is at rest somewhere and then, at $t=0$, someone kicks it really hard. The kick is like a delta function in time, since (ideally) it acts only at $t=0$. That means there is a sudden acceleration and a discontinuous change in the speed of the ball. It goes from being at rest to moving at some constant speed "instantaneously". However, despite the fact that there is a discontinuity in the speed, there is no discontinuity in the ball's position (which is the integral of the speed over time) as a function of time. The ball moves continuously from its location at $t=0$ to all its future locations. The ball doesn't instantaneously appear 10 meters away!
Now, plugging in (5) into (6) we find $A(u) = 1$ so
$$
\hat{g}(u,t) = \exp\left(-4\pi^2a\,u^2t\right)
\qquad (7)
$$
and we have $G(x,t) \equiv 0$ for $t < 0$ and
$$
G(x,t)
= \int e^{-2\pi i\,ux}\,\exp\left(-4\pi^2a\,u^2t\right)\,du
= \int \exp\left(-4\pi^2a\,u^2t - 2\pi i\,ux\right)\,du
$$
for $t>0$.
Next we complete the squares
$$
-4\pi^2a\,u^2t - 2\pi i\,ux = - 4\pi^2a\,t\,(u + B)^2 + C
$$
where $B$ and $C$ are not dependent on $u$. Expanding the RHS and collecting terms, we get
$$
8\pi^2a\,t\,B = 2\pi i\,x
\qquad\mbox{and}\qquad
C = 4\pi^2a\,t\,B^2
$$
so
$$
B = \frac{ix}{4\pi at}
\qquad\mbox{and}\qquad
C = -\frac{x^2}{4at}
$$
Thus,
$$
G(x,t)
= \int e^{- 4\pi^2a\,t\,(u + B)^2 + C}\,du
= e^{-\frac{x^2}{4at}} \int e^{- 4\pi^2a\,t\,(u + B)^2}\,du
$$
for $t>0$. The left-over integral is just a Gaussian integral (recall that all the integrals in my answer are assumed to be from $-\infty$ to $+\infty$, unless otherwise indicated) so
$$
G(x,t)
= e^{-\frac{x^2}{4at}} \sqrt{\frac{\pi}{4\pi^2a\,t}}
= \frac{e^{-\frac{x^2}{4at}}}{\sqrt{4\pi at}}
$$
In summary,
$$
G(x,t) \equiv 0
\qquad t < 0
\qquad\mbox{and}\qquad
G(x,t) = \frac{e^{-\frac{x^2}{4at}}}{\sqrt{4\pi at}}
\qquad t > 0
\qquad (8)
$$
From this point on, we can get back to my previous answer and obtain the general solution of the heat equation for an arbitrary $h(x,t)$, just as before.
Edit
Following a question by the OP, here's the remainder of the solution, in more detail. Using (8) into the solution for $T(x,t)$ from my original answer, we have:
$$
T(x,t)
= \int_{-\infty}^{+\infty} dt' \int_{-\infty}^{+\infty} dx' \,G(x-x',t-t')\,h(x',t')
= \int_{-\infty}^{\,t} dt' \int_{-\infty}^{+\infty} dx' \,\frac{e^{-\frac{(x-x')^2}{4a\,(t-t')}}}{\sqrt{4\pi a\,(t-t')}}\,h(x',t')
$$
Note the change in the upper limit of the time integral. That's because $G(x,t) \equiv 0$ for $t < 0$, that is, $G(x-x',t-t') \equiv 0$ for $t-t' < 0$ or, equivalently, for $t'>t$, so the integral "ends" at $t'=t$. We can take the denominator out from the $x'$ integral since it doesn't depend on $x'$ and write
$$
T(x,t)
= \int_{-\infty}^{\,t} \frac{dt'}{\sqrt{4\pi a\,(t-t')}}
\int_{-\infty}^{+\infty} dx'\,h(x',t')\,e^{-\frac{(x-x')^2}{4a\,(t-t')}}
$$
Best Answer
I'm not aware of a good reference for the formula I gave; here's an explanation instead, (albeit a somewhat hand-waving one).
If integrals like the one you gave are well defined, multiplying by $\delta(x_i)$ where $x_i$ is a coordinate ought be the same as setting those coordinates to zero: $$ \int_{U\subseteq\mathbb{R}^3}f(x_1,x_2,x_3)\delta(x_2)\delta(x_3)dx_1dx_2dx_3=\int_{\{x_1\in\mathbb{R}:(x_1,0,0)\in U\}}f(x_1,0,0)dx_1 $$ and so on for other dimensions and other numbers of deltas. Additionally, we would expect such integrals to still obey the change of variables formula: $$ \int_Vf(\mathbf{x})d\mathbf{x}=\int_Uf(\varphi(\mathbf{y}))|\det(D\varphi(\mathbf{y}))|d\mathbf{y} $$ Where $\varphi:U\to V$ is a diffeomorphism between open subsets of $\mathbb{R}^n$.
If we assume that the zero sets of $g$ and $h$ are transverse (i.e. $\nabla g$ and $\nabla h$ are linearly independent on $g^{-1}(0)\cap h^{-1}(0)$), then this is already enough. Let $\mathbf{p}\in g^{-1}(0)\cap h^{-1}(0)$. By inverse function theorem there exist a set of local coordinates $\mathbf{y}:V\to U$ on a neighborhood $V$ of $\mathbf{p}$ such that $\mathbf{y}(p)=\mathbf{0}$, $y_3=g$, and $y_4=h$. Let $\varphi:U\to V$ be the inverse of these coordinates. Applying the change of variables formula gives $$ \int_Vf(\mathbf{x})\delta(g(\mathbf{x}))\delta(h(\mathbf{x}))d\mathbf{x}=\int_Uf(\varphi(\mathbf{y}))\delta(g(\varphi(\mathbf{y} ))\delta(h(\varphi(\mathbf{y}))|\det(D\varphi(\mathbf{y})|d\mathbf{y} $$ Since $\varphi$ is inverse to $\mathbf{y}$, we can rewrite this as $$ =\int_U\frac{f(\varphi(\mathbf{y}))\delta(y_3)\delta(y_4)}{|\det(D\mathbf{y}(\varphi(\mathbf{y})))|}d\mathbf{y} $$ Now that the deltas are composed with coordinates, we can restrict to a surface integral. $$ =\int_{U\cap\mathbb{R}^2}\frac{f(\varphi(y_1,y_2,0,0))}{|\det(D\mathbf{y}(\varphi(y_1,y_2,0,0)))|}dy_1dy_2 $$ The columns of $D\mathbf{y}$ are just the gradients of the coordinates. $$ =\int_{U\cap\mathbb{R}^2}\frac{f(\varphi(\mathbf{y}))\delta(y_3)\delta(y_4)}{|\det([\nabla y_1,\nabla y_2,\nabla g,\nabla h])|(\varphi(y_1,y_2,0,0))}dy_1dy_2 $$ Let $\pi$ be the orthogonal projection onto $T(g^{-1}(0)\cap h^{-1}(0))$. Since this amounts to subtracting multiples of $\nabla g$ and $\nabla h$, we can apply it to $\nabla y_1$ and $\nabla y_2$ without changing the determinant: $$ =\int_{U\cap\mathbb{R}^2}\frac{f(\varphi(y_1,y_2,0,0))}{|\det([\pi(\nabla y_1),\pi(\nabla y_2),\nabla g,\nabla h])|(\varphi(y_1,y_2,0,0))}dy_1dy_2 $$ To split the determinant, note that $|\det([a,b,c,d])|$ is equal to the volume of the parallelepiped spanned by $a,b,c,d$. If $\operatorname{span}(a,b)\perp\operatorname{span}(c,d)$, then this is equal to the product of areas the parallelograms spanned by $(a,b)$ and $(c,d)$. In terms of wedge products, this gives $$ =\int_{U\cap\mathbb{R}^2}\frac{f(\varphi(y_1,y_2,0,0))}{\|\nabla g\wedge\nabla h\|(\varphi(y_1,y_2,0,0))}\frac{dy_1dy_2}{\|\pi(\nabla y_1)\wedge\pi(\nabla y_2)\|(\varphi(y_1,y_2,0,0))} $$ Where the last term is just the standard area element of $g^{-1}(0)\cap h^{-1}(0)$ in the coordinates $y_1,y_2$. Writing this in a more coordinate-independent way, we (almost) have the desired formula. $$ =\int_{V\cap(g^{-1}(0)\cap h^{-1}(0))}\frac{f}{\|\nabla g\wedge\nabla h\|}dA $$ This result is only local, but you can recover the global version using a partition of unity.