The main distinction you want to make is between the Green function and the kernel. (I prefer the terminology "Green function" without the 's. Imagine a different name, say, Feynman. People would definitely say the Feynman function, not the Feynman's function. But I digress...)
Start with a differential operator, call it $L$. For example, in the case of Laplace's equation, $L$ is the Laplacian $L = \nabla^2$. The Green function of $L$ is then the solution of the inhomogeneous differential equation
$$
L_x G(x, x^\prime) = \delta(x - x^\prime)\,.
$$
We'll talk about its boundary conditions later on. The kernel is a solution of the homogeneous equation
$$
L_x K(x, x^\prime) = 0\,,
$$
subject to a Dirichlet-type boundary condition $\lim_{x \rightarrow x^\prime}K(x,x^\prime) = \delta (x-x^\prime)$, or a Neumann-type boundary condition $\lim_{x \rightarrow x^\prime} \partial_n K(x,x^\prime) = \delta(x-x^\prime)$, where $\partial_n$ denotes the derivative normal to the boundary.
So, how do we use them? The Green function solves linear differential equations with driving terms. $L_x u(x) = \rho(x)$ is solved by
$$
u(x) = \int G(x,x^\prime)\rho(x^\prime)dx^\prime\,.
$$
Whichever boundary conditions we want to impose on the solution $u$ determine the boundary conditions we impose on $G$. For example, a retarded Green function propagates influence strictly forward in time, so that $G(x,x^\prime) = 0$ whenever $x^0 < x^{\prime\,0}$. (The 0 here denotes the time coordinate.) One would use this if the boundary condition on $u$ were that $u(x) = 0$ far in the past, before the source term $\rho$ "turns on."
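As a concrete numerical illustration of this formula (my own sketch, not part of the discussion above): in one dimension the operator $L = d^2/dx^2$ has the free-space Green function $G(x,x^\prime) = \tfrac{1}{2}|x - x^\prime|$, and convolving it with a source $\rho$ should give a function whose second derivative is $\rho$. The Gaussian source below is an arbitrary choice:

```python
import numpy as np

# Free-space Green function of L = d^2/dx^2 in one dimension:
#   G(x, x') = |x - x'| / 2, since (d/dx)^2 (|x|/2) = delta(x).
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]
rho = np.exp(-x**2)                          # arbitrary smooth source term

G = 0.5 * np.abs(x[:, None] - x[None, :])    # matrix of G(x_i, x_j)
u = G @ rho * dx                             # u(x) = ∫ G(x, x') rho(x') dx'

# Check L u = rho by finite differences, away from the endpoints.
u_xx = np.gradient(np.gradient(u, dx), dx)
err = np.max(np.abs(u_xx[100:-100] - rho[100:-100]))
print(err)  # small compared to max(rho) = 1
```

The endpoints are excluded from the comparison because `np.gradient` falls back to one-sided differences there.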
The kernel solves boundary value problems. Say we're solving the equation $L_x u(x) = 0$ on a manifold $M$, and specify $u$ on the boundary $\partial M$ to be $v$. Then,
$$
u(x) = \int_{\partial M} K(x,x^\prime)v(x^\prime)dx^\prime\,.
$$
In this case, we're using the kernel with Dirichlet boundary conditions.
For example, the heat kernel is the kernel of the heat equation, in which
$$
L = \frac{\partial}{\partial t} - \nabla_{R^d}^2\,.
$$
We can see that
$$
K(x,t; x^\prime, t^\prime) = \frac{1}{[4\pi (t-t^\prime)]^{d/2}}\,e^{-|x-x^\prime|^2/4(t-t^\prime)},
$$
solves $L_{x,t} K(x,t;x^\prime,t^\prime) = 0$ and moreover satisfies
$$
\lim_{t \rightarrow t^\prime} \, K(x,t;x^\prime,t^\prime) = \delta^{(d)}(x-x^\prime)\,.
$$
(We must be careful to consider only $t > t^\prime$ and hence also take a directional limit.) Say you're given some shape $v(x)$ at time $0$ and want to "melt" it according to the heat equation. Then later on, this shape has become
$$
u(x,t) = \int_{R^d} K(x,t;x^\prime,0)v(x^\prime)d^dx^\prime\,.
$$
So in this case, the boundary was the time-slice at $t^\prime = 0$.
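Here is a small numerical check of this in Python (a sketch with $d = 1$ and arbitrarily chosen times, not from the text above). It uses the semigroup property: taking the kernel at time $t_0$ as the initial shape and evolving for an additional time $t$ must give the kernel at time $t_0 + t$:

```python
import numpy as np

def heat_kernel(x, t):
    # d = 1 heat kernel K(x, t; 0, 0) solving u_t = u_xx
    return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
t0, t = 1.0, 2.0

v = heat_kernel(x, t0)                        # the "shape" at time t0
K = heat_kernel(x[:, None] - x[None, :], t)   # K(x, t0 + t; x', t0)
u = K @ v * dx                                # u(x) = ∫ K(x,t;x',0) v(x') dx'

# Semigroup check: the melted shape is the kernel at time t0 + t.
err = np.max(np.abs(u - heat_kernel(x, t0 + t)))
print(err)
```

The quadrature is just a rectangle rule, which is plenty accurate here because the integrand is a smooth Gaussian that vanishes at the ends of the grid.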
Now for the rest of them. Propagator is sometimes used to mean Green function, sometimes used to mean kernel. The Klein-Gordon propagator is a Green function, because it satisfies $L_x D(x,x^\prime) = \delta(x-x^\prime)$ for $L_x = \partial_x^2 + m^2$. The boundary conditions specify the difference between the retarded, advanced and Feynman propagators. (See? Not Feynman's propagator.) In the case of a Klein-Gordon field, the retarded propagator is defined as
$$
D_R(x,x^\prime) = \Theta(x^0 - x^{\prime\,0})\,\langle0| [\varphi(x), \varphi(x^\prime)] |0\rangle\,
$$
where $\Theta(x) = 1$ for $x > 0$ and $= 0$ otherwise. The Wightman function is defined as
$$
W(x,x^\prime) = \langle0| \varphi(x) \varphi(x^\prime) |0\rangle\,,
$$
i.e. without the time-ordering constraint. But guess what? It solves $L_x W(x,x^\prime) = 0$: it's a kernel. So is the commutator expectation $\langle0| [\varphi(x), \varphi(x^\prime)] |0\rangle = W(x,x^\prime) - W(x^\prime,x)$. The difference is the $\Theta$ out front, which produces a Dirac $\delta$ upon taking one time derivative. For a kernel $K$ satisfying Neumann boundary conditions on a time-slice boundary, the relationship
$$
G_R(x,x^\prime) = \Theta(x^0 - x^{\prime\,0}) K(x,x^\prime)
$$
is general.
In quantum mechanics, the evolution operator
$$
U(x,t; x^\prime, t^\prime) = \langle x | e^{-i (t-t^\prime) \hat{H}} | x^\prime \rangle
$$
is a kernel: it solves the Schrödinger equation and reduces to $\delta(x - x^\prime)$ at $t = t^\prime$. People sometimes call it the propagator. It can also be written in path-integral form.
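Both kernel properties of $U$ can be checked numerically in a finite-dimensional sketch (a made-up two-level Hamiltonian, with $\hbar = 1$): $U$ reduces to the identity, the matrix analogue of $\delta(x - x^\prime)$, at $t = t^\prime$, and it satisfies the Schrödinger equation $i\,\partial_t U = \hat H U$:

```python
import numpy as np

# Hypothetical two-level Hamiltonian (hbar = 1); any Hermitian matrix works.
H = np.array([[1.0, 0.3],
              [0.3, 2.0]])

def U(t):
    # U(t) = exp(-i t H), computed via the eigendecomposition of H
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * t * w)) @ v.conj().T

# At t = t', U is the identity: the matrix analogue of delta(x - x').
print(np.allclose(U(0.0), np.eye(2)))        # True

# U solves i dU/dt = H U (checked with a central finite difference).
t, eps = 0.7, 1e-6
dU = (U(t + eps) - U(t - eps)) / (2.0 * eps)
err = np.max(np.abs(1j * dU - H @ U(t)))
print(err)
```

Diagonalizing the Hermitian $H$ once is a simple way to exponentiate it exactly, up to floating-point error, without any series truncation.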
Linear response and impulse response functions are Green functions.
These are all two-point correlation functions. "Two-point" because they're all functions of two points in space(time). In quantum field theory, statistical field theory, etc. one can also consider correlation functions with more field insertions/random variables. That's where the real work begins!
Looking at it, I would say it is the energy change caused by turning on positional correlations.
Let's say there aren't any spatial correlations. Then the no-correlation energy is $E_{n.c.} = \frac{1}{2} \int n V(r) dV$, where the quantities are defined as in your question.
Now let's suppose we put an electron at the origin and turn on correlations. The density of the other electrons will no longer be uniform, because of attraction to and repulsion from the electron at the origin. But by symmetry, we expect the electron density to depend only on the distance from this electron we have put in. One can express this radial dependence with a function $g(r)$ which has the following properties: $g(r) \to 1$ as $r \to \infty$, and $g(r)$ is proportional to the electron density. Another way of saying this is that $g(r)$ is the electron density perturbed by correlations, normalized by the unperturbed density (since the perturbed and unperturbed densities should be equal at infinity, where no one knows about the electron at the origin). With this in place, the correlated electron density is $\rho_c(r) = n g(r)$. The energy of this electron configuration is $E_c = \frac{1}{2} \int \rho_c(r) V(r) dV = \frac{1}{2} \int n g(r) V(r) dV$.
Now let's find the energy from turning on positional correlations. After turning on correlations, there is an energy $E_c$, but before turning on correlations, there was an energy $E_{n.c.}$. Thus there is an energy difference $E_c - E_{n.c.} = \frac{1}{2} \int n g(r) V(r) dV - \frac{1}{2} \int n V(r) dV = \frac{n}{2} \int (g(r) - 1) V(r) dV$. This is just a guess though. Tell me if it makes sense.
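To see that the final formula gives something finite and sensible, here is a quick numerical sketch (my own toy model, not from the question): take $V(r) = 1/r$ in Gaussian-type units and a hypothetical correlation-hole form $g(r) = 1 - e^{-r/\lambda}$, for which the integral can also be done by hand, giving $\Delta E = -2\pi n \lambda^2$:

```python
import numpy as np

# Toy model (assumed, for illustration): V(r) = 1/r, g(r) = 1 - exp(-r/lam),
# so correlations deplete the density near the chosen electron.
n, lam = 0.5, 1.2

r = np.linspace(1e-6, 50.0 * lam, 200001)
dr = r[1] - r[0]
integrand = -np.exp(-r / lam) / r * 4.0 * np.pi * r**2  # (g - 1) V, with dV = 4 pi r^2 dr

dE = 0.5 * n * np.sum(integrand) * dr

# Closed-form value of (n/2) ∫ (g - 1) V dV for this model: -2 pi n lam^2
print(dE, -2.0 * np.pi * n * lam**2)
```

Note that the bare $1/r$ divergence at the origin is tamed by the volume element, and the exponential decay of $g(r) - 1$ makes the large-$r$ part converge: the correlation energy is negative, as expected for a correlation hole.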
The pair correlation function $g(\mathbf r_1, \mathbf r_2)$ is defined as
$$g(\mathbf{r}_1, \mathbf{r}_2) = \frac{\rho^{(2)}(\mathbf{r}_1,\mathbf{r}_2)}{\rho^{(1)}(\mathbf{r}_1) \rho^{(1)}(\mathbf{r}_2)}$$
where
$$\rho^{(n)} (\mathbf r_1, \dots, \mathbf r_n) = \frac{N!}{(N-n)!} \frac 1 Z \int e^{-\beta V} d \mathbf r^{(N-n)}$$
If the system is homogeneous,
$$\rho^{(1)} (\mathbf r) = \rho \ \ \ \ \text{(bulk density)}$$
so that
$$g(\mathbf{r}_1, \mathbf{r}_2) = \frac{\rho^{(2)}(\mathbf{r}_1,\mathbf{r}_2)}{\rho^2}$$
and if the system is also isotropic,
$$g(\mathbf r_1, \mathbf r_2) = g(\mid \mathbf r_1-\mathbf r_2\mid) = g(r)$$
So we can interpret $g(r)$ as the relative probability of finding a particle at distance $r$ from a chosen particle, and $g(r) \rho \, d \mathbf r$ as the average number of particles in a volume $d \mathbf r$ at that distance.
Now, it can be shown that
$$\rho g(\mathbf r) = \frac 1 N \langle \sum_{i\neq j} \delta (\mathbf r + \mathbf r_i - \mathbf r_j) \rangle $$
The structure factor is defined as
$$S(\mathbf k) = \frac 1 N \langle \sum_{i,j} e^{-i \mathbf k \cdot (\mathbf r_i - \mathbf r_j)} \rangle$$
so that you have
$$S(\mathbf k) = 1+\frac 1 N \langle \sum_{i\neq j} \int d \mathbf r e^{-i \mathbf k \cdot \mathbf r} \delta (\mathbf r + \mathbf r_i - \mathbf r_j) \rangle = 1+\rho \int d \mathbf r e^{-i \mathbf k \cdot \mathbf r} g(r) = 1 + \rho \tilde g(\mathbf k)$$
where $\tilde {(.)}$ is the Fourier transform.
$h$ is defined as
$$h(r)=g(r)-1$$
So that
$$S(\mathbf k) = 1 + \rho \tilde h(\mathbf k) + (2 \pi)^3 \rho \delta(\mathbf k)$$
which for $\mathbf k \neq \mathbf 0$ becomes
$$S(\mathbf k) = 1 + \rho \tilde h(\mathbf k)$$
For a more complete exposition, I would suggest Theory of Simple Liquids by Hansen and McDonald.
The confusion arises from the fact that, while $h(r)$ is a dimensionless quantity, its Fourier transform $\tilde h(\mathbf k)$ is not: it has the dimension of a volume.
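A small numerical sketch (my own example, with an assumed exponential form for $h$) makes both points concrete: for an isotropic system the 3D transform reduces to $\tilde h(k) = \frac{4\pi}{k}\int_0^\infty r\,h(r)\sin(kr)\,dr$, which visibly carries dimensions of (length)$^3$, and for $h(r) = -e^{-r/\lambda}$ it evaluates in closed form to $-8\pi\lambda^3/(1+k^2\lambda^2)^2$:

```python
import numpy as np

rho, lam = 0.8, 1.0          # assumed bulk density and decay length
r = np.linspace(1e-8, 50.0 * lam, 200001)
dr = r[1] - r[0]
h = -np.exp(-r / lam)        # model h(r) = g(r) - 1

def S(k):
    # Isotropic 3D Fourier transform: h~(k) = (4 pi / k) ∫ r h(r) sin(kr) dr
    h_k = (4.0 * np.pi / k) * np.sum(r * h * np.sin(k * r)) * dr
    return 1.0 + rho * h_k   # S(k) = 1 + rho h~(k), valid for k != 0

k = 2.0
exact = 1.0 - rho * 8.0 * np.pi * lam**3 / (1.0 + (k * lam)**2)**2
print(S(k), exact)           # the two values agree
```

Since $h(r)$ is dimensionless, every factor of volume in $\tilde h$ comes from the measure $r^2\,dr$ (folded here into the $4\pi r/k \cdot \sin(kr)$ weight), which is exactly the point made above.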