In this answer, we take the point of view that the GEM equations are not a first principle by themselves but can only be justified via an appropriate limit (to be determined) of the linearized EFE$^1$ in 3+1D
$$ \begin{align}
\kappa T^{\mu\nu}~\stackrel{\text{EFE}}{=}~& G^{\mu\nu}\cr
~=~&-\frac{1}{2}\left(\Box \bar{h}^{\mu\nu}
+ \eta^{\mu\nu} \partial_{\rho}\partial_{\sigma} \bar{h}^{\rho\sigma}
- \partial^{\mu}\partial_{\rho} \bar{h}^{\rho\nu}
- \partial^{\nu}\partial_{\rho} \bar{h}^{\rho\mu} \right)
,\cr
\kappa~\equiv~&\frac{8\pi G}{c^4},
\end{align}\tag{1}$$
where
$$ \begin{align}g_{\mu\nu}~&=~\eta_{\mu\nu}+h_{\mu\nu}, \cr
\bar{h}_{\mu\nu}~:=~h_{\mu\nu}-\frac{1}{2}\eta_{\mu\nu}h
\qquad&\Leftrightarrow\qquad
h_{\mu\nu}~=~\bar{h}_{\mu\nu}-\frac{1}{2}\eta_{\mu\nu}\bar{h}. \end{align} \tag{2}$$
There may be other approaches that we are unaware of, but reading Ref. 1, the pertinent GEM limit seems to be of E&M static nature, thereby seemingly excluding gravitational waves/radiation.
Concretely, the matter is assumed to be dust:$^2$
$$ T^{\mu 0}~=~cj^{\mu}, \qquad j^{\mu}~=~\begin{bmatrix} c\rho \cr {\bf J} \end{bmatrix}, \qquad T^{ij}~=~{\cal O}(c^0). \tag{3}$$
The only way to systematically implement a dominanting temporal sector/static limit seems to be by going to the Lorenz gauge$^3$
$$\partial_{\mu} \bar{h}^{\mu\nu} ~=~0. \tag{4}$$
Then the linearized EFE (1) simplifies to
$$ G^{\mu\nu}~=~-\frac{1}{2}\Box \bar{h}^{\mu\nu}~=~\kappa T^{\mu\nu}. \tag{5}$$
In our convention, the GEM ansatz reads$^5$
$$\begin{align} A^{\mu}~=~&\begin{bmatrix} \phi/c \cr {\bf A} \end{bmatrix}, \qquad\bar{h}^{ij}~=~{\cal O}(c^{-4}),\cr
-\frac{1}{4}\bar{h}^{\mu\nu} ~=~&\begin{bmatrix} \phi/c^2 & {\bf A}^T /c\cr {\bf A}/c & {\cal O}(c^{-4})\end{bmatrix}_{4\times 4}\cr
~\Updownarrow~& \cr
-h^{\mu\nu} ~=~&\begin{bmatrix} 2\phi/c^2 & 4{\bf A}^T/c \cr 4{\bf A}/c & (2\phi/c^2){\bf 1}_{3\times 3}\end{bmatrix}_{4\times 4} \cr
~\Updownarrow~& \cr
g_{\mu\nu} ~=~&\begin{bmatrix} -1-2\phi/c^2 & 4{\bf A}^T/c \cr 4{\bf A}/c & (1-2\phi/c^2){\bf 1}_{3\times 3}\end{bmatrix}_{4\times 4}. \end{align}\tag{6}$$
The gravitational Lorenz gauge (4) corresponds to the Lorenz gauge condition
$$ c^{-2}\partial_t\phi + \nabla\cdot {\bf A}~\equiv~ \partial_{\mu}A^{\mu}~=~0 \tag{7}$$
and the "electrostatic limit"$^4$
$$ \partial_t {\bf A}~=~{\cal O}(c^{-2}).\tag{8}$$
Next define the field strength
$$\begin{align}
F_{\mu\nu}~:=~&\partial_{\mu} A_{\nu} - \partial_{\nu} A_{\mu}, \cr
-{\bf E}~:=~&{\bf \nabla} \phi+\partial_t{\bf A}, \cr
{\bf B}~:=~&{\bf \nabla}\times {\bf A}.\end{align} \tag{9} $$
Then the tempotemporal & the spatiotemporal sectors of the linearized EFE (1) become the gravitational Maxwell equations with sources
$$ \partial_{\mu} F^{\mu\nu}~=~\frac{4\pi G}{c}j^{\mu}. \tag{10} $$
Note that the gravitational (electric) field ${\bf E}$ should be inwards (outwards) for a positive mass (charge), respectively. For this reason, in this answer/Wikipedia, the GEM equations (10) and the Maxwell equations have opposite$^5$ signs.
Interestingly, a gravitational gauge transformation of the form
$$\begin{align}\delta h_{\mu\nu}~=~&\partial_{\mu}\varepsilon_{\nu}+(\mu\leftrightarrow\nu), \cr
\varepsilon_{\nu}~:=~&c^{-1}\delta^0_{\nu}~\varepsilon, \end{align}\tag{11} $$ leads to
$$\delta h~=~-2c^{-1}\partial_0\varepsilon \tag{12}$$
and thereby to the usual gauge transformations
$$\delta A_{\mu}~=~\partial_{\mu}\varepsilon.\tag{13}$$
Such gauge transformations (13) preserve the GEM eqs. (10) but violate the GEM ansatz $\bar{h}^{ij}={\cal O}(c^{-4})$ unless
$$\partial_t\varepsilon~=~{\cal O}(c^{-2}).\tag{14}$$
In conclusion, the Lorenz gauge condition (7) is not necessary, but we seem to be stuck with the "electrostatic limit" (8).
References:
- B. Mashhoon, Gravitoelectromagnetism: A Brief Review, arXiv:gr-qc/0311030.
--
$^1$ In this answer we use Minkowski sign convention $(-,+,+,+)$ and work in the SI-system. Space-indices $i,j,\ldots \in\{1,2,3\}$ are Roman letters, while spacetime indices $\mu,\nu,\ldots \in\{0,1,2,3\}$ are Greek letters.
$^2$ Warning: The $j^{\mu}$ current (3) does not transform covariantly under Lorentz boosts. The non-inertial frames that Wikipedia mentions are presumably because the $g_{\mu\nu}$-metric (2) is non-Minkowskian.
$^3$ The Lorenz gauge (4) is the linearized de Donder/harmonic gauge
$$ \partial_{\mu}(\sqrt{|g|} g^{\mu\nu})~=~0.\tag{15}$$
$^4$ We unconventionally call eq. (8) the "electrostatic limit" since the term $\partial_t{\bf A}$ enters the definition (9) of ${\bf E}$.
$^5$ Warning: In Mashhoon (Ref. 1) the GEM equations (10) and the Maxwell equations have the same sign. For comparison, in this Phys.SE answer
$$\phi~=~-\phi^{\text{Mashhoon}}, \qquad
{\bf E}~=~-{\bf E}^{\text{Mashhoon}}, $$
$${\bf A}~=~-\frac{1}{2c}{\bf A}^{\text{Mashhoon}}, \qquad
{\bf B}~=~-\frac{1}{2c}{\bf B}^{\text{Mashhoon}}.\tag{16}$$
The Newtonian vacuum field equation $\nabla^2 \phi = \rho$ where $\phi$ is the gravitational potential and $\rho$ is proportional to mass density also has non-trivial vacuum solutions, for example $\phi = -1/r$ for $r$ outside some spherical surface. The Maxwell equations also have non-trivial solutions. In electrostatics, precisely the same as classical gravitation, in elctrodynamics, also radiative solutions of various kinds.
It is not strange that a field theory has non-trivial vacuum solutions. From a mathematical point of view, if it did not, it would not be possible to solve boundary value problems otherwise. Physically, a (local) field theory is supposed to provide a way for spatially separated matter to interact without spooky action at a distance. If interactions were unable to propagate through a region of vacuum we would have a very boring field theory!
If we want to be a little more specific to general relativity, let us note that this theory actually consists of two field equations. The most famous one is Einstein's, $$
R_{\mu\nu} = 8\pi T_{\mu\nu}
$$
which says that matter is the source for the field $\Gamma^{\mu}{}_{\nu\sigma}$ -- the Christoffel symbols. This equation alone does not contain the fundamental characterization of general relativity. It is just an equation for some field. For this field to actually correspond to the curvature it must also satisfy the Bianchi identity $$
R_{\mu\nu[\sigma\tau;\rho]} = 0.
$$
The Bianchi identity is redundant if the Christoffel symbols are defined the way they are in terms of the metric. This is actually analogous with electrodynamics (and for a very good reason, because ED is also a theory of curvature). The Maxwell equations are $$F^{\mu\nu}{}_{,\nu} = j^\mu $$ $$F_{[\mu\nu,\sigma]} = 0$$
and the first equation is the one that couples the electromagnetic field to matter. The second equation is redundant if $F_{\mu\nu}$ is defined in terms of the vector potential.
Now, the electromagnetic field has 6 components but as you can see only 4 of them really couple to matter directly. The second equation represents the freedom for the electromagnetic field to propagate in vacuum. (In fact if you do Fourier analysis to find radiation solutions to the Maxwell equations the first only tells you that radiation is transverse, and the second is the one that actually determines the radiation.) The components are naturally not independent since matter and radiation interact, but I think that this is a nice way to think about why of the classical Maxwell equations $$\begin{matrix}
\nabla \cdot \mathbf{E} = & \sigma\\
\nabla \cdot \mathbf{B} = & 0 \\
\nabla \times \mathbf{E} = & -\frac{\partial \mathbf B}{\partial t} \\
\nabla \times \mathbf{B} = & \mathbf{j} + \frac{\partial \mathbf E}{\partial t}
\end{matrix}$$
two involve only the fields and two involve matter.
Similarly for Einstein's general relativity, in the Einstein field equation $$R_{\mu\nu} = 8\pi T_{\mu\nu}$$ matter only couples to 10 components out of the 20 components in the Riemann curvature tensor. (The Riemann tensor is the physically observable quantity in general relativity.) The other 10 components are in the Weyl tensor. They are the part of the gravitational field that is present in vacuum, so they must include at least the Newtonian potential. By analogy with electrodynamics they also include gravitational radiation.
In the specific case of the Schwarschild and Kerr metrics, not only are all the components of the Ricci tensor 0, one can in fact arrange for all the components of the Weyl tensor except one to be 0 also. This is sort of analogous to how in electrostatics you can always choose the gauge so that the vector potential $\mathbf A = 0$. Perhaps you can think of this as saying that these metrics do not radiate, so only the part of the gravitational field whose limit is the Newtonian potential exists. (But there are radiating metrics with the same property, so maybe this isn't a good way to think.).
There are other vacuum metrics where fewer of the Weyl tensor's components can be made 0, or some gauge freedom remains. It is common to classify metrics along this scheme, which is called Petrov type. In a really famous paper Newman and Penrose show that the Petrov type of gravitational radiation has a near field - transition zone - radiation zone behavior, where more components of the Weyl tensor become irrelevant the further away from the source you go. (This is analogous with electrodynamics again, since in the radiation zone the EM field is transverse, but in the near field it is not.)
Best Answer
Einstein's equations can be loosely summarized as the main relation between matter and the geometry of spacetime. I will try to give a qualitative description what every term in the equation signifies. I will, however, have to warn potential readers that this will not be a short answer. Furthermore, I will refrain from trying to derive the equations in "elementary" manner, as I certainly don't know of any.
Matter
On the right hand side of the equation, the most important thing is the appearance of the energy-momentum tensor $T_{\mu\nu}$. It encodes exactly how the matter---understood in a broad sense, i.e. any energy (or mass or momentum or pressure) carrying medium---is distributed in the universe. For understanding how to interpret the subscript indices of the $T$, see my explanation of the metric tensor below.
It is multiplied by some fundamental constants of nature $\Big($the factor $\frac{8\pi G}{c^4}\Big)$ but this isn't of any crucial importance: One can view them as book-keeping tools that keep track of the units of the quantities that are related by the equation. In fact, professional physicists typically take the liberty to redefine our units of measurements in order to simplify the look of our expressions by getting rid of pesky constants such as this. One particular option would be to choose "reduced Planck units", in which $8\pi G=1$ and $c=1$, so that the factor becomes $1$.
Differential geometry
On the left hand side of Einstein's equations, we find a few different terms, which together describe the geometry of spacetime. General relativity is a theory which uses the mathematical framework known as (semi-)Riemannian geometry. In this branch of mathematics, one studies spaces which are in a certain sense smooth, and that are equipped with a metric. Let us first try to understand what these two things mean.
The smoothness property can be illustrated by the intuitive (and historically important!) example of a smooth (two-dimensional) surface in ordinary three-dimensional space. Imagine, for instance, the surface of an idealized football, i.e. a 2-sphere. Now, if one focuses ones attention to a very small patch of the surface (hold the ball up to your own face), it seems like the ball is pretty much flat. However, it is obviously not globally flat. Without regards for mathematical rigor, we can say that spaces that have this property of appearing locally flat are smooth in some sense. Mathematically, one calls them manifolds. Of course, a globally flat surface such as an infinite sheet of paper is the simplest example of such a space.
In Riemannian geometry (and differential geometry more generally) one studies such smooth spaces (manifolds) of arbitrary dimension. One important thing to realize is that they can be studied without imagining them to be embedded in a higher-dimensional space, i.e. without the visualization we were able to use with the football, or any other reference to what may or may not be "outside" the space itself. One says that one can study them, and their geometry, intrinsically.
The metric
When it comes to intrinsically studying the geometry of manifolds, the main object of study is the metric (tensor). Physicists typically denote it by $g_{\mu\nu}$. In some sense, it endows us with a notion of distance on the manifold. Consider a two-dimensional manifold with metric, and put a "coordinate grid" on it, i.e. assign to each point a set of two numbers, $(x,y)$. Then, the metric can be viewed as a $2\times 2$ matrix with $2^2=4$ entries. These entries are labeled by the subscripts $\mu,\nu$, which can each be picked to equal $x$ or $y$. The metric can then be understood as simply an array of numbers:
$$\begin{pmatrix} g_{xx}&g_{xy}\\ g_{yx}&g_{yy}\end{pmatrix}$$
We should also say that the metric is defined such that $g_{\mu\nu}=g_{\nu\mu}$, i.e. it is symmetric with respect to its indices. This implies that, in our example, $g_{xy}=g_{yx}$. Now, consider two points that are nearby, such that the difference in coordinates between the two is $(\mathrm{d}x,\mathrm{d}y)\;.$ We can denote this in shorthand notation as $\mathrm{d}l^\mu$ where $\mu$ is either $x$ or $y\;,$ and $\mathrm{d}l^x=\mathrm{d}x$ and $\mathrm{d}l^y=\mathrm{d}y\;.$ Then we define the square of the distance between the two points, called $\mathrm{d}s\;,$ as
$$\mathrm{d}s^2= g_{xx}\mathrm{d}x^2+g_{yy} \mathrm{d}y^2 + 2 g_{xy}\mathrm{d}x \mathrm{d}y= \sum_{\mu,\nu\in\{x,y\}}g_{\mu\nu}\mathrm{d}l^\mu \mathrm{d}l^\nu$$
To get some idea of how this works in practice, let's look at an infinite two-dimensional flat space (i.e. the above-mentioned sheet of paper), with two "standard" plane coordinates $x,y$ defined on it by a square grid. Then, we all know from Pythagoras' theorem that
$$\mathrm{d}s^2=\mathrm{d}x^2+\mathrm{d}y^2= \sum_{\mu,\nu\in\{x,y\}}g_{\mu\nu}\mathrm{d}l^\mu \mathrm{d}l^\nu$$
This shows that, in this case, the natural metric on flat two-dimensional space is given by
$$g_{\mu\nu}=\begin{pmatrix} g_{xx}&g_{xy}\\ g_{xy}&g_{yy}\end{pmatrix}= \begin{pmatrix} 1&0\\0&1\end{pmatrix}$$
Now that we known how to "measure" distances between nearby points, we can use a typical technique from basic physics and integrate small segments to obtain the distance between points that are further removed:
$$L=\int \mathrm{d}s= \int \sqrt{\sum_{\mu,\nu\in\{x,y\}}g_{\mu\nu}\mathrm{d}l^\mu \mathrm{d}l^\nu}$$
The generalization to higher dimensions is straightforward.
Curvature tensors
As I tried to argue in the above, the metric tensor defines the geometry of our manifold (or spacetime, in the physical case). In particular, we should be able to extract all the relevant information about the curvature of the manifold from it. This is done by constructing the Riemann (curvature) tensor $R^{\mu}_{\ \ \ \nu\rho\sigma}$, which is a very complicated object that may, in analogy with the array visualization of the metric, be regarded as a four-dimensional array, with each index being able to take on $N$ values if there are $N$ coordinates $\{x^1,\dots x^N\}$ on the manifold (i.e. if we're dealing with an $N$-dimensional space). It is defined purely in terms of the metric in a complicated way that is not all too important for now. This tensor holds pretty much all the information about the curvature of the manifold---and much more than us physicists are typically interested in. However, sometimes it is useful to take a good look at the Riemann tensor if one really wants to know what's going on. For instance, an everywhere vanishing Riemann tensor ($R^\mu_{\ \ \ \nu\rho\sigma}=0$) guarantees that the spacetime is flat. One famous case where such a thing is useful is in the Schwarzschild metric describing a black hole, which seems to be singular at the Schwarzschild radius $r=r_s\neq 0$. Upon inspection of the Riemann tensor, it becomes apparent that the curvature is actually finite here, so one is dealing with a coordinate singularity rather than a "real" gravitational singularity.
By taking certain "parts of" the Riemann tensor, we can discard some of the information it contains in return for having to only deal with a simpler object, the Ricci tensor:
$$ R_{\nu\sigma}:=\sum_{\mu\in \{x^1,\dots x^N\}} R^\mu_{\ \ \ \nu\mu\sigma}$$
This is one of the tensors that appears in the Einstein field equations. the second term of the equations features the Ricci scalar $R$, which is defined by once again contracting (a fancy word for "summing over all possible index values of some indices") the Ricci tensor, this time with the inverse metric $g^{\mu\nu}$ which can be constructed from the usual metric by the equation
$$\sum_{\nu\in\{x^1,\dots,x^N\}}g^{\mu\nu}g_{\nu\rho}= 1\ \text{if }\mu=\rho\ \text{and }0\ \text{otherwise}$$
As promised, the Ricci scalar is the contraction of the Ricci tensor and inverse metric:
$$R:=\sum_{\mu,\nu\in\{x^1,\dots x^N\}}g^{\mu\nu}R_{\mu\nu} $$
Of course, the Ricci scalar once again contains less information than the Ricci tensor, but it's even easier to handle. Simply multiplying it by $g_{\mu\nu}$ once again results in a two-dimensional array, just like $R_{\mu\nu}$ and $T_{\mu\nu}$ are. The particular combination of curvature tensors that appears in the Einstein field equations is known as the Einstein tensor
$$ G_{\mu\nu}:=R_{\mu\nu}-\frac{1}{2}R g_{\mu\nu} $$
The cosmological constant
There is one term that we have left out so far: The cosmological constant term $\Lambda g_{\mu\nu}$. As the name suggests, $\Lambda$ is simply a constant which multiplies the metric. This term is sometimes put on the other side of the equation, as $\Lambda$ can be seen as some kind of "energy content" of the universe, which may be more appropriately grouped with the rest of the matter that is codified by $T_{\mu\nu}$.
The cosmological constant is mainly of interest because it provides a possible explanation for the (in)famous dark energy that seems to account for certain important cosmological observations. Whether or not the cosmological constant is really non-zero in our universe is an open issue, as is explaining the value observations suggest for it (the so-called cosmological constant problem a.k.a. "the worst prediction of theoretical physics ever made", one of my personal interests).
PS. As pointed out in the comments, if you enjoyed this you may also enjoy reading this question and the answers to it, which address that other important equation of general relativity, which describes the motion of "test particles" in curved spacetimes.