As mentioned by Yuval in the comments, this question has previously been discussed on MathOverflow. I have replicated the accepted answer by Mark below.
Here is an argument that may give some intuition:
Assume that $m^{*}$ is an outer measure on $X$, and let us assume furthermore that this outer measure is finite:
$$m^*(X) < \infty.$$
Define an "inner measure" $m_*$ on $X$ by
$$m_*(E) = m^*(X) - m^*(E^c).$$
If $m^*$ were, say, induced from a countably additive measure defined on some algebra of sets in $X$ (as Lebesgue measure is built using the algebra of finite disjoint unions of intervals of the form $(a,b]$), then a subset of $X$ is measurable in the sense of Caratheodory if and only if its outer measure and inner measure agree.
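One direction of this equivalence can be checked in a line. As a sketch, assume the usual Caratheodory condition (not spelled out above): $E$ is measurable when $m^*(A) = m^*(A\cap E) + m^*(A\cap E^c)$ for every test set $A\subseteq X$. Taking $A = X$ gives
$$\begin{aligned} m^*(X) &= m^*(E) + m^*(E^c), \\ m_*(E) &= m^*(X) - m^*(E^c) = m^*(E), \end{aligned}$$
so a measurable set has equal inner and outer measure. The converse is where the hypothesis that $m^*$ is induced from a measure on an algebra is needed.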
From this viewpoint, the construction of the measure (as well as the $\sigma$-algebra of measurable sets) is just a generalization of the natural construction of the Riemann integral on $\mathbb{R}^n$: you try to approximate the area of a bounded set $E$ from the outside using finitely many rectangles, and similarly from the inside, and the set is "measurable in the sense of Riemann" (or "Jordan measurable") if the best outer approximation of its area agrees with the best inner approximation of its area.
The point here (which often isn't emphasized when Riemann integration is taught for the first time) is that the concept of "inner area" is redundant and can be defined in terms of the outer area just as I did above (you take some rectangle containing the set and consider the outer measure of the complement of the set with respect to this rectangle).
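For instance, here is a sketch of how this plays out for $E=\mathbb{Q}\cap[0,1]$ inside the rectangle $[0,1]$, writing $c^*$ and $c_*$ for outer and inner Jordan content (notation introduced here, not in the answer above). Finitely many intervals covering a dense subset of $[0,1]$ must have total length at least $1$, and both $E$ and its complement in $[0,1]$ are dense, so
$$c^*(E) = 1, \qquad c_*(E) = c^*([0,1]) - c^*([0,1]\setminus E) = 1 - 1 = 0.$$
The outer and inner values disagree, so $E$ is not Jordan measurable, and no separate notion of inner approximation was needed to see this.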
Of course, Caratheodory's construction doesn't require $m^*$ to be finite, but I still think that this gives some decent intuition for the general case (unless you think that the construction of the Riemann integral itself is not intuitive :) ).
Best Answer
I think the following is a pretty plausible "story."
We start with the notion of Lebesgue outer measure: we define the diameter of an open interval in the obvious way, and then set $\mu^*(A)$ to be the infimum, over all covers $\mathcal{C}$ of $A$ by open intervals, of $\sum_{I\in\mathcal{C}}\operatorname{diam}(I)$. This is already a nontrivial notion: $\mu^*(\mathbb{Q})=0$, in contrast to the situation for Jordan measure, and computing $\mu^*([0,1])=1$ requires some effort. Still, this early work (in my opinion) strongly suggests that this is a natural notion to consider.
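The claim $\mu^*(\mathbb{Q})=0$ can be verified directly. As a sketch, fix an enumeration $q_1,q_2,\dots$ of $\mathbb{Q}$ and some $\varepsilon>0$ (both are names introduced here), and cover $q_n$ by the open interval $I_n=(q_n-\varepsilon 2^{-n-1},\,q_n+\varepsilon 2^{-n-1})$ of diameter $\varepsilon 2^{-n}$:
$$\mu^*(\mathbb{Q}) \le \sum_{n=1}^{\infty}\operatorname{diam}(I_n) = \sum_{n=1}^{\infty}\frac{\varepsilon}{2^{n}} = \varepsilon.$$
Since $\varepsilon$ was arbitrary, $\mu^*(\mathbb{Q})=0$. The argument relies on using countably many intervals, which is exactly what the finite covers of the Jordan setting do not allow.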
Now one of the earliest general results one proves about $\mu^*$ is its finite subadditivity: $\mu^*(A\sqcup B)\le \mu^*(A)+\mu^*(B)$. It's natural to ask whether we can prove equality in this case, given that the union in question is disjoint ("$\sqcup$" instead of "$\cup$"): is $\mu^*(A\sqcup B)=\mu^*(A)+\mu^*(B)$ for all disjoint $A,B\subseteq\mathbb{R}$?
One's utter failure to find a proof that the answer is yes will quickly suggest that one should look for a counterexample; however, it's also pretty easy to come to the conclusion that any "reasonably natural" disjoint sets will satisfy the relevant equality. So to find a counterexample, we need to start thinking in terms of coarse properties which will ensure bad measure-combining behavior. This naturally suggests arithmetic with infinity (especially after one proves countable subadditivity of Lebesgue outer measure, and countable closure of the ideal of null sets).
Specifically, if we can partition $[0,1]$ into countably many pieces $A_i$ ($i\in\mathbb{N}$), which are each guaranteed to have the same outer measure $k$, then we'll know that finite additivity must fail:
We can't have $k=0$, because a quick argument shows that the union of countably many null sets is null, whereas $\mu^*([0,1])=1$.
But if $k>0$, then by the Archimedean principle there is some $n\in\mathbb{N}$ such that $nk>1$; consequently, we have $$\mu^*(A_1\sqcup\dots\sqcup A_n)\le \mu^*([0,1])=1<nk=\mu^*(A_1)+\dots+\mu^*(A_n),$$ which is a clear failure of finite additivity.
This seems like a great idea ... for about five seconds. The problem is the clause "each guaranteed to have the same outer measure" a few sentences ago: partitioning $[0,1]$ into countably infinitely many pieces is easy, but why should we know at the outset that those pieces (which after all we want to be weird) will all look the same measure-wise?
Ultimately we're saved here by the realization that certain operations on sets preserve outer measure. In particular, for each $\alpha\in[0,1)$ and $X\subseteq[0,1)$ the "mod 1 sum" $$X\star \alpha:=\{x+\alpha: x\in X, x+\alpha< 1\}\sqcup\{x+\alpha-1: x\in X, x+\alpha\ge 1\}$$ has the same outer measure as $X$ itself. To get the countable union we desire, we pick some nicely messy countable set of shifts, like $\mathbb{Q}\cap[0,1)$, and hope for a positive answer to the following: is there a set $X\subseteq[0,1)$ whose shifts $X\star q$, for $q\in\mathbb{Q}\cap[0,1)$, partition $[0,1)$?
At this point we're extremely close to the definition of the Vitali set, and the idea of passing to mod-$\mathbb{Q}$ equivalence classes is not too big a leap.
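The invariance of outer measure under the mod 1 sum, used above, can be sketched as follows for $0\le\alpha<1$. This assumes two standard facts not proved here: $\mu^*$ is translation invariant, and cutting a set at a point is additive, that is, $\mu^*(A)=\mu^*(A\cap[0,c))+\mu^*(A\cap[c,1))$ for $A\subseteq[0,1)$ and $0\le c\le 1$. Writing $X_1=X\cap[0,1-\alpha)$ and $X_2=X\cap[1-\alpha,1)$ (names introduced here), the shifted pieces satisfy $X_1+\alpha\subseteq[\alpha,1)$ and $X_2+\alpha-1\subseteq[0,\alpha)$, so
$$\begin{aligned}\mu^*(X\star\alpha) &= \mu^*(X_2+\alpha-1)+\mu^*(X_1+\alpha) \\ &= \mu^*(X_2)+\mu^*(X_1) \\ &= \mu^*(X).\end{aligned}$$
The first equality cuts $X\star\alpha$ at $\alpha$, the second is translation invariance, and the third cuts $X$ at $1-\alpha$.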