The situation is indeed delicate, and one needs to check the conventions carefully before transferring a result from one context to another. It is summarized in Royden's Real Analysis, though with just a few examples.
For a given space $X$, the main players are:
$\mathcal{B}a$ — the σ-algebra generated by the compact Gδ sets. (The Baire algebra.)
$\mathcal{B}c$ — the smallest σ-algebra with respect to which all continuous real-valued functions are measurable.
$\mathcal{B}k$ — the σ-algebra generated by the compact sets.
$\mathcal{B}o$ — the σ-algebra generated by the closed sets. (The Borel algebra.)
$\mathcal{S}$ — the smallest σ-ring containing the compact sets.
$\mathcal{R}$ — the smallest σ-ring containing the compact Gδ sets.
In general, we have the inclusions
$$\mathcal{B}a \subseteq \mathcal{B}k \subseteq \mathcal{B}o\quad\mbox{and}\quad\mathcal{B}a \subseteq \mathcal{B}c \subseteq \mathcal{B}o,$$
but $\mathcal{B}c$ and $\mathcal{B}k$ are not necessarily related. Moreover, $\mathcal{B}a = \mathcal{B}c \cap \mathcal{B}k$ and $\mathcal{B}o$ is generated by $\mathcal{B}c \cup \mathcal{B}k$. The σ-rings $\mathcal{R}$ and $\mathcal{S}$ consist of the σ-bounded elements of $\mathcal{B}a$ and $\mathcal{B}k$, respectively.
When $X$ is σ-compact and locally compact, then
$$\mathcal{R} = \mathcal{B}a = \mathcal{B}c
\quad\mbox{and}\quad\mathcal{S} = \mathcal{B}k = \mathcal{B}o.$$
A compact example in which the inclusion $\mathcal{B}a \subseteq \mathcal{B}o$ is strict is $\beta\mathbb{N}$.
When $X$ is metrizable, more generally when closed sets are Gδ, we have
$$\mathcal{R} = \mathcal{S} \subseteq \mathcal{B}a = \mathcal{B}k \subseteq \mathcal{B}c = \mathcal{B}o.$$
An example showing strict inequality is the space of irrational numbers.
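To see why metrizability collapses the first two pairs, recall the standard fact that every nonempty closed subset $F$ of a metric space $(X,d)$ is a Gδ. Here $d$ is any metric compatible with the topology, which is notation I am introducing for this remark:
$$F = \bigcap_{n\ge 1}\{x\in X : d(x,F) < 1/n\},\qquad d(x,F) = \inf_{y\in F} d(x,y).$$
In particular every compact set is a compact Gδ, so the generators of $\mathcal{R}$ and $\mathcal{S}$ coincide, as do those of $\mathcal{B}a$ and $\mathcal{B}k$. Likewise $F$ is the zero set of the continuous function $x\mapsto d(x,F)$, which gives $\mathcal{B}o \subseteq \mathcal{B}c$ in the metrizable case.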
When $X$ is locally compact and separable metrizable, then we have
$$\mathcal{R} = \mathcal{S} = \mathcal{B}a = \mathcal{B}k = \mathcal{B}c = \mathcal{B}o.$$
That said, the uniqueness (up to normalization) of Haar measure and its (inner) regularity guarantee that all constructions will agree on their common domain of definition. Proving this from scratch does require some work, depending on where you start and end.
From a geometric measure theory perspective, it is standard to define Radon measures $\mu$ to be Borel regular measures that give finite measure to any compact set. Of course, their connection with linear functionals is very important, but in all the references I know, they start with a notion of a Radon measure and then prove representation theorems that represent linear functionals by integration against Radon measures.
Here are some examples:
$\color{blue}{I:}$ Evans and Gariepy's Measure Theory and Fine Properties of Functions states it this way:
- An [outer] measure $\mu$ on $X$ is regular if for each set $A \subset X$ there exists a $\mu$-measurable set $B$ such that $A\subset B$ and $\mu(A)=\mu(B)$.
- A measure $\mu$ on $\Bbb{R}^n$ is called Borel if every Borel set is $\mu$-measurable.
- A measure $\mu$ on $\Bbb{R}^n$ is Borel regular if $\mu$ is Borel and for each $A\subset\Bbb{R}^n$ there exists a Borel set $B$ such that $A\subset B$ and $\mu(A) = \mu(B)$.
- A measure $\mu$ on $\Bbb{R}^n$ is a Radon measure if $\mu$ is Borel regular and $\mu(K) < \infty$ for each compact set $K\subset \Bbb{R}^n$.
$\color{blue}{II:}$ In De Lellis' very nice exposition of Preiss' big paper, he doesn't even define Radon explicitly, but rather talks about Borel Regular measures that are also locally finite, by which he means $\mu(K) < \infty$ for all compact $K$. His Borel regular is a bit different in that he only considers measurable sets -- $\mu$ is Borel regular if any measurable set $A$ is contained in a Borel set $B$ such that $\mu(A) = \mu(B)$. (I am referring to Rectifiable Sets, Densities and Tangent Measures by Camillo De Lellis.)
$\color{blue}{III:}$ In Leon Simon's Lectures on Geometric Measure Theory, he defines Radon measures on locally compact and separable spaces to be those that are Borel regular and finite on compact subsets.
$\color{blue}{IV:}$ Federer 2.2.5 defines Radon measures to be measures $\mu$, over a locally compact Hausdorff space, that satisfy the following three properties:
If $K\subset X$ is compact, then $\mu(K) < \infty$.
If $V\subset X$ is open, then $V$ is $\mu$-measurable and
$\hspace{1in} \mu(V) = \sup\{\mu(K): K\text{ is compact, } K\subset V\}$
If $A\subset X$, then
$\hspace{1in} \mu(A) = \inf\{\mu(V): V\text{ is open, } A\subset V\}$
Note: it is a theorem (actually, Corollary 1.11 in Mattila's Geometry of Sets and Measures in Euclidean Spaces) that a measure is Radon a la Federer if and only if it is Borel regular and locally finite. I.e., {Federer Radon} $\Leftrightarrow$ {Simon or Evans and Gariepy Radon}. (I am referring, of course, to Herbert Federer's 1969 text Geometric Measure Theory.)
$\color{blue}{V:}$ For comparison, Folland (in his real analysis book) defines things a bit differently. For example, he defines regularity differently than the first, third and fourth texts above. In those, a measure $\mu$ is regular if for any $A\subset X$ there is a $\mu$-measurable set $B$ such that $A\subset B$ and $\mu(A) = \mu(B)$. In Folland, a Borel measure $\mu$ is regular if all Borel sets are approximated from the outside by open sets and from the inside by compact sets. I.e. if
$\hspace{1in}\mu(B) = \inf\{\mu(V): V\text{ is open, } B\subset V\}$
and
$\hspace{1in}\mu(B) = \sup\{\mu(K): K\text{ is compact, } K\subset B\}$
for all Borel $B\subset X$.
Folland's definition of Radon is very similar to Federer's but not quite the same:
A measure $\mu$ is Radon if it is a Borel measure that satisfies:
If $K\subset X$ is compact, then $\mu(K) < \infty$.
If $V\subset X$ is open, then
$\hspace{1in} \mu(V) = \sup\{\mu(K): K\text{ is compact, } K\subset V\}$
If $A\subset X$ and $A$ is Borel, then
$\hspace{1in} \mu(A) = \inf\{\mu(V): V\text{ is open, } A\subset V\}$
... and by Borel measure, Folland means a measure whose measurable sets are exactly the Borel sets.
Discussion: Why choose one definition over another? Partly personal preference -- I prefer the typical approach taken in geometric measure theory, starting with an outer measure and progressing to Radon measures a la Evans and Gariepy or Simon or Federer or Mattila. It seems, somehow, more natural and harmonious with the Caratheodory criterion and Caratheodory construction used to generate measures, like the Hausdorff measures.
With this approach, for example, sets with an outer measure of 0 are automatically measurable.
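As a small sanity check of that last remark, here is a brute-force sketch on a finite toy example. Everything in it is made up for illustration and not taken from any of the texts above: the three-point set `X`, the outer measure `outer` (0 on subsets of `{2}`, 1 otherwise), and the helper names. It enumerates the sets satisfying the Caratheodory criterion $\mu^*(T) = \mu^*(T\cap E) + \mu^*(T\setminus E)$ for all $T\subset X$ and compares them with the sets of outer measure 0.

```python
from itertools import combinations

# Hypothetical three-point space, chosen only for illustration.
X = frozenset({0, 1, 2})


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer(a):
    """Toy outer measure (an assumption): 0 on subsets of {2}, 1 otherwise.

    It is monotone and subadditive, and it vanishes on the empty set.
    """
    return 0 if a <= frozenset({2}) else 1


def is_caratheodory_measurable(e):
    """Check outer(T) == outer(T & E) + outer(T - E) for every test set T."""
    return all(outer(t) == outer(t & e) + outer(t - e) for t in subsets(X))


measurable = {e for e in subsets(X) if is_caratheodory_measurable(e)}
null_sets = {a for a in subsets(X) if outer(a) == 0}

# Every set of outer measure 0 passes the Caratheodory criterion.
print(null_sets <= measurable)
```

In this toy case the measurable sets form a proper sub-σ-algebra of the power set (for instance `{0}` fails the criterion with test set `{0, 1}`), yet both null sets are measurable, which is the general phenomenon the remark describes.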
Another reason not to use the more restrictive definition 2 (in the question above) is that it makes sense to require that continuous images of Borel sets be measurable. But all we know is that continuous maps send Borel sets to Suslin sets. And there are Suslin sets which are not Borel! If we use the definition of Borel regular, as in I, III and IV above, then Suslin sets are measurable. There is a very nice discussion of this in section 1.7 of Krantz and Parks' Geometric Integration Theory; see that reference for the definition of Suslin sets. (Krantz and Parks is yet another text I could have added to the above list that agrees with I, III, and IV as far as Radon, Borel regular, etc. go.)
Best Answer
The book Probability measures on metric spaces by K. R. Parthasarathy is my standard reference; it contains a large subset of the material in Convergence of probability measures by Billingsley, but is much cheaper! Parthasarathy shows that every finite Borel measure on a metric space is regular (p.27), and every finite Borel measure on a complete separable metric space, or on any Borel subset thereof, is tight (p.29). Tightness tends to fail when separability is removed, although I don't know any examples offhand.
(Definitions used in Parthasarathy's book: $\mu$ is regular if for every measurable set $A$, $\mu(A)$ equals both the supremum of the measures of closed subsets of $A$ and the infimum of the measures of open supersets of $A$. We call $\mu$ tight if $\mu(A)$ is always equal to the supremum of the measures of compact subsets of $A$. Some other texts use "regular" to mean "regular and tight", so there is some room for confusion here.)