The starting point for abstract measure theoretic conditional probability is conditional expectation. Essentially, one uses the identity $P(A)=\mathbb{E}(1_A)$.
Now let $(\Omega,\mathcal{B},P)$ be a probability space, $f$ an integrable random variable and $\mathcal{G}$ a sub-$\sigma$-algebra of $\mathcal{B}$. The conditional expectation of $f$ with respect to $\mathcal{G}$ is a $\mathcal{G}$-measurable function $\mathbb{E}(f\mid\mathcal{G})$ such that for all $G\in\mathcal{G}$ $$\int_G \mathbb{E}(f\mid\mathcal{G})~dP=\int_G f~dP.$$ The notion is not very intuitive, but the idea is the following: since $\mathbb{E}(f\mid\mathcal{G})$ is $\mathcal{G}$-measurable, it uses only the information in $\mathcal{G}$, and the integral condition says that $\mathbb{E}(f\mid\mathcal{G})$ "averages $f$ out" over sets in $\mathcal{G}$.
Now if we want to calculate the conditional probability of the event $H\in\mathcal{B}$ with respect to the sub-$\sigma$-algebra $\mathcal{G}$, we simply take the conditional expectation of the indicator function $1_H$. Then, a conditional probability of $H$ with respect to $\mathcal{G}$ is a $\mathcal{G}$-measurable function $\mathbb{P}^H_\mathcal{G}$ such that for all $G\in\mathcal{G}$ $$\int_G \mathbb{P}^H_\mathcal{G}~dP=\int_G 1_H~dP.$$ Since $\int_G 1_H~dP=P(H\cap G)$, this can be rewritten as $$\int_G \mathbb{P}^H_\mathcal{G}~dP=P(H\cap G).$$
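When $\mathcal{G}$ is generated by a finite partition, all of this can be computed directly. The following sketch (a toy example of my own, not from the post) builds $\mathbb{P}^H_\mathcal{G}$ on a four-point space and checks the integral condition:

```python
# Toy example: when G is generated by a finite partition {G_1, ..., G_k},
# the conditional probability P(H | G) is constant on each G_i, with value
# P(H ∩ G_i) / P(G_i).

def conditional_probability(prob, H, partition):
    """Return the G-measurable function omega -> P(H | G)(omega) as a dict."""
    cond = {}
    for G in partition:
        pG = sum(prob[w] for w in G)
        pHG = sum(prob[w] for w in G if w in H)
        for w in G:
            cond[w] = pHG / pG
    return cond

# Two fair coin tosses; G encodes knowledge of the first toss only.
prob = {w: 0.25 for w in ["HH", "HT", "TH", "TT"]}
H = {"HH", "TH"}                          # "second toss is heads"
partition = [{"HH", "HT"}, {"TH", "TT"}]
cp = conditional_probability(prob, H, partition)

# Defining property: the integral of P(H | G) over each G equals P(H ∩ G).
for G in partition:
    lhs = sum(cp[w] * prob[w] for w in G)
    rhs = sum(prob[w] for w in G if w in H)
    assert abs(lhs - rhs) < 1e-12
```

Since the tosses are independent, $\mathbb{P}^H_\mathcal{G}$ is constant equal to $1/2$ here; conditioning on a dependent event would produce different values on the two partition cells.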
This is fairly standard material, so I assume the author simply made some typos. The $h$ is superfluous and the $m$ should be $P$.
Throughout this post, let $(\Omega,\mathcal{F},P)$ be a probability space. Let us first define the conditional expectation ${\rm E}[X\mid\mathcal{G}]$ for integrable random variables $X:\Omega\to\mathbb{R}$, i.e. $X\in L^1(P)$, and sub-$\sigma$-algebras $\mathcal{G}\subseteq\mathcal{F}$.
Definition: The conditional expectation ${\rm E}[X\mid\mathcal{G}]$ of $X$ given $\mathcal{G}$ is the random variable $Z$ having the following properties:
(i) $Z$ is integrable, i.e. $Z\in L^1(P)$.
(ii) $Z$ is $(\mathcal{G},\mathcal{B}(\mathbb{R}))$-measurable.
(iii) For any $A\in\mathcal{G}$ we have
$$
\int_A Z\,\mathrm dP=\int_A X\,\mathrm dP.
$$
Note: It makes sense to talk about the conditional expectation since if $U$ is another random variable satisfying (i)-(iii) then $U=Z$ $P$-a.s.
Definition: If $X\in L^1(P)$ and $Y:\Omega\to\mathbb{R}$ is any random variable, then the conditional expectation of $X$ given $Y$ is defined as
$$
{\rm E}[X\mid Y]:={\rm E}[X\mid\sigma(Y)],
$$
where $\sigma(Y)=\{Y^{-1}(B)\mid B\in\mathcal{B}(\mathbb{R})\}$ is the sigma-algebra generated by $Y$.
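For a discrete $Y$, $\sigma(Y)$ is generated by the level sets $\{Y=y\}$, so ${\rm E}[X\mid Y]=\varphi(Y)$, where $\varphi(y)$ averages $X$ over $\{Y=y\}$. A small sketch (my own example with two dice, not from the post):

```python
from itertools import product

# Two fair dice: X = total of both dice, Y = value of the first die.
# E[X | Y] is phi(Y), where phi(y) averages X over the event {Y = y}.
outcomes = list(product(range(1, 7), repeat=2))
p = 1 / 36                      # uniform probability of each outcome

def X(w): return w[0] + w[1]
def Y(w): return w[0]

phi = {}
for y in range(1, 7):
    event = [w for w in outcomes if Y(w) == y]
    phi[y] = sum(X(w) * p for w in event) / sum(p for w in event)

# phi(y) = y + 3.5: knowing the first die, the best guess for the total
# adds the mean 3.5 of the second die. The defining equation
# int_{Y=y} E[X|Y] dP = int_{Y=y} X dP holds on each generator {Y = y}.
```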
I'm not aware of any other definition of $P(Y\in B\mid X\in A)$ than the obvious, i.e.
$$
P(Y\in B\mid X\in A)=\frac{P(Y\in B,X\in A)}{P(X\in A)}
$$
provided that $P(X\in A)>0$. The only exception is when $A$ contains a single point, i.e. $A=\{x\}$ for some $x\in\mathbb{R}$; in this case, the object $P(Y\in B\mid X=x)$ is defined in terms of a regular conditional distribution.
Let us first define regular conditional probabilities. Let $X:\Omega\to\mathbb{R}$ be a random variable.
Definition: A regular conditional probability for $P$ given $X$ is a function
$$
\mathcal{F}\times \mathbb{R} \ni(A,x)\mapsto P^X(A\mid x)
$$
satisfying the following three conditions:
(i) The mapping $A\mapsto P^X(A\mid x)$ is a probability measure on $(\Omega,\mathcal{F})$ for all $x\in \mathbb{R}$.
(ii) The mapping $x\mapsto P^X(A\mid x)$ is $(\mathcal{B}(\mathbb{R}),\mathcal{B}(\mathbb{R}))$-measurable for all $A\in\mathcal{F}$.
(iii) The defining equation holds: For any $A\in\mathcal{F}$ and $B\in\mathcal{B}(\mathbb{R})$ we have
$$
\int_B P^X(A\mid x)\,P_X(\mathrm dx)=P(A\cap\{X\in B\}).
$$
Note: A mapping satisfying (i) and (ii) is often called a Markov kernel. Furthermore, since $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ is a nice space, the regular conditional probability is unique in the sense that if $\tilde{P}^X(\cdot\mid\cdot)$ is another regular conditional probability of $P$ given $X$, then we have that $P^X(\cdot\mid x)=\tilde{P}^X(\cdot\mid x)$ for $P_X$-a.a. $x$. Here $P_X=P\circ X^{-1}$ is the distribution of $X$.
Connection: Let $P^X(\cdot\mid\cdot)$ be a regular conditional probability of $P$ given $X$. Then for any $A\in\mathcal{F}$ we have
$$
{\rm E}[1_A\mid X]=\varphi(X),
$$
where $\varphi(x)=P^X(A\mid x)$. In short we write ${\rm E}[1_A\mid X]=P^X(A\mid X)$.
Now let us introduce another random variable $Y:\Omega\to\mathbb{R}$, and $P^X(\cdot\mid \cdot)$ still denotes a regular conditional probability of $P$ given $X$.
Definition: For $B\in\mathcal{B}(\mathbb{R})$ and $x\in\mathbb{R}$ we define the regular conditional distribution of $Y$ given $X$ by
$$
P_{Y\mid X}(B\mid x):=P^X(Y\in B\mid x).
$$
Instead of $P_{Y\mid X}(B\mid x)$ one often writes $P(Y\in B\mid X=x)$.
An easy consequence of this definition is that $(B,x)\mapsto P_{Y\mid X}(B\mid x)$ is a Markov kernel and for any $A,B\in\mathcal{B}(\mathbb{R})$ we have
$$
\int_A P_{Y\mid X}(B\mid x)\,P_X(\mathrm dx)=P(\{X\in A\}\cap\{Y\in B\}). \tag{1}
$$
In fact, $P_{Y\mid X}(\cdot \mid \cdot)$ is a regular conditional distribution of $Y$ given $X$ if and only if $P_{Y\mid X}(\cdot\mid\cdot)$ is a Markov kernel and satisfies $(1)$. Again $(1)$ is often referred to as the defining equation.
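For discrete random variables the defining equation $(1)$ can be verified by hand. A toy sketch of my own, using exact arithmetic:

```python
from fractions import Fraction

# Toy example: X uniform on {1,...,6}, Y = X mod 2. The regular conditional
# distribution P_{Y|X}(B | x) is the point mass at x mod 2.
px = {x: Fraction(1, 6) for x in range(1, 7)}

def P_Y_given_X(B, x):
    return Fraction(1) if x % 2 in B else Fraction(0)

A, B = {1, 2, 3}, {0}   # events for X and Y in this discrete setup

# Defining equation (1): integrate P_{Y|X}(B | x) over A against P_X.
lhs = sum(P_Y_given_X(B, x) * px[x] for x in A)
rhs = sum(px[x] for x in A if x % 2 in B)   # P(X in A, Y in B)
assert lhs == rhs
```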
Definition: Let $P^X(\cdot\mid\cdot)$ be a regular conditional probability of $P$ given $X$. Furthermore, let $U:\Omega\to\mathbb{R}$ be another random variable that is assumed bounded (to ensure the following expectations exist). Then we define the (regular) conditional mean of $U$ given $X=x$ by
$$
{\rm E}[U\mid X=x]:=\int_\Omega U(\omega)\, P^X(\mathrm d\omega\mid x).
$$
Let us denote $\psi(x)={\rm E}[U\mid X=x]$. Then we have the following:
Connection: The mapping $\mathbb{R}\ni x\mapsto \psi(x)$ is $(\mathcal{B}(\mathbb{R}),\mathcal{B}(\mathbb{R}))$-measurable, and
$$
{\rm E}[U\mid X]=\psi(X).
$$
The following is an extremely useful rule when calculating with conditional distributions:
Rule: Let $X$ and $Y$ be as above, and let $\xi:\mathbb{R}^2\to\mathbb{R}$ be $(\mathcal{B}(\mathbb{R}^2),\mathcal{B}(\mathbb{R}))$-measurable. Then
$$
P(\xi(X,Y)\in D\mid X=x)=P(\xi(x,Y)\in D\mid X=x),\quad D\in\mathcal{B}(\mathbb{R}),
$$
holds for $P_X$-a.a. $x$. This is saying that "conditional on $X=x$ we may replace $X$ by $x$".
The following example shows how this rule can be useful. Let $X$ and $Y$ be independent $\mathcal{N}(0,1)$ random variables, and let $U=X+Y$. Then we claim that $U\mid X=x\sim \mathcal{N}(x,1)$ for $P_X$-a.a. $x$. To see this, note that by the rule above, the distributions of $U\mid X=x$ and $Y+x\mid X=x$ are the same. But since $Y$ is independent of $X$, the distribution of $Y+x\mid X=x$ is just that of $Y+x$. In short:
$$
U\mid X=x\sim Y+x\mid X=x\sim Y+x\sim\mathcal{N}(x,1).
$$
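This claim is easy to check numerically. The following Monte Carlo sketch (my own; the window half-width $d$ stands in for conditioning on $X\approx x$) compares the empirical conditional probability with the $\mathcal{N}(x,1)$ CDF:

```python
import math
import random

random.seed(0)

def Phi(t):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Estimate P(U <= u | X in (x - d, x + d)) for U = X + Y with X, Y iid
# N(0,1), and compare with Phi(u - x), the CDF of N(x, 1) at u.
n, x, u, d = 400_000, 0.5, 1.0, 0.05
hit_x = hit_both = 0
for _ in range(n):
    X = random.gauss(0.0, 1.0)
    Y = random.gauss(0.0, 1.0)
    if abs(X - x) < d:
        hit_x += 1
        if X + Y <= u:
            hit_both += 1

estimate = hit_both / hit_x
exact = Phi(u - x)
assert abs(estimate - exact) < 0.03   # agree up to Monte Carlo error
```

Shrinking $d$ (with correspondingly more samples) tightens the agreement, in line with the limiting-ratio viewpoint of the answer below.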
Best Answer
The short answer
Yes, if $N=\mathbb{R}^n$ and $\mathcal{N}$ is the Borel field, then, for every $A \in \mathcal{M}$, the limit $\lim_{\Delta y \to 0} P(X \in A\ |\ Y \in (y-\Delta y, y+\Delta y))$ exists for $P_Y$-almost every $y$, and this is true whether or not $Y = f(X)$, and whether or not $X,Y$ have a joint density. (This is the content of Theorem 9(1) below.)
Furthermore, the function $f:\mathbb{R}\rightarrow\mathbb{R}$ obtained by setting $f(y) := \lim_{\Delta y \to 0} P(X \in A\ |\ Y \in (y-\Delta y, y+\Delta y))$ wherever possible, and, say, $f(y):=0$ elsewhere, is consistent with the traditional measure-theoretic definition of $P(X\in A\ |\ Y=y)$, given by \eqref{CondProb}. (This is the content of Corollary 17(1).)
The proof of these facts is a consequence of Theorems 1.29 ("Differentiating measures") and 1.30 ("Differentiation of Radon measures") in reference [3] (a shout-out to user Del, who pointed me to this reference). It makes use of the concept of the derivative of an outer measure w.r.t. another outer measure.
I will devote the rest of this answer to carefully deriving the two facts stated above. As far as I know, this is the first time this fundamental, intuitive result, which is often claimed without proof (for instance, on p. 157 of [4] and on p. 136 of [1]), has actually been proved. I'll be grateful (if somewhat disappointed) to anyone who can cite a precedent.
Example
Before embarking on a formal proof, let's see how the results of the next section can be used to introduce the concept of "probability conditioned on a non-discrete random variable" in a way that is simultaneously intuitive and mathematically sound.
Consider, for instance, an example from a popular undergraduate textbook ([6], example 5e, p. 255): a sequence of trials is performed, where the probability that a given trial is a success is itself random.
Letting $N$ denote the number of successes and $X$ the probability that a given trial is a success, this setting naturally gives rise to the concept of conditional probability. But when we attempt to parlay our intuition into formulas, we discover that an expression of the form $P(N = n\ |\ X=x)$ is not well-defined by the familiar formula $P(A|B) = \frac{P(A\cap B)}{P(B)}$, since $P(X=x)=0$.
Intuition suggests overcoming this obstacle by defining $$ P(N=n\ |\ X=x) = \lim_{\Delta x\downarrow 0} \frac{P(N=n, x - \Delta x < X < x + \Delta x)}{P(x - \Delta x < X < x + \Delta x)}, $$ provided the limit exists. Theorem 9(1) assures us that the limit indeed exists almost everywhere. Corollary 17(1) implies that if we so define $P(N=n\ |\ X=x)$ wherever possible, we will obtain a function that is a conditional probability in the traditional measure-theoretic sense (described in the next paragraph), hence we may soundly subject it to the usual manipulations involving conditional probabilities, such as the law of total probability. Note that in this example the random vector $(N,X)$ does not have a joint density (more precisely, no joint density w.r.t. the Lebesgue measure on $\mathbb{R}^2$).
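This limit can be explored by simulation. The sketch below assumes a concrete model of my own choosing (standard for such examples, but not stated above): $X$ is uniform on $(0,1)$ and, given $X=x$, $N$ is binomial with $m$ trials and success probability $x$. The limiting ratio should then approach the binomial pmf at $x$:

```python
import math
import random

random.seed(1)

# Assumed model: X ~ Uniform(0,1); given X = x, N ~ Binomial(m, x).
# The limiting ratio defining P(N = k | X = x) should approach the
# binomial pmf evaluated at x.
m, k, x = 10, 3, 0.3
exact = math.comb(m, k) * x**k * (1 - x) ** (m - k)

def estimate(delta, n=300_000):
    """Monte Carlo estimate of P(N = k | x - delta < X < x + delta)."""
    num = den = 0
    for _ in range(n):
        X = random.random()
        if abs(X - x) < delta:
            den += 1
            N = sum(random.random() < X for _ in range(m))
            if N == k:
                num += 1
    return num / den

approx = estimate(0.02)
assert abs(approx - exact) < 0.03
```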
Now scratch everything we have discussed so far, and suppose we start by defining the conditional probability $P(X\in A\ |\ Y=y)$ in the traditional measure-theoretic manner (cf. [2] theorem 5.3.1, p. 205), namely as a $P_Y$-integrable function of $y$ satisfying, for every Borel set $B$, $$ \int_B P(X\in A\ |\ Y=y)\ P_Y(\mathrm{d}y) = P(\{X\in A\}\cap\{Y\in B\}). \tag{CondProb}\label{CondProb} $$
(This concept of conditional probability is sometimes called "conditional distribution", as in [5] theorem 6.3, p. 107, and the term "conditional probability" is reserved to a closely-related, but different concept. I will keep to the "conditional probability" terminology.)
Given a two-dimensional random variable $(X,Y)$ with a joint density $f(x,y)$, we may now prove that the familiar definition of "conditional density", namely $$ f_{X|Y=y}(x) = \frac{f(x,y)}{f_Y(y)},\hspace{1cm}\text{wherever the denominator does not vanish} $$ can be used to generate conditional probabilities of the form $P(X\in A\ |\ Y=y)$.
Applying this technique to a problem taken from the same textbook ([6], example 5b, p. 252), we find that $P(X > 1\ |\ Y=y) = e^{-1/y}$ for $y>0$.
Since the solution we obtained is continuous, Corollary 17(2) yields that, for every $y>0$, $$ e^{-1/y} = \lim_{\Delta y\downarrow 0}\frac{P(X>1, y-\Delta y<Y<y+\Delta y)}{P(y-\Delta y<Y<y+\Delta y)}. $$
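This limit can be checked by simulation. The sketch below assumes a concrete model of my own, consistent with the answer $e^{-1/y}$ but not stated above: $Y$ is exponential with rate $1$ and, given $Y=y$, $X$ is exponential with mean $y$:

```python
import math
import random

random.seed(2)

# Assumed model: Y ~ Exp(rate 1); given Y = y, X ~ Exp with mean y.
# Then P(X > 1 | Y = y) = exp(-1/y), matching the answer above.
y, delta, n = 1.0, 0.05, 400_000
num = den = 0
for _ in range(n):
    Yv = random.expovariate(1.0)           # Y ~ Exp(1)
    if abs(Yv - y) < delta:
        den += 1
        Xv = random.expovariate(1.0 / Yv)  # X | Y = Yv has mean Yv
        if Xv > 1.0:
            num += 1

ratio = num / den    # approximates the limiting ratio at Y = y
assert abs(ratio - math.exp(-1.0 / y)) < 0.03
```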
The formal derivation
Notation 1 Let $n \in \{1, 2, \dots\}$ and let $r \in (0,\infty)$. For every $x \in \mathbb{R}^n$ we denote the open $n$-ball of (Euclidean) radius $r$ about $x$ by $B^{(n)}_r(x)$.
Notation 2 Let $n \in \{1, 2, \dots\}$. We denote the Euclidean topology on $\mathbb{R}^n$ by $\mathcal{E}_n$.
Notation 3 Let $n \in \{1, 2, \dots\}$. We denote the Borel $\sigma$-algebra on $\mathbb{R}^n$ by $\mathcal{B}_n$.
Notation 4 Let $n \in \{1, 2, \dots\}$. For every outer measure $\mu$ on $\mathbb{R}^n$, we denote the collection of $\mu$-measurable sets by $\mathcal{M}_\mu$.
Fix $n \in \{1, 2, \dots\}$ for the remainder of the proof.
Definition 5 An outer measure $\mu$ on $\mathbb{R}^n$ is Radon iff the following three conditions hold.
$\mathcal{B}_n \subseteq \mathcal{M}_\mu$.
For every $A\subseteq\mathbb{R}^n$ there exists a $B\in\mathcal{B}_n$ such that $A\subseteq B$ and $\mu(A) = \mu(B)$.
For every $\mathcal{E}_n$-compact $K\subseteq\mathbb{R}^n$, $\mu(K) < \infty$.
Definition 6 Let $\mu, \nu$ be Radon outer measures on $\mathbb{R}^n$. We denote by $\mathrm{Diff}^\nu_\mu$ the set consisting of all $x \in \mathbb{R}^n$ for which the following pair of conditions hold.
For all $r \in (0,\infty)$, $\mu\left(B^{(n)}_r(x)\right) > 0$.
There exists some $d \in \mathbb{R}$ that satisfies: $$ d = \lim_{r \downarrow 0} \frac{\nu\left(B^{(n)}_r(x)\right)}{\mu\left(B^{(n)}_r(x)\right)}. $$
Definition 7 Let $\mu, \nu$ be Radon outer measures on $\mathbb{R}^n$. We set $$ D^\nu_\mu(x) := \begin{cases} \lim_{r \downarrow 0} \frac{\nu\left(B^{(n)}_r(x)\right)}{\mu\left(B^{(n)}_r(x)\right)} &, x \in \mathrm{Diff}^\nu_\mu \\ 0 &, \text{otherwise}. \end{cases} $$
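When $\nu$ has a continuous density $g$ with respect to $\mu$, the shrinking-ball ratio recovers $g$ pointwise. A one-dimensional sketch (my own example, with both ball measures computed in closed form):

```python
# One-dimensional example: mu = Lebesgue measure, nu with density
# g(x) = x^2. On balls B_r(x) = (x - r, x + r) the ratio is
# nu(B_r(x)) / mu(B_r(x)) = x^2 + r^2/3 -> g(x) as r -> 0.

def mu(a, b):
    return b - a                 # Lebesgue measure of the interval (a, b)

def nu(a, b):
    return (b**3 - a**3) / 3.0   # integral of x^2 over (a, b)

def D(x, r):
    return nu(x - r, x + r) / mu(x - r, x + r)

x = 2.0
ratios = [D(x, r) for r in (0.1, 0.01, 0.001)]
# The ratios decrease toward g(2) = 4, exhibiting the limit in Definition 7.
```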
Definition 8 Let $\mu, \nu$ be Radon outer measures on $\mathbb{R}^n$. We denote with $\mathrm{\mathbf{Diff}}^\nu_\mu$ the collection consisting of all $Z \subseteq \mathbb{R}^n$ such that the following pair of conditions hold.
$Z \subseteq \mathrm{Diff}^\nu_\mu$.
$Z = \mathbb{R}^n\setminus A$ for some $A \in \mathcal{M}_\mu$ with $\mu(A) = 0$.
Theorem 9 Let $\mu, \nu$ be Radon outer measures on $\mathbb{R}^n$.
(1) $\mathrm{\mathbf{Diff}}^\nu_\mu \neq \emptyset$; in other words, $\mathrm{Diff}^\nu_\mu$ contains the complement of a $\mu$-null set, so the limiting ratio of Definition 6 exists $\mu$-a.e.
(2) For every $Z \in \mathrm{\mathbf{Diff}}^\nu_\mu$, $D^\nu_\mu$ is $\mathcal{Z}/\mathcal{B}_1$-measurable, where $\mathcal{Z}$ is the subset $\sigma$-algebra induced on $Z$ by $\mathcal{M}_\mu$.
Proof See [3], theorem 1.29, p. 48. Q.E.D.
Definition 10
Let $\mu, \nu$ be outer measures on $\mathbb{R}^n$. $\nu$ is absolutely continuous w.r.t. $\mu$, written $\nu \ll \mu$, provided $\mu(A) = 0$ implies $\nu(A) = 0$ for every $A \subseteq \mathbb{R}^n$.
Let $\mathcal{F}$ be a $\sigma$-algebra on $\mathbb{R}^n$, and let $\mu, \nu$ be measures on $\mathcal{F}$. $\nu$ is absolutely continuous w.r.t. $\mu$, written $\nu \ll \mu$, provided $\mu(A) = 0$ implies $\nu(A) = 0$ for every $A \in \mathcal{F}$.
Lemma 11 Let $\mu, \nu$ be measures on $\mathcal{B}_n$ such that $\nu\ll\mu$, and such that, for every $\mathcal{E}_n$-compact $K$, $\mu(K), \nu(K) < \infty$. Then $\mu, \nu$ can be extended to Radon outer-measures on $\mathbb{R}^n$, $\mu^*, \nu^*$, respectively, such that $\nu^*\ll\mu^*$.
Proof
For every $A \subseteq \mathbb{R}^n$ define $$ \begin{align} \mu^*(A) &:= \inf \left\{\sum_{k = 1}^\infty \mu(B_k)\ \middle|\ \{B_1, B_2, \dots\} \subseteq \mathcal{B}_n,\ A \subseteq \bigcup_{k=1}^\infty B_k\right\}, \\ \nu^*(A) &:= \inf \left\{\sum_{k = 1}^\infty \nu(B_k)\ \middle|\ \{B_1, B_2, \dots\} \subseteq \mathcal{B}_n,\ A \subseteq \bigcup_{k=1}^\infty B_k\right\}. \end{align} $$ (The summation index is written $k$ to avoid a clash with the dimension $n$.)
According to [7] theorem 2.21 (p. 38), $\mu^*, \nu^*$ are outer measures on $\mathbb{R}^n$. According to [7] theorem 20.1(b) (p. 502), $\mu^*, \nu^*$ are extensions of $\mu, \nu$, respectively. This implies, in particular, that, for every $\mathcal{E}_n$-compact $K$, $\mu^*(K) = \mu(K) < \infty$. According to [7] theorem 20.1(a) (p. 502), $\mathcal{B}_n \subseteq \mathcal{M}_{\mu^*} \cap \mathcal{M}_{\nu^*}$. According to [7] proposition 20.9 (p. 507), for every $A\subseteq\mathbb{R}^n$ there exists a $B \in \mathcal{B}_n$ such that $A \subseteq B$ and both $\mu^*(A) = \mu^*(B)$ and $\nu^*(A) = \nu^*(B)$. Thus, $\mu^*$ and $\nu^*$ are each Radon.
Let $A \subseteq \mathbb{R}^n$ be such that $\mu^*(A) = 0$. By the preceding paragraph there exists some $B \in \mathcal{B}_n$ such that $A \subseteq B$ and both $\mu^*(A) = \mu^*(B)$ and $\nu^*(A) = \nu^*(B)$. So $$ \mu(B) = \mu^*(B) = \mu^*(A) = 0. $$ So $$ \nu^*(A) = \nu^*(B) = \nu(B) \overset{\nu\ll\mu}{=} 0. $$ Thus $\nu^* \ll \mu^*$.
Q.E.D.
Theorem 12 Let $\mu, \nu$ be Radon outer measures on $\mathbb{R}^n$, and let $Z \in \mathrm{\mathbf{Diff}}^\nu_\mu$. If $\nu \ll \mu$, then, for every $B \in \mathcal{M}_\mu$, $$ \nu(B) = \int_B D^\nu_\mu\mathbb{1}_Z d\mu. $$
Proof See [3], theorem 1.30, p. 50. Q.E.D.
Definition 13 Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $Y:\Omega\rightarrow\mathbb{R}^n$ be $\mathcal{F}/\mathcal{B}_n$-measurable. We denote with $P_Y$ the probability measure induced on $\mathcal{B}_n$ by $Y$ via $(\Omega, \mathcal{F}, P)$.
Notation 14 For every probability measure $\mu$ on $\mathcal{B}_n$, we denote by $\overline{\mathcal{B}_n^\mu}$ the completion of $\mathcal{B}_n$ w.r.t. $\mu$, and we denote by $\overline{\mu}$ the unique extension of $\mu$ to $\overline{\mathcal{B}_n^\mu}$.
Definition 15 Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $A \in \mathcal{F}$, and let $Y:\Omega \rightarrow \mathbb{R}^n$ be $\mathcal{F}/\mathcal{B}_n$-measurable. We denote by $P(A\ |\ Y)$ the set of conditional probabilities of $A$ conditioned on $Y$, as follows. $P(A\ |\ Y)$ shall consist of all functions $f:\mathbb{R}^n\rightarrow\mathbb{R}$ that are $\overline{\mathcal{B}_n^{P_Y}}/\mathcal{B}_1$-measurable, $\overline{P_Y}$-semi-integrable, and such that, for every $B \in \mathcal{B}_n$, $$ \int_B f\ d\overline{P_Y} = P\left(A\cap\{Y \in B\}\right). $$
Definition 16 Let $\mu$ be a measure on $\mathcal{B}_n$. We denote $\mu$'s support by $\mathrm{supp}_\mu$. In other words, $\mathrm{supp}_\mu$ consists of all $x \in \mathbb{R}^n$ such that, for every $\mathcal{E}_n$-open-neighborhood, $G$, of $x$, $\mu(G) > 0$.
Corollary 17 Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $A \in \mathcal{F}$, and let $Y:\Omega \rightarrow \mathbb{R}^n$ be $\mathcal{F}/\mathcal{B}_n$-measurable. Set $\mu := P_Y$, and let $\nu:\mathcal{B}_n\rightarrow\mathbb{R}$ be the measure assigning to every $B \in \mathcal{B}_n$ the value $\nu(B) := P\left(A\cap\{Y \in B\}\right)$. Then $\mu, \nu$ can be extended to Radon outer measures $\mu^*, \nu^*$ on $\mathbb{R}^n$, respectively, such that:
(1) $D^{\nu^*}_{\mu^*} \in P(A\ |\ Y)$.
(2) For every $y \in \mathrm{supp}_\mu$ at which some $f \in P(A\ |\ Y)$ is $\mathcal{E}_n/\mathcal{E}_1$-continuous, we have $y \in \mathrm{Diff}^{\nu^*}_{\mu^*}$ and $D^{\nu^*}_{\mu^*}(y) = f(y)$.
Proof
Since $\mu, \nu$ are finite measures on $\mathcal{B}_n$ with $\nu \ll \mu$, lemma 11 shows that they may be extended to Radon outer measures $\mu^*, \nu^*$ on $\mathbb{R}^n$, respectively, such that $\nu^* \ll \mu^*$. Letting $Z \in \mathrm{\mathbf{Diff}}^{\nu^*}_{\mu^*}$ (nonempty by theorem 9(1)), theorem 12 yields that $D^{\nu^*}_{\mu^*}\mathbb{1}_Z \in P(A\ |\ Y)$. Since, by choice of $Z$, $D^{\nu^*}_{\mu^*} = D^{\nu^*}_{\mu^*}\mathbb{1}_Z$ $P_Y$-a.e., the first conclusion follows.
Let $y \in \mathrm{supp}_\mu$, and let $f \in P(A\ |\ Y)$ be $\mathcal{E}_n/\mathcal{E}_1$-continuous at $y$.
Let $\varepsilon \in (0,\infty)$. Choose $\delta \in (0,\infty)$ such that, for all $z \in B^{(n)}_\delta(y)$, $f(z) \in B^{(1)}_\varepsilon\left(f(y)\right)$. Let $r \in (0,\delta]$. Since $y \in \mathrm{supp}_\mu$, $P_Y\left(B^{(n)}_r(y)\right) > 0$, and we have $$ \begin{align} \frac{\nu^*\left(B^{(n)}_r(y)\right)}{\mu^*\left(B^{(n)}_r(y)\right)} &= \frac{\nu\left(B^{(n)}_r(y)\right)}{\mu\left(B^{(n)}_r(y)\right)} \\ &= \frac{P\left(A\cap\left\{Y\in B^{(n)}_r(y)\right\}\right)}{P_Y\left(B^{(n)}_r(y)\right)} \\ &= \frac{\int_{B^{(n)}_r(y)} f\ d\overline{P_Y}}{P_Y\left(B^{(n)}_r(y)\right)} \\ &< \frac{\int_{B^{(n)}_r(y)} \left(f(y) + \varepsilon\right) d\overline{P_Y}}{P_Y\left(B^{(n)}_r(y)\right)} \\ &= (f(y)+\varepsilon)\,\frac{\overline{P_Y}\left(B^{(n)}_r(y)\right)}{P_Y\left(B^{(n)}_r(y)\right)} \\ &= (f(y)+\varepsilon)\,\frac{P_Y\left(B^{(n)}_r(y)\right)}{P_Y\left(B^{(n)}_r(y)\right)} \\ &= f(y)+\varepsilon. \end{align} $$
Analogously, $$ f(y)-\varepsilon < \frac{\nu^*\left(B^{(n)}_r(y)\right)}{\mu^*\left(B^{(n)}_r(y)\right)}. $$
Q.E.D.
References
[1] Robert B. Ash, Basic Probability Theory, Dover, 2008. (An online version is freely available on the author's website.)
[2] Robert B. Ash, Catherine A. Doléans-Dade, Probability and Measure Theory, 2nd ed., Academic Press, 2000.
[3] Lawrence C. Evans, Ronald F. Gariepy, Measure Theory and Fine Properties of Functions, revised edition, CRC Press, 2015.
[4] William Feller, An Introduction to Probability Theory and Its Applications, Vol. 2, 2nd ed., John Wiley & Sons, 1971.
[5] Olav Kallenberg, Foundations of Modern Probability, 2nd ed., Springer, 2002.
[6] Sheldon M. Ross, A First Course in Probability, 9th ed., Pearson, 2013.
[7] James Yeh, Real Analysis: Theory of Measure and Integration, 3rd ed., World Scientific, 2014.