This cine clip may encapsulate the concept of generating the Lebesgue integral of a continuous bounded function:
We are looking for the supremum of the integrals of all simple functions (we can think of them as step functions) that lie under the curve, as beautifully explained here. It makes sense to talk about a supremum because we approach the area under the curve from below with simple functions, each of which has only a finite number of steps.
Unfortunately, some of the static iconography online may give the false impression of a pyramid being built from the inside out with brick-like horizontal slabs, as in this slide from Coursera, as opposed to simple functions:
In Lebesgue integration, there is no "lower sum" and "upper sum" converging, as in the Darboux integral. Likewise there is no overlap between any of the infinity of possible simple functions because of their "step" nature. The Lebesgue integral of a nonnegative measurable function $f: M\rightarrow \mathbb R$ is defined as:
$$\int f\,d\mu := \sup_{0 \leq s \leq f} \left[\sum_{z\in s(M)} z\,\mu \left( \text{preim}_s\left(\{z\}\right) \right) \right]$$
where $s$ corresponds to a simple function:
Function $s$ on some measurable space (i.e. a set equipped with a $\sigma$-algebra), taking it to the real line: $s: M\rightarrow \mathbb R.$
That takes only finitely many values: $s(M)=\{s_1, s_2,\cdots,s_N\}$ for some $N \in \mathbb N.$
This simple function can be written as $s = \displaystyle \sum_{z\in s(M)} \underbrace{\color{red}{\,z\,}}_{\small\color{red}{\text{height}}} \, \underbrace{\color{blue}{\chi_{\text{preim}_s(\{z\})}}}_{\small\color{blue}{\text{base}}}$, with $\chi_{\text{preim}_s(\{z\})}$ being the characteristic or indicator function of the pre-image of $\{z\}$.
The range of the function is partitioned as follows: fixing a maximum value $n$, the interval $[0,n)$ is divided into $n2^n$ subintervals with endpoints at heights $\frac{k}{2^n}$, for $k=0$ to $k = n2^n$, in order to define a sequence of simple functions $f_n:$
$$f_n(x) = \sum_{k=0}^{n2^n-1}\frac{k}{2^n}\; \chi_{\frac{k}{2^n}\leq f(x)<\frac{k+1}{2^n}}\; +\; n\,\chi_{f(x) \geq n}$$
i.e. with value $n$ if $f(x)$ is greater than or equal to $n,$ and otherwise with the lowest value of the interval that contains $f(x)$, namely $\frac{k}{2^n}.$ In the indicator function above, $\frac{k}{2^n}\leq f(x)<\frac{k+1}{2^n}$ can alternatively be expressed as $k \leq 2^n f(x) < k+1$, that is, $k=\lfloor 2^n f(x) \rfloor.$ This is explained here. Graphically,
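The dyadic construction above can be sketched in a few lines of Python. This is only an illustration, not the code linked in this post: the helper name `dyadic_approx` and the test point `0.7` are made up for the example, and $f$ is assumed to be nonnegative.

```python
import math

def dyadic_approx(f, n):
    """Return the n-th dyadic simple function f_n for a nonnegative f (hypothetical helper)."""
    def f_n(x):
        y = f(x)
        if y >= n:
            # Cap at the maximum value n, as in the n * chi_{f >= n} term.
            return float(n)
        # Otherwise round f(x) down to the nearest multiple of 1 / 2**n.
        return math.floor((2 ** n) * y) / (2 ** n)
    return f_n

f = lambda x: x * x
for n in (1, 2, 4, 8):
    f_n = dyadic_approx(f, n)
    # The gap f(x) - f_n(x) is nonnegative and smaller than 2**-n wherever f(x) < n.
    print(n, f_n(0.7), f(0.7) - f_n(0.7))
```

Each $f_n$ takes finitely many values, sits below $f$, and the sequence increases with $n$, which is what makes the supremum in the definition reachable as a limit.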
Coding a concept helps, so I tried doing so for this post, illustrating the Lebesgue integral of $y = x^2$ on $[0,1]$. The code is here, and the output looks like this:
Finally, here is a systematic "construction" of the Lebesgue integral for an inverted parabola. It shows that the critical step is to partition the range ($y$ axis) into $N$ equally spaced divisions between the $y$ limits of integration, defining intervals $[y_i,y_{i+1}]$, and then to look for the measure of the pre-image on the $x$ axis. That measure is the difference between the inverse function at a chosen point $y^*_i$ (a point, not a slab) of the interval $[y_i, y_{i+1}]$, namely $f^{-1}(y^*_i)$, and the inverse function at the next point on the $y$ axis, $f^{-1}(y^*_{i+1})$. Analytically, this "strategy" ultimately relies on letting $N\rightarrow\infty$, which explains the "pyramid" iconography so often found online: it is intuitive, but it can suggest that the integral is a pile of horizontal slabs. This was a big hurdle for me in understanding the difference between the Lebesgue and the Darboux integrals. Here is an attempt at a more faithful dynamic representation (again the code is here waiting for improvements):
When we extend the number of partitions to $100$ (a humble computer version of $\infty$), the calculated area is $1.322926$, really close to the analytical value $4/3$, which is also the value of the Riemann integral $2\int_0^1 (-x^2 +1)\, dx=4/3.$
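A minimal sketch of this range-partitioning idea follows. It is not the linked code: it assumes the parabola is $f(x) = 1 - x^2$ on $[-1,1]$ (inferred from the Riemann integral quoted above), it uses the lower endpoint $y_i$ of each interval as the height, and the function name `lebesgue_lower_sum` is made up for the example.

```python
import math

def level_set_measure(y):
    # For f(x) = 1 - x**2 on [-1, 1], the set {x : f(x) >= y} is the interval
    # [-sqrt(1 - y), sqrt(1 - y)], whose length is 2 * sqrt(1 - y).
    return 2.0 * math.sqrt(max(0.0, 1.0 - y))

def lebesgue_lower_sum(n_levels):
    """Sum of y_i * measure(preimage of [y_i, y_{i+1})) over a uniform partition of the range [0, 1]."""
    total = 0.0
    for i in range(n_levels):
        y_lo = i / n_levels
        y_hi = (i + 1) / n_levels
        # Measure of {x : y_lo <= f(x) < y_hi}, obtained as a difference of level sets.
        preimage_measure = level_set_measure(y_lo) - level_set_measure(y_hi)
        total += y_lo * preimage_measure
    return total

for n_levels in (10, 100, 1000):
    print(n_levels, lebesgue_lower_sum(n_levels))
```

Because the lower endpoint is used as the height, every sum stays below $4/3$ and increases toward it as the range is refined. A different choice of $y^*_i$ within each interval would give slightly different numbers.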
First, it is a sufficient condition.
As long as $f$ is bounded on $[0,1]$, the upper and lower sums corresponding to arbitrary partitions are bounded. Let $\mathcal{P}$ denote the set of all partitions of $[0,1].$ Consequently, the sets $\{L(P,f): P \in \mathcal{P}\}$ and $\{U(P,f): P \in \mathcal{P}\}$ are bounded, and this guarantees the existence of
$$\underline{\int}_0^1 f(x) \, dx = \sup_{P \in \mathcal{P}}\, L(P,f), \qquad \overline{\int}_0^1 f(x) \, dx = \inf_{P \in \mathcal{P}}\, U(P,f) , $$
which are called the lower and upper integrals.
Given any regular partition $D_n$ we have
$$L(D_n,f) \leqslant \underline{\int}_0^1 f(x) \, dx \leqslant \overline{\int}_0^1 f(x) \, dx \leqslant U(D_n,f).$$
The central inequality follows because for any partitions $P$ and $Q$ we have $L(P,f) \leqslant U(Q,f)$ (take a common refinement of the partitions to show this) and, thus $\sup_{P \in \mathcal{P}} \,L(P,f) \leqslant \inf_{Q \in \mathcal{P}} \,U(Q,f)$.
Hence,
$$0 \leqslant \overline{\int}_0^1 f(x) \, dx - \underline{\int}_0^1 f(x) \, dx \leqslant U(D_n,f) - L(D_n,f).$$
The right-hand side converges to $0$ as $n \to \infty$, by hypothesis, which implies that $f$ is integrable since we must have
$$\underline{\int}_0^1 f(x) \, dx = \overline{\int}_0^1 f(x) \, dx, $$
where the common value of lower and upper integrals is by definition the value of the integral.
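As a numerical illustration of the criterion $U(D_n,f) - L(D_n,f) \to 0$, here is a small sketch for an increasing function, where the infimum and supremum on each subinterval sit at the left and right endpoints. The choice $f(x) = x^2$ and the name `darboux_sums` are assumptions made for the example.

```python
def darboux_sums(f, n):
    """Lower and upper Darboux sums of an increasing f on the regular partition D_n of [0, 1]."""
    width = 1.0 / n
    # For increasing f, inf on [j/n, (j+1)/n] is f(j/n) and sup is f((j+1)/n).
    lower = sum(f(j / n) for j in range(n)) * width
    upper = sum(f((j + 1) / n) for j in range(n)) * width
    return lower, upper

f = lambda x: x * x
for n in (10, 100, 1000):
    lower, upper = darboux_sums(f, n)
    # For increasing f the gap telescopes to (f(1) - f(0)) / n.
    print(n, lower, upper, upper - lower)
```

The gap shrinks like $1/n$, squeezing the lower and upper integrals together around $1/3$.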
To show it is a necessary condition, consider
$$\left|U(D_n,f) - L(D_n,f) \right| \leqslant \left|U(D_n,f) - \int_0^1 f(x) \, dx \right| + \left|L(D_n,f) - \int_0^1 f(x) \, dx \right|.$$
The two terms on the RHS go to zero as $n \to \infty$. This is a consequence of the equivalent condition for integrability where for arbitrary Riemann sums corresponding to tagged partitions we have
$$\tag{*}\int_0^1 f(x) \, dx = \lim_{\|P\| \to 0} S(P,f).$$
Here $\|P\| = \max_{1 \leqslant j \leqslant n} (x_j - x_{j-1})$ is the norm of the partition $P = (x_0,x_1, \ldots, x_n)$ and, clearly, $\|D_n\| \to 0$ if and only if $n \to \infty$.
It takes a bit of effort to prove the equivalence of $(*)$ to the definition of the Riemann integral in terms of partition refinement or the Darboux approach. It has been shown a number of times on this site including here.
Best Answer
Perhaps you already know most of this, but here are some things to consider.
There is only one definition of Riemann integrability, and it must be very restrictive for it to work. I am not talking about improper integrals here. On the other hand, an effective notion of Lebesgue integrability can be defined hierarchically as these restrictive conditions are weakened.
Start with sets of finite measure $E \subset \mathbb{R}$ and bounded functions $f:E \to \mathbb{R}$.
(1) Strictly speaking the Riemann integral is defined for functions on a closed and bounded interval $[a,b]$. Also, it is necessary for the function to be bounded to meet the requirement that there exists $I \in \mathbb{R}$ such that for any $\epsilon > 0$ there exists a partition $P_\epsilon$ of $[a,b]$ such that for any partition $P$ that is a refinement of $P_\epsilon$ and any Riemann sum $S(P,f)$, we have $|S(P,f) - I| < \epsilon$. That $f$ must be bounded is not just an arbitrary part of the definition.
It is, of course, possible to extend the definition to open intervals or even general subsets $E$ of finite measure with $\int_E f$ defined as $\int_a^b f(x) \chi_E(x) \, dx$. Nevertheless, the definition of Riemann integrability can only be met when the measure of the boundary $\partial E$ is $0$, and this is related to the notion of Jordan measurability.
Clearly, there are bounded functions defined on sets of finite measure that are not Riemann integrable -- as with the Dirichlet function you mention -- and this is entirely due to "too much" discontinuity.
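To make the Dirichlet example concrete, take it on $[0,1]$ (an assumption, since the interval is not fixed here). Every subinterval of any partition $P = (x_0, x_1, \ldots, x_n)$ contains both rational and irrational points, so the infimum of $f$ on it is $0$ and the supremum is $1$:

$$f = \chi_{\mathbb{Q}\cap[0,1]}, \qquad L(P,f) = \sum_{j=1}^n 0 \cdot (x_j - x_{j-1}) = 0, \qquad U(P,f) = \sum_{j=1}^n 1 \cdot (x_j - x_{j-1}) = 1.$$

The lower and upper Riemann (Darboux) integrals are therefore $0$ and $1$, and they can never agree, no matter how fine the partition.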
(2) Again for bounded functions on sets of finite measure, there always exist lower and upper Lebesgue integrals
$$\underline{\int}_E f = \sup_{\phi \leqslant f} \int_E \phi, \quad \overline{\int}_E f = \inf_{\psi \geqslant f} \int_E \psi,$$
where $\phi$ and $\psi$ are simple functions, and we must have
$$\underline{\int}_E f\leqslant \overline{\int}_E f $$
The most basic definition in this restrictive case is that $f$ is "Lebesgue integrable" on $E$ if
$$\underline{\int}_E f = \overline{\int}_E f$$
There are two important theorems for bounded functions on finite measure sets.
Theorem 1: If a function is Riemann integrable then it is Lebesgue integrable.
Theorem 2: A function is Lebesgue integrable if and only if it is measurable.
An important consequence of Theorem 1 is that the class of Lebesgue integrable functions includes the class of Riemann integrable functions.
An important consequence of Theorem 2 is that, similar to the Riemann integral, there exist bounded functions defined on a set of finite measure that are not Lebesgue integrable. To see this, take a non-measurable subset $A \subset [0,1]$ (a Vitali set, for example) and consider the function $\chi_A$ on $[0,1]$.
You do raise an interesting question of why the Lebesgue integral is less impacted by the extent of discontinuity as long as we have measurability.
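The Dirichlet function on $[0,1]$ gives a quick illustration of this. It is discontinuous everywhere, yet it is itself a simple function, so we may take $\phi = \psi = f$ in the lower and upper Lebesgue integrals. Writing $m$ for Lebesgue measure (notation assumed here) and using that a countable set has measure zero:

$$\int_{[0,1]} \chi_{\mathbb{Q}\cap[0,1]} = 1 \cdot m\big(\mathbb{Q}\cap[0,1]\big) + 0 \cdot m\big([0,1]\setminus\mathbb{Q}\big) = 1 \cdot 0 + 0 \cdot 1 = 0.$$

Only the measure of the sets where each value is taken matters, not how those sets are interleaved along the $x$ axis.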
Next consider sets of infinite measure $E \subset \mathbb{R}$ and/or unbounded functions $f:E \to \mathbb{R}$.
Here we cannot even speak of Riemann integrals, yet the Lebesgue integral can be extended. First, we extend to nonnegative functions where the Lebesgue integral can be defined using the previous definition as the supremum of $\int_E g$ over all nonnegative, bounded, measurable functions $g \leqslant f$ with compact support in $E$. In this case the integral may take the value $+\infty$, so satisfaction of this definition alone does not mean that $f$ is Lebesgue integrable. For nonnegative $f$ to be Lebesgue integrable we must have $\int_E f < +\infty$.
The reason for this definition of Lebesgue integrability is to make it possible to extend the definition of the integral further to include general functions. In this case, we consider positive and negative parts $f^+$ and $f^-$ (which are themselves nonnegative functions) and define the Lebesgue integral as
$$\tag{*}\int_E f = \int_Ef^+ - \int_E f^-$$
Since $+\infty - (+\infty)$ cannot be defined in a meaningful way, this explains why Lebesgue integrability of a nonnegative function stipulates that the Lebesgue integral is finite. Otherwise, (*) is not well defined. In this way, Lebesgue integrability of a general function $f$ implies that we also have
$$\int_E|f| = \int_Ef^+ + \int_E f^- < +\infty$$
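For reference, the positive and negative parts used above are the standard ones, $f^+ = \max(f,0)$ and $f^- = \max(-f,0)$, so that $f = f^+ - f^-$ and $|f| = f^+ + f^-$. As a small worked example (chosen here only for illustration), take $f(x) = x$ on $E = [-1,1]$:

$$\int_E f^+ = \int_0^1 x \, dx = \frac{1}{2}, \qquad \int_E f^- = \int_{-1}^0 (-x) \, dx = \frac{1}{2}, \qquad \int_E f = \frac{1}{2} - \frac{1}{2} = 0, \qquad \int_E |f| = \frac{1}{2} + \frac{1}{2} = 1.$$

Both parts are finite, so the difference is well defined and $f$ is Lebesgue integrable.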
Improper Riemann Integrals
In your question, you cite functions like $x \mapsto 1/x$ on $(0,1]$ and $x \mapsto 1/\sqrt{x}$ on $[1, \infty)$ as examples where the Lebesgue integral "fails". Needless to say, these functions are not Riemann integrable, but we can say that they have well-defined Lebesgue integrals
$$\int_{(0,1]} \frac{1}{x} = +\infty , \quad \int_{[1,\infty)} \frac{1}{\sqrt{x}} = +\infty$$
We just cannot say these functions are Lebesgue integrable as explained above.
Some of the deficiencies of the Riemann integral can be corrected by introducing the improper Riemann integral. We can even find examples where a function is improperly Riemann integrable but not Lebesgue integrable. Perhaps that should be considered as well in assessing the relative merits of Riemann and Lebesgue integration.
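No specific example is named above. A standard one is $x \mapsto \sin(x)/x$ on $[\pi, \infty)$, chosen here as an assumption for illustration. The sketch below uses a plain midpoint rule (the helper names are made up) to show that the signed integral settles down while the integral of the absolute value keeps growing, roughly like a harmonic series.

```python
import math

def midpoint_integral(g, a, b, steps):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def sinc_integrals(periods, steps_per_period=200):
    """Approximate the integrals of sin(x)/x and |sin(x)/x| over [pi, (periods + 1) * pi]."""
    signed = 0.0
    absolute = 0.0
    for k in range(1, periods + 1):
        a, b = k * math.pi, (k + 1) * math.pi
        piece = midpoint_integral(lambda x: math.sin(x) / x, a, b, steps_per_period)
        signed += piece
        # sin keeps a single sign on [k*pi, (k+1)*pi], so |piece| is the
        # integral of |sin(x)/x| over that interval.
        absolute += abs(piece)
    return signed, absolute

for periods in (10, 100, 1000):
    signed, absolute = sinc_integrals(periods)
    print(periods, signed, absolute)
```

The alternating pieces cancel, so the improper Riemann integral exists, but $\int |f| = +\infty$, so the function is not Lebesgue integrable in the sense described above.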