(1) Yes, the point spectrum is countable in your hypotheses: otherwise the operator would have an uncountable set of pairwise orthogonal vectors since eigenvectors of a self-adjoint operator with different eigenvalues are orthogonal.
This is impossible because, in every Hilbert space, every set of (normalized) orthogonal vectors can be completed into a Hilbert basis by Zorn's lemma and every Hilbert basis is countable if the space is separable.
(2) No, it is not necessarily true that the continuous spectrum is uncountable. You may have one only point in the continuous spectrum for instance. This is the case for the spectrum of the self-adjoint operator $1/H$, where $H$ is the Hamiltonian of the harmonic oscillator. The only point of the continuous spectrum is $0$.
COMMENT. However I do not believe that discrete is a really appropriate adjective for the point spectrum. My impression is that your idea of discrete involves the fact that the points are isolated. This is not the case for the point spectrum in general also if the Hilbert space is separable. You may have a point spectrum coinciding with rational numbers, which are dense in $\mathbb R$ as well known.
Indeed, there are other decompositions of the spectrum. Within a certain approach it is defined the so-called discrete spectrum as the part of point spectrum made of isolated eigenvalues whose eigenspaces are finite-dimensional.
If the Hilbert space is not separable, it is even possible to construct a self-adjoint operator whose point spectrum is the whole $\mathbb R$.
COMMENT 2. It is not necessary to introduce the notion of rigged Hilbert space to define notions of approximated eigenvalues and eigenvectors. Given a self-adjoint operator $A:D(A)\to H$ in the Hilbert space $H$, it is possible to prove that $\lambda \in \sigma_c(A)$ if and only if $\lambda$ is not an eigenvalue (in proper sense) and, for every $\epsilon>0$ there is $\psi_\epsilon \in D(A)$ with $||\psi_\epsilon||=1$ such that $||A\psi_\epsilon - \lambda \psi_\epsilon|| < \epsilon$.
The problem with the raised issue is that it is mainly formulated in the non-rigorous (but very effective) jargon of physicists. However, it can be re-formulated into a more mathematically precise version and the answer is straightforward.
First of all, if $A : D(A) \to H$ is selfadjoint, its spectum $\sigma(A)$ is the disjoint union of the continuous part of the spectrum $\sigma_c(A)$ and the point spectrum $\sigma_p(A)$. The latter is the set of eigenvectors.
Generally speaking, if $P: {\cal B}(\mathbb R) \to \mathfrak{B}(H)$ is a projection-valued measure defined on the Borel sets of the real line and $A,B$ are Borel set with $A \cap B =\emptyset$, we have $P_AP_B=0$ and
$$P_{A \cup B}= P_A + P_B\:.$$
Specializing to the case of $A= \sigma_p(A)$ and $B= \sigma_c(A)$ and $P=P^{(A)}$, the spectral measure of $A$, we conclude that:
$$P_{\sigma(A)}= P_{\sigma_p(A)} + P_{\sigma_c(A)}\:.$$
This identity propagates immediately to the spectral calculus giving rise to
$$A = \int_{\sigma_p(A)} \lambda dP^{(A)}(\lambda) + \int_{\sigma_c(A)} \lambda dP^{(A)}(\lambda)$$ where ($s-$ denoting the convergence in the strong topology)
$$\int_{\sigma_p(A)} \lambda dP^{(A)}(\lambda) = s-\sum_{u\in N} \lambda_u |u\rangle \langle u|$$ and $N$ is any Hilbert basis of the closed subspace $P^{(A)}_{\sigma_p(A)}(H)$ (viewed as a Hilbert space in its own right) made of eigenvectors $u$ with eigenvalues $\lambda_u$ (we may have $\lambda_u=\lambda_{u'}$ here).
It does not matter if $\lambda \in \sigma_p(A)$ is embedded in $\sigma_c(A)$, the above formulas are however valid. Everything is true when considering every Borel-measurable function $f$ of $A$:
$$f(A) = \int_{\sigma_p(A)} f(\lambda) dP^{(A)}(\lambda) + \int_{\sigma_c(A)} f(\lambda) dP^{(A)}(\lambda)$$ where
$$\int_{\sigma_p(A)} f(\lambda) dP^{(A)}(\lambda) = s-\sum_{u\in N} f(\lambda_u) |u\rangle \langle u|\:,$$
with the special case
$$I = \int_{\sigma_p(A)} dP^{(A)}(\lambda) + \int_{\sigma_c(A)} dP^{(A)}(\lambda)$$ where
$$\int_{\sigma_p(A)}dP^{(A)}(\lambda) = s-\sum_{u\in N} |u\rangle \langle u|$$
A mathematical remark helps understand why embedded eigenvalues do not create problems. The following facts are valid for $\lambda \in \mathbb{R}$.
(a) $\lambda \in \sigma_p(A)$ if and only if $P^{(A)}_{\{\lambda\}} \neq 0$,
(b) $\lambda \in \sigma_c(A)$ if and only if $P^{(A)}_{\{\lambda\}}= 0$, but
$P^{(A)}_{(\lambda-\delta, \lambda+\delta)}\neq 0$ for $\delta>0$
You see that the spectral measure $P^{(A)}$ cannot see the single points of $\sigma_c(A)$. And, if it sees a point, that point stays in $\sigma_p(A)$.
Furthermore, independently of the fact that some eigenvalues may be embedded in the continuous spectrum, $P^{(A)}_{\sigma_c(A)}(H)$ and
$P^{(A)}_{\sigma_p(A)}(H)$ are orthogonal subspaces of $H$.
ADDENDUM
A projector-valued measure (PVM) is a map $P : \Sigma(X) \to \mathfrak{B}(H)$ associating the sets $E$ of the $\sigma$-algebra $\Sigma(X)$ over $X$ to orthogonal projectors $P_E: H \to H$ on the Hilbert space $H$ such that
(1) $P(X)=I$,
(2) $P_EP_F = P_{E\cap F}$ if $E,F \in \Sigma(X)$,
(3) $P_{\cup_{j\in N} E_j}x = \sum_{n\in N}P_{E_n}x$ for every $x\in H$ and any family $\{E_n\}_{n\in N} \subset \Sigma(X)$ where $N \subset \mathbb{N}$ and $E_n \cap E_m = \emptyset$ if $n\neq m$.
(If $N$ is infinite, the sum refers to the topology of $H$.)
Given a PVM $P: \Sigma(X) \to \mathfrak{B}(H)$ and a pair of vectors $x,y\in H$, the map
$$\mu^{(P)}_{xy} : \Sigma(X) \ni E \mapsto \langle x| P_Ey\rangle \in \mathbb{C}$$
turns out to be a complex measure (with finite total variation). If $f:X \to \mathbb{C}$ is measurable, there exists an operator denoted by $$\int_X f dP : \Delta_f \to H$$ which is the unique operator such that
$$\left\langle x \left| \int_X f dP y \right. \right\rangle
= \int_X f(\lambda) d\mu^{(P)}_{xy}(\lambda)\quad x \in H\:, y \in \Delta_f\tag{1}$$
where the domain $\Delta_f$ is the dense linear subspace of $H$
$$\Delta_f := \left\{ y \in H \left| \int_X |f(\lambda)|^2 d\mu^{(P)}_{yy}(\lambda) < +\infty\right.\right\}\:.$$
With this definition, which is valid for every choice of $X$ and the $\sigma$-algebra $\Sigma(X)$, the spectral theorem for selfadjoint operators reads
THEOREM
If $A : D(A) \to H$ is a selfadjoint operator with domain $D(A) \subset H$, there exists a unique PVM $P^{(A)} : {\cal B}(\mathbb{R}) \to \mathfrak{B}(H)$, where ${\cal B}(\mathbb{R})$ is the Borel $\sigma$-algebra on $\mathbb{R}$, such that
$$\int_{\mathbb R} \lambda dP^{(A)}(\lambda) = A\:.$$
More precisely, the support of $P^{(A)}$ (i.e., the complement of the largest open set $O \subset \mathbb{R}$ with $P_O=0$) is exactly $\sigma(A)$.
Remarks
(1) With this result it is natural to define $$f(A) := \int_{\mathbb R} f(\lambda) dP^{(A)}(\lambda)=: \int_{\sigma(A)} f(\lambda) dP^{(A)}(\lambda)\:,$$
for every complex-valued Borel-measurable map defined on $\mathbb{R}$ (or more weakly on $\sigma(A)$).
(2)The above spectral theorem is still valid if replacing $A$ for a closed normal operator (in particular unitary operators). In that case $X= \mathbb{C}$. In the selfadjoint case $X = \mathbb{R}$ and, in this special case, the definition of a PVM can be given in terms of Stiltjes integrals, but this option is very limitative and for that reason it is rarely used nowadays.
(3) The notation $P(d\lambda)$ in place of $dP(\lambda)$ is a bit misleading (though the relevant object is the operator in the left hand side of Eq.(1) and not $dP$) since it refers to an apparently given measure $d\lambda$. But there is no such special choice in general. The standard measures arise once you give the vectors $x$ and $y$ and not before. In general there is no relation between $P^{(A)}$ and the Lebesgue measure $d\lambda$ on $\mathbb{R}$. Even if some forced (and sometime useful) relation can be found when comparing, e.g. $\langle x| P^{(A)}_E y \rangle$ and the Lebesgue measure $\int_E d\lambda$ and taking advantage of Lebesgue's theorem of decomposition of measures. This use leads to a further decomposition of the spectrum into absolutely continuous, singular continuous, and pure point parts. This decomposition, though useful in several contexts, is different from the original one into continuous, point, and residual which does not exploit the Lebesgue measure in any sense and that I used in my answer.
Unfortunately the sloppy language of physicists (I stress that I am a physicist!) does not permit to always appreciate some subtleties of these different decompositions and leads to misunderstandings and sometimes unmotivated issues. For instance the notions of discrete spectrum and point spectrum are different though they are often used as synonyms in the physical literature; the same happens for absolutely continuous spectrum and continuous spectrum.
Best Answer
The two types of spectra, $\sigma_c$ and $\sigma_p$, are disjont; so that the first statement you quoted is mathematically speaking wrong: $\sigma_c(A) \cap \sigma_p(A)=\emptyset$. But the book seems to be not very mathematically minded so a bit vague statements are definitely tolerable! Furthermore, one should take the fact into account that each element of the continuous spectrum has zero (spectral) measure (see (3) in the proposition below).
The second decomposition you quote is correct. However the discrete spectrum is made of isolated points whose eignespace is finite-dimensional. This second type of decomposition refers to the decomposition of the spectrum into discrete (the one said above) and and essential spectrum (its complement). A third decomposition you mention is the one into absolutely continuous spectrum and singular spectrum.
Notice that isolated points of the spectrum are always part of the point spectrum, but the vice versa is false. I do not understand well your remark about issues with integration for points in the discrete spectrum: see the last item of the following proposition where $ {\cal B}(\mathbb{R})$ is the $\sigma$-algebra of Borel sets in $\mathbb{R}$.
Proposition. Let $A: D(A) \to H$ be a selfadjoint operator and let $P: {\cal B}(\mathbb{R}) \to \mathfrak{B}(H)$ be the projection-valued measure (also known as spectral measure) of $A$ so that $$A = \int_{\mathbb{R}} \lambda dP(\lambda)\:.$$ The following facts are true for $\lambda \in \mathbb{R}$.
(1) The support of $P$ (i.e., the complement of the largest open set $A\subset \mathbb{R}$ such that $P_A=0$) is $\sigma(A)$.
(2) $\lambda \in \sigma_p(A)$ if and only if $P_{\{\lambda\}} \neq 0$. In that case $P_{\{\lambda\}}$ is the orthogonal projector onto the $\lambda$-eigenspace of $A$.
(3) $\lambda \in \sigma_c(A)$ if and only if both conditions are true: (a) $P_{\{\lambda\}}=0$ and (b) $P_{(a,b)}\neq 0$ if $a<b$ and $\lambda \in (a,b)$.
(4) If $\lambda$ is an isolated point of $\sigma(A)$ then necessarily $\lambda \in \sigma_p(A)$ and thus $P_{\{\lambda \}}\neq 0$.
(As references I could quote my books on spectral theory and QM books on spectral theory or also the more recent one.)
On account of the proposition, we can re-arrange the integral as $$A = \sum_{\lambda \in \sigma_p(A)} P_{\{\lambda\}} + \int_{\sigma_c(A)} \lambda dP(\lambda)\:.$$