(1) Yes, the point spectrum is countable in your hypotheses: otherwise the operator would have an **uncountable set of pairwise orthogonal vectors** since eigenvectors of a self-adjoint operator with different eigenvalues are orthogonal.
This is impossible because, in every Hilbert space, every set of (normalized) orthogonal vectors can be completed into a Hilbert basis by Zorn's lemma and every Hilbert basis is countable if the space is separable.

(2) No, it is not necessarily true that the continuous spectrum is uncountable. You may have one only point in the continuous spectrum for instance. This is the case for the spectrum of the self-adjoint operator $1/H$, where $H$ is the Hamiltonian of the harmonic oscillator. The only point of the continuous spectrum is $0$.

**COMMENT**. However I do not believe that *discrete* is a really appropriate adjective for the point spectrum. My impression is that your idea of *discrete* involves the fact that the points are isolated. This is not the case for the point spectrum in general also if the Hilbert space is separable. You may have a point spectrum coinciding with rational numbers, which are dense in $\mathbb R$ as well known.

Indeed, there are other decompositions of the spectrum. Within a certain approach it is defined the so-called **discrete spectrum** as the part of point spectrum made of isolated eigenvalues whose eigenspaces are finite-dimensional.

If the Hilbert space is not separable, it is even possible to construct a self-adjoint operator whose **point spectrum** is the whole $\mathbb R$.

**COMMENT 2**. It is not necessary to introduce the notion of *rigged Hilbert space* to define notions of approximated eigenvalues and eigenvectors. Given a self-adjoint operator $A:D(A)\to H$ in the Hilbert space $H$, it is possible to prove that $\lambda \in \sigma_c(A)$ if and only if $\lambda$ is not an eigenvalue (in proper sense) and, for every $\epsilon>0$ there is $\psi_\epsilon \in D(A)$ with $||\psi_\epsilon||=1$ such that $||A\psi_\epsilon - \lambda \psi_\epsilon|| < \epsilon$.

The problem with the raised issue is that it is mainly formulated in the non-rigorous (but very effective) jargon of physicists. However, it can be re-formulated into a more mathematically precise version and the answer is straightforward.

First of all, if $A : D(A) \to H$ is selfadjoint, its spectum $\sigma(A)$ is the **disjoint** union of the continuous part of the spectrum $\sigma_c(A)$ and the point spectrum $\sigma_p(A)$. The latter is the set of eigenvectors.

Generally speaking, if $P: {\cal B}(\mathbb R) \to \mathfrak{B}(H)$ is a projection-valued measure defined on the Borel sets of the real line and $A,B$ are Borel set with $A \cap B =\emptyset$, we have $P_AP_B=0$ and
$$P_{A \cup B}= P_A + P_B\:.$$

Specializing to the case of $A= \sigma_p(A)$ and $B= \sigma_c(A)$ and $P=P^{(A)}$, the spectral measure of $A$, we conclude that:
$$P_{\sigma(A)}= P_{\sigma_p(A)} + P_{\sigma_c(A)}\:.$$
This identity propagates immediately to the spectral calculus giving rise to
$$A = \int_{\sigma_p(A)} \lambda dP^{(A)}(\lambda) + \int_{\sigma_c(A)} \lambda dP^{(A)}(\lambda)$$ where ($s-$ denoting the convergence in the strong topology)
$$\int_{\sigma_p(A)} \lambda dP^{(A)}(\lambda) = s-\sum_{u\in N} \lambda_u |u\rangle \langle u|$$ and $N$ is any Hilbert basis of the closed subspace $P^{(A)}_{\sigma_p(A)}(H)$ (viewed as a Hilbert space in its own right) made of eigenvectors $u$ with eigenvalues $\lambda_u$ (we may have $\lambda_u=\lambda_{u'}$ here).
It does not matter if $\lambda \in \sigma_p(A)$ is embedded in $\sigma_c(A)$, the above formulas are however valid. Everything is true when considering every Borel-measurable function $f$ of $A$:
$$f(A) = \int_{\sigma_p(A)} f(\lambda) dP^{(A)}(\lambda) + \int_{\sigma_c(A)} f(\lambda) dP^{(A)}(\lambda)$$ where
$$\int_{\sigma_p(A)} f(\lambda) dP^{(A)}(\lambda) = s-\sum_{u\in N} f(\lambda_u) |u\rangle \langle u|\:,$$
with the special case
$$I = \int_{\sigma_p(A)} dP^{(A)}(\lambda) + \int_{\sigma_c(A)} dP^{(A)}(\lambda)$$ where
$$\int_{\sigma_p(A)}dP^{(A)}(\lambda) = s-\sum_{u\in N} |u\rangle \langle u|$$

A mathematical remark helps understand why embedded eigenvalues do not create problems. The following facts are valid for $\lambda \in \mathbb{R}$.

(a) $\lambda \in \sigma_p(A)$ if and only if $P^{(A)}_{\{\lambda\}} \neq 0$,

(b) $\lambda \in \sigma_c(A)$ if and only if $P^{(A)}_{\{\lambda\}}= 0$, but
$P^{(A)}_{(\lambda-\delta, \lambda+\delta)}\neq 0$ for $\delta>0$

You see that the spectral measure $P^{(A)}$ **cannot see the single points of $\sigma_c(A)$. And, if it sees a point, that point stays in $\sigma_p(A)$.**

Furthermore, independently of the fact that some eigenvalues may be embedded in the continuous spectrum, $P^{(A)}_{\sigma_c(A)}(H)$ and
$P^{(A)}_{\sigma_p(A)}(H)$ are orthogonal subspaces of $H$.

**ADDENDUM**

A **projector-valued measure** (PVM) is a map $P : \Sigma(X) \to \mathfrak{B}(H)$ associating the sets $E$ of the $\sigma$-algebra $\Sigma(X)$ over $X$ to orthogonal projectors $P_E: H \to H$ on the Hilbert space $H$ such that

(1) $P(X)=I$,

(2) $P_EP_F = P_{E\cap F}$ if $E,F \in \Sigma(X)$,

(3) $P_{\cup_{j\in N} E_j}x = \sum_{n\in N}P_{E_n}x$ for every $x\in H$ and any family $\{E_n\}_{n\in N} \subset \Sigma(X)$ where $N \subset \mathbb{N}$ and $E_n \cap E_m = \emptyset$ if $n\neq m$.
(If $N$ is infinite, the sum refers to the topology of $H$.)

Given a PVM $P: \Sigma(X) \to \mathfrak{B}(H)$ and a pair of vectors $x,y\in H$, the map
$$\mu^{(P)}_{xy} : \Sigma(X) \ni E \mapsto \langle x| P_Ey\rangle \in \mathbb{C}$$
turns out to be a complex measure (with finite total variation). If $f:X \to \mathbb{C}$ is measurable, there exists an operator denoted by $$\int_X f dP : \Delta_f \to H$$ which is the unique operator such that
$$\left\langle x \left| \int_X f dP y \right. \right\rangle
= \int_X f(\lambda) d\mu^{(P)}_{xy}(\lambda)\quad x \in H\:, y \in \Delta_f\tag{1}$$
where the domain $\Delta_f$ is the dense linear subspace of $H$
$$\Delta_f := \left\{ y \in H \left| \int_X |f(\lambda)|^2 d\mu^{(P)}_{yy}(\lambda) < +\infty\right.\right\}\:.$$
With this definition, which is valid for every choice of $X$ and the $\sigma$-algebra $\Sigma(X)$, the spectral theorem for selfadjoint operators reads

**THEOREM**
*If $A : D(A) \to H$ is a selfadjoint operator with domain $D(A) \subset H$, there exists a unique PVM $P^{(A)} : {\cal B}(\mathbb{R}) \to \mathfrak{B}(H)$, where ${\cal B}(\mathbb{R})$ is the Borel $\sigma$-algebra on $\mathbb{R}$, such that
$$\int_{\mathbb R} \lambda dP^{(A)}(\lambda) = A\:.$$
More precisely, the support of $P^{(A)}$ (i.e., the complement of the largest open set $O \subset \mathbb{R}$ with $P_O=0$) is exactly $\sigma(A)$.*

**Remarks**

(1) With this result it is natural to define $$f(A) := \int_{\mathbb R} f(\lambda) dP^{(A)}(\lambda)=: \int_{\sigma(A)} f(\lambda) dP^{(A)}(\lambda)\:,$$
for every complex-valued Borel-measurable map defined on $\mathbb{R}$ (or more weakly on $\sigma(A)$).

(2)The above spectral theorem is still valid if replacing $A$ for a closed normal operator (in particular unitary operators). In that case $X= \mathbb{C}$. In the selfadjoint case $X = \mathbb{R}$ and, in this special case, the definition of a PVM can be given in terms of Stiltjes integrals, but this option is very limitative and for that reason it is rarely used nowadays.

(3) The notation $P(d\lambda)$ in place of $dP(\lambda)$ is a bit misleading (though the relevant object is the operator in the left hand side of Eq.(1) and not $dP$) since it refers to an apparently given measure $d\lambda$. But there is no such special choice in general. The standard measures arise once you give the vectors $x$ and $y$ and not before. In general there is no relation between $P^{(A)}$ and the Lebesgue measure $d\lambda$ on $\mathbb{R}$. Even if some **forced** (and sometime useful) relation can be found when comparing, e.g. $\langle x| P^{(A)}_E y \rangle$ and the Lebesgue measure $\int_E d\lambda$ and taking advantage of Lebesgue's theorem of decomposition of measures. This use leads to a further decomposition of the spectrum into absolutely continuous, singular continuous, and pure point parts. This decomposition, though useful in several contexts, is different from the original one into continuous, point, and residual which does not exploit the Lebesgue measure in any sense and that I used in my answer.
Unfortunately the sloppy language of physicists (I stress that *I am* a physicist!) does not permit to always appreciate some subtleties of these different decompositions and leads to misunderstandings and sometimes unmotivated issues. For instance the notions of *discrete spectrum* and *point spectrum* are *different* though they are often used as synonyms in the physical literature; the same happens for *absolutely continuous spectrum* and *continuous spectrum*.

## Best Answer

Sufficient conditions for a discrete energy spectrum:

Sturm-Louville theoryimplies that the spectrum is purely discrete provided that the system is restricted to a finite interval [a,b] (with appropriate boundary conditions). I assume (but don't know) this can be extended to the case of a system restricted to finite volume in higher dimensions.The next two conditions I found in this paper:

Cwickel-Lieb-Rosenbljum boundimplies that the spectrum will be discrete for energies for which the classical phase-space volume of $\{(\vec{x},\vec{p})|\vec{p}^2+V(\vec{x})<E\}$ is finite.Golden-Thompson inequalitythe spectrum will be purely discrete in any system where the classical partition function is finite (at some value of temperature)Conditions for the spectrum to have a continuous part or to be purely continuous seem a lot more complicated. The main purpose of the paper cited above was to provide a counter-example to the 'rule-of-thumb' I gave in the question. Namely, it shows that the Hamiltonian $$ H = - \frac{\partial^2}{\partial x^2} -\frac{\partial^2}{\partial y^2} + x^2 y^2 $$ has a purely discrete spectrum even though the volume of $\{\vec{x}|V(\vec{x})<E \}$ is infinite for any $E>0$.

A reference suggested by yuggib (thanks!), and references therein give some more general criteria for continuous spectra. It looks like a difficult problem and research on classes of Hamiltonian with continous spectra seems to be ongoing. Here are a few results:

Consider the Hamiltonian $$ H = -\nabla^2 +V(\vec{x}) $$ and suppose that $|V(\vec{x})|<C(1+|\vec{x}|)^{-\alpha}$ for some constants $C$ and $\alpha>0$ (i.e. the potential decays to zero at infinity). Then:

Wigner von-Neumannpotentials (e.g. $V(r) \sim \frac{\sin 2 r}{r}$ which has a discrete Eigenvalue at $E=1$).