Indeed, the quoted explanation is incorrect: the singularity of $y''$ would be of higher order (a distribution of order $1$, to be precise), and there is no such thing in the equation.
The question "why must the solution be continuous?" is always tied to the question "what do we mean by a solution?" For that matter, what does the equation $y''-4y=\delta (x-a)$ mean? What is $y''$ here? The equation cannot be understood in terms of classical derivatives (limits of difference quotients). If $y''$ is the classical second derivative of $y$, then it is a function, and after adding $-4y$ it is still a function and not a point mass.
One possible answer is: $y$ is a distribution such that $y''$ is a Radon measure, and the sum of the two measures $y''$ and $-4y(t)\,dt$ is the point mass at $a$. But if $y''$ is a Radon measure, then $y'$ is a function of bounded variation on every bounded interval, hence bounded there. It follows that $y$ is a Lipschitz function on every bounded interval, and in particular it is continuous.
Update. One possible answer is: $y$ must be continuous because we say so. It's up to us to decide what we accept as a solution to any problem. For this problem, we decide that $y$ is a solution if it is continuous everywhere, has derivatives $y',y''$ when $x\ne a$, and these derivatives satisfy the equation. If no such continuous $y$ existed, we would then say that the equation has no solution.
I do not have a satisfactory explanation in terms that you were given. The following is just an attempt to connect the story somehow. Consider the approximate equation where $\delta(x-a)$ is replaced with $D(x-a,h)$. Following the method of doraemonpaul, we find
$$y=C_1e^{2x}+C_2e^{-2x}+\dfrac{1}{4} \int_a^x \left(e^{2(x-t)}-e^{2(t-x)}\right) D(t-a,h)\,dt$$
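As $h\to 0$, $D$ grows large in the sense of pointwise values, but its integral stays equal to $1$. This suggests integration by parts: we integrate the factor $D(t-a,h)$, turning it into something similar to the Heaviside function, let's say $H(t-a,h)$, and differentiate the exponential factor. Any boundary term that appears is a combination of $e^{2x}$ and $e^{-2x}$, so it can be absorbed into $C_1$ and $C_2$.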
$$y=C_1e^{2x}+C_2e^{-2x}+\dfrac{1}{2} \int_a^x \left(e^{2(x-t)}+e^{2(t-x)}\right) H(t-a,h)\,dt$$
This is less convenient for practical computation than the original integral, but the advantage is that we have a more reasonable function $H(t-a,h)$ which converges to a reasonable function $H(t-a)$. In particular, $H$ stays bounded as $h\to 0$, which means that the differences such as $|y(x)-y(x')| $ can be effectively estimated from above:
$$\left|\int_x^{x'} \left(e^{2(x-t)}+e^{2(t-x)}\right) H(t-a,h)\,dt\right|\le M|x-x'|\tag{L}$$
where $M$ is the supremum of whatever is under the integral sign. This (L) is the Lipschitz property that I was talking about in the comments, but you can read it simply as continuity.
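The estimate (L) can be illustrated numerically. The following is only a sketch under an assumption that is not part of the discussion above: $D(s,h)$ is taken to be a box of height $1/h$ on $[0,h)$, so that $H(s,h)$ is a ramp from $0$ to $1$. The names `y_h`, `ramp`, `y_h_by_parts` and `max_slope` are made up for this illustration. With this choice $H(0,h)=0$, so the boundary term in the integration by parts vanishes and the two integral forms of the particular solution should agree.

```python
import math

def y_h(x, a, h):
    """Closed form of (1/4) * int_a^x (e^{2(x-t)} - e^{2(t-x)}) D(t-a, h) dt
    for the ASSUMED box D(s, h) = 1/h on [0, h), zero elsewhere."""
    if x <= a:
        return 0.0                      # D(t-a, h) vanishes for t < a
    m = min(x, a + h)                   # D vanishes beyond a + h
    return (math.cosh(2 * (x - a)) - math.cosh(2 * (x - m))) / (4 * h)

def ramp(s, h):
    """H(s, h): running integral of the assumed box, a ramp from 0 to 1 over [0, h]."""
    return min(max(s / h, 0.0), 1.0)

def y_h_by_parts(x, a, h, n=20000):
    """Midpoint rule for (1/2) * int_a^x (e^{2(x-t)} + e^{2(t-x)}) H(t-a, h) dt."""
    dt = (x - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        # (1/2) * (e^{2(x-t)} + e^{2(t-x)}) equals cosh(2(x-t))
        total += math.cosh(2 * (x - t)) * ramp(t - a, h) * dt
    return total

def max_slope(a, h, n=20000):
    """Largest difference quotient of y_h on a uniform grid over [a-1, a+1]."""
    xs = [a - 1 + 2 * i / n for i in range(n + 1)]
    ys = [y_h(x, a, h) for x in xs]
    return max(abs(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(n))

if __name__ == "__main__":
    a = 0.0
    for h in (0.1, 0.01, 0.001):
        print(f"h={h}: peak of D = {1 / h:.0f}, max slope of y_h = {max_slope(a, h):.3f}")
    # The two forms of the particular solution, before and after integration by parts
    print(y_h(0.7, a, 0.05), y_h_by_parts(0.7, a, 0.05))
```

In this sketch the peak of $D$ grows like $1/h$, while the largest slope of the particular solution on the window $[a-1,a+1]$ stays below $\cosh 2\approx 3.76$ for every $h\le 1$. That uniform bound is what (L) expresses, and it survives the limit $h\to0$.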
Set $z=\partial_x y$. Then $z=z(x)$ satisfies the equation
$$
\partial_x z+\delta(x)\,z=0,
$$
with solution
$$
z(x)=z(0)\,\mathrm{e}^{-\int_0^x\delta(\xi)\,d\xi}.
$$
Hence, in order to obtain $y=y(x)$ you should integrate $z$ once.
Best Answer
This is a good example of a question that can be answered at very different levels of mathematical sophistication. Since you say nothing about this, let me try an elementary approach.
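What you call the Dirac delta function (which is not a function, at least not in the sense of a function from $\mathbb R$ to $\mathbb R$) is a strange object but something about it is clear: for every $y\lt z$, $$ \int_y^z\delta(x)\,\mathrm dx=1\ \text{ if }\ y\lt0\lt z,\qquad \int_y^z\delta(x)\,\mathrm dx=0\ \text{ if }\ z\lt0\ \text{ or }\ y\gt0. $$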
We will not use anything else about the Dirac $\delta$.
If one also asks that $\displaystyle\int_y^zu''(x)\mathrm dx=u'(z)-u'(y)$ for every $y\leqslant z$, one can integrate once your equation $\color{red}{-u''=\delta}$, getting that there exists $a$ such that $$ u'(x)=a-[x\geqslant0], $$ where we used Iverson bracket notation. Now let us integrate this once again.
Using the facts that $\displaystyle\int_y^zu'(x)\mathrm dx$ should be $u(z)-u(y)$ for every $y\leqslant z$, and the value of $\displaystyle\int_y^z[x\geqslant0]\mathrm dx$, one gets that for every fixed negative number $x_0$, $$ u(x)=u(x_0)+a\cdot (x-x_0)-x\cdot[x\geqslant0]. $$ This means that $b=u(x_0)-a\cdot x_0$ does not depend on $x_0<0$, hence finally, for every $x$ in $\mathbb R$, $$ \color{red}{u(x)=a\cdot x+b-x\cdot[x\geqslant0]}. $$ (And, in the present case, the condition that $u(-2)=u(3)=0$ imposes that $a=3/5$ and $b=6/5$.)
This is the general solution of the equation $-u''=\delta$. Note that every solution $u$ is $C^\infty$ on $\mathbb R\setminus\{0\}$ but only $C^0$ at $0$, hence $u'$ and $u''$ do not exist at $0$ in the rigorous sense usually meant in mathematics. Note finally that, since $x\cdot[x\geqslant0]=x\cdot[x\gt0]$ for every $x$, one can equally write $$ u(x)=a\cdot x+b-x\cdot[x\gt0]. $$
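The boundary-value computation can be checked with exact rational arithmetic. This is a minimal sketch: the helper names `u` and `solve_boundary` are invented for the illustration, and `solve_boundary` assumes that the left endpoint is negative and the right endpoint is nonnegative, as with $-2$ and $3$ here.

```python
from fractions import Fraction

def u(x, a, b):
    """u(x) = a*x + b - x*[x >= 0], the general solution of -u'' = delta."""
    return a * x + b - (x if x >= 0 else 0)

def solve_boundary(x_left, x_right):
    """Solve u(x_left) = u(x_right) = 0 for (a, b).

    Assumes x_left < 0 <= x_right, so that
      u(x_left)  = a*x_left  + b           = 0
      u(x_right) = a*x_right + b - x_right = 0
    """
    a = Fraction(x_right, x_right - x_left)  # subtract the two equations
    b = -a * x_left                          # back-substitute into the first
    return a, b

a, b = solve_boundary(-2, 3)

# u is piecewise linear, so unit-step difference quotients give the exact slopes.
left_slope = u(0, a, b) - u(-1, a, b)    # slope on x < 0, equals a
right_slope = u(1, a, b) - u(0, a, b)    # slope on x > 0, equals a - 1

if __name__ == "__main__":
    print("a =", a, " b =", b)
    print("u(-2) =", u(-2, a, b), " u(3) =", u(3, a, b))
    print("jump of u' at 0 =", right_slope - left_slope)
```

The jump of $u'$ across $0$ comes out as $-1$, which is what integrating $-u''=\delta$ across the origin requires, while $u$ itself has no jump there.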