The 3D case is different than the 1D case, so don't feel too committed to the analogy. However, in the 1D case, you get only sine functions as solutions if you place a "hard wall" (an infinite potential barrier) at $x=0$. That is basically what is happening here: the origin is a sort of "hard" boundary for the radial equation since "negative radius" makes no sense.
Furthermore, the cosine-like solution diverges as $\frac{1}{r}$ at $r=0$ which does not satisfy Schroedinger's Equation. More precisely, a $\frac{1}{r}$-type solution is normalizable, but $\nabla^2(\frac{1}{r})=-4\pi\delta(r)$ and to satisfy Schrödinger's Equation in this case we need $\nabla^2(\frac{1}{r})=\frac{E}{r}$.
Finally, there is a solution to the equation outside of the sphere, too. It is (as you point out) exponentially decaying. That solution has an undetermined constant as well, so you need two boundary conditions to find the two undetermined constants: the one inside and the one outside.
Edit:Since there is not enough room in the comments to adequately show my point about the two boundary conditions, I'm replying here.
Yes, you are correct that with just the 2 conditions (normalization and continuity) you can get a reasonable solution, but you cannot get a correct solution. This is because normalization is a requirement we impose, but continuity and smoothness are mathematical requirements of the system. I could just as easily require that $\psi(3)=16$ and it would have just as much mathematical relevance to the solution of the problem as normalization does.
However, it is true that normalization is a restriction on the solutions of the system; therefore, this system is actually over-specified and only admits a countable number of solutions (a hallmark of any Sturm-Liouville Theory). In other words, the normalization condition is important, but it does not negate the other, rigorous mathematical requirements of the problem (namely, smoothness). In what follows I try to explain why that is the case in a mathematically rigorous way.
(Also, I want to point out that in the 1D case you don't just have continuity and smoothness at one boundary: you have them at 2 boundaries. You actually have 4 equations with 4 unknowns in that case - since there will be a decaying solution at the other side of the well, too. So don't confuse 1D and 3D, they only correspond here when there is a "hard wall" at $r=0$ (in which case you will only have 1 unknown inside the well since cosine solutions are not admitted).)
First, since you are happy to accept that the wave function must be continuous at $r=a$, I will prove that that implies that the wave function is also smooth at $r=a$ (even in the 3D case). Second, I will show that with the smoothness condition we recover a discrete spectrum of bound states (as expected). Third, I will prove that without the smoothness condition we recover a continuum of bound states. Therefore, by showing that smoothness is required and that it implies a discrete spectrum, I will show that solving the problem without considering smoothness is incorrect because it does not give the same results as the mathematically rigorous method.
First, since we believe that the function is continuous (but possibly not differentiable) at $r=a$, let's solve the following integral equation:
$$
\lim_{\epsilon\rightarrow 0}\Big[\int_{a-\epsilon}^{a+\epsilon}\frac{-\hbar^2}{2m}\frac{d^2\psi}{dr^2}dr\Big]+\lim_{\epsilon\rightarrow 0}\Big[\int_{a-\epsilon}^{a+\epsilon}V(r)\psi(r)dr\Big]=\lim_{\epsilon\rightarrow 0}\Big[\int_{a-\epsilon}^{a+\epsilon}E\psi(r)dr\Big]
$$
Basically, I'm just integrating the Schrödinger's Equation for this problem in a small region around the point where we expect the wave function to (possibly) have a cusp. Here, $V(r)=0$ (when $r<a$) or $V(r)=V_0$ (when $r>a$). That means that
$$
\lim_{\epsilon\rightarrow 0}\Big[\int_{a-\epsilon}^{a+\epsilon}V(r)\psi(r)dr\Big]=\lim_{\epsilon\rightarrow 0}\Big[\int_{a}^{a+\epsilon}V_0\psi(r)dr\Big]=0
$$
Solving the equations above gives:
$$
\frac{-\hbar^2}{2m}\Big(\frac{d\psi}{dr}\Big|_{r=a^+}-\frac{d\psi}{dr}\Big|_{r=a^-}\Big)=0 \\
\therefore \ \frac{d\psi}{dr}\Big|_{r=a^+}=\frac{d\psi}{dr}\Big|_{r=a^-}
$$
And that final equation is exactly the definition of smoothness.
Now, let's take the 3D problem for $l=0$ (which is the case that you ask about in your question) and solve it:
$$
\frac{-\hbar^2}{2m}\frac{d^2\psi_1}{dr^2}=E\psi_1 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (\text{for r<a)}\\
\frac{-\hbar^2}{2m}\frac{d^2\psi_2}{dr^2}+V_0\psi_2=E\psi_2 \ \ \ \text{(for r>a)}
$$
Let's define $k_1=\frac{\sqrt{2mE}}{\hbar}$ and $k_2=\frac{\sqrt{2m(V_0-E)}}{\hbar}$. Therefore,
$$
\frac{d^2\psi_1}{dr^2}=-k_1^2\psi_1 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (\text{for r<a)}\\
\frac{d^2\psi_2}{dr^2}=k_2^2\psi_2 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{(for r>a)}
$$
Let's assume that $E<V_0$ so that we get exponential solutions when $r>a$ instead of oscillatory solutions. When $E>V_0$, our solutions are not bound states and do not decay exponentially as $r\rightarrow\infty$. Solving these equations gives
$$
\psi_1(r)=A\sin(k_1r)+B\cos(k_1r) \\
\psi_2(r)=Ce^{k_2r}+De^{-k_2r}
$$
Now we must apply our boundary conditions. First, since we have an infinite potential barrier at $r=0$, $\psi_1(0)=0$; therefore, $B=0$. Next, by normalizability,we must require that $\psi_2(\infty)=0$; therefore, $C=0$. This leaves just
$$
\psi_1(r)=A\sin(k_1r) \\
\psi_2(r)=De^{-k_2r}
$$
Now, we apply our continuity condition at $r=a$:
$$
A\sin(k_1a)=De^{-k_2a} \\
\therefore \frac{A}{D}=\frac{e^{-k_2a}}{\sin(k_1a)}
$$
Now, if you want to uniquely solve the problem, you must apply smoothness at $r=a$, which will give you
$$
k_1A\cos(k_1a)=-k_2De^{-k_2a} \\
\therefore \tan(k_1a)=-\frac{k_1}{k_2}
$$
That last equation is a transcendental equation which allows us to solve for the energy levels of the bound-states of the problem. Here is a WolframAlpha plot of the 2 sides of that equation when $a=10$ and $V_0=50$. (http://www.wolframalpha.com/input/?i=Plot[{Tan[Sqrt[x]*10]%2C+-Sqrt[x%2F%2850-x%29]}%2C{x%2C0%2C50}]) As you can see, there are only a finite number of intersections between the two graphs, and therefore, there are only a finite number of permissible energy values (i.e. the energy spectrum is discrete).
Finally, let's take a look at trying to solve the problem using only continuity and normalization.
$$
A\sin(k_1a)=De^{-k_2a} \\
1=\int^a_0A^2\sin^2(k_1r)dr+\int^\infty_aD^2e^{-2K_2r}dr \\
\therefore \ A=\frac{1}{\sqrt{\frac{a}{2}-\frac{\sin(2k_1a)}{4k_1}+\frac{e^{-k_2a}\sin(k_1a)}{2k_2}}} \\
\ \ \ D=\frac{e^{k_2a}\sin(k_1a)}{\sqrt{\frac{a}{2}-\frac{\sin(2k_1a)}{4k_1}+\frac{e^{-k_2a}\sin(k_1a)}{2k_2}}}
$$
As you can see, there is no restriction on what values $k_1$ and $k_2$ can take on. Any value between $0$ and $V_0$ gives a sensible answer.
Therefore, we must conclude that ignoring the smoothness condition at the boundary in the 3D, $l=0$ case is unacceptable because doing so gives a fundamentally different answer than the one derived when smoothness was not ignored. Moreover, smoothness is required of the problem, and that fact is mathematically derivable (as shown above).
Best Answer
The physical idea is that you'll let $a$ go to infinity for a truly free particle, and if you take this limit, then the specific details of the boundary conditions should be irrelevant, because the boundaries are so far away anyway.
Therefore, you are welcome to choose convenient boundary conditions, and the periodic ones are convenient, because then you have just plain waves $e^{ikx}$, with the admitted $k$-values determined by $e^{ika} = 1$, so $ka = 2\pi n$, and $n \in \mathbb{Z}$.