Note: My approach lacks rigour and I'm rusty. I'm sure there are lots of holes, overly-complicated parts, and perhaps even just outright mistakes. However, I think the gist of the argument motivates why two operators which commute leads one to consider solutions of the form you mention.
Paraphrased: Why does does the fact that $\hat{A}$ and $\hat{B}$ commute in the Hamiltonian $\hat{H}=\hat{A} + \hat{B}$ mean that we can write the solution as factors of the solutions of $\hat{A}$ and $\hat{B}$?
Essentially, the fact that the Hamiltonian's parts, $\hat{A}$ and $\hat{B}$, commute means we can separate the general solution (of the form $\exp(-i\hat{H})$ into two exponential operators.
My approach lacks rigour, but I think it shows the idea:
For the equation
$\newcommand{\Ket}[1]{\left|#1\right>}$
$\newcommand{\Bra}[1]{\left<#1\right|}$
$\newcommand{\Braket}[2]{\left<{#1}|{#2}\right>}$
\begin{equation}
\hat{H}\Ket{\psi} = i\frac{\partial}{\partial t}\Ket{\psi} \tag{1}\label{SE}
\end{equation}
the general solution is known to be
\begin{align}
\Ket{\psi(t)} &= \exp\left(-it\hat{H} \right) \Ket{\psi(0)} \\
&= \exp\left(-it(\hat{A}+\hat{B}) \right) \Ket{\psi(0)}
\tag{2}
\end{align}
You can see this through substitution or look it up in any QM textbook or course notes.
The fact that $[A,B]=0$ means we can split the exponential operator, giving
\begin{align}
\Ket{\psi(t)} &= \exp\left(-it\hat{H} \right) \Ket{\psi(0)} \\
&= \exp\left(-it\hat{A} \right)\exp\left(-it\hat{B} \right) \Ket{\psi(0)}
\tag{3}
\end{align}
Let's try to write this in position representation.
First, let's recognise that we're dealing with more than one spatial coordinate and each operator is acting in mutually exclusive vector spaces, so that $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. Alternatively, in position representation, we can see that $\hat{A}$ has an $x$-dependence and $\hat{B}$ has a $y$-dependence (and can be seen to commute by observation).
Also, the time-independent solutions for $\hat{A}$ and $\hat{B}$ separately in position representation are
\begin{align}
f(x) &= \Braket{x}{f} = \sum_i\lambda_i\Braket{x}{a_i} \\
g(y) &= \Braket{y}{g} = \sum_i\mu_i\Braket{x}{b_i}
\tag{4}
\end{align}
where the eigenfunctions are
\begin{align}
\hat{A}\Ket{a_i} &= a_i \Ket{a_i} \\
\hat{B}\Ket{b_i} &= b_i \Ket{b_i}
\tag{5}
\end{align}
Then using $\Braket{x,y}{\psi} = \psi(x,y)$ with $\Bra{x,y} = \Bra{x}\otimes\Bra{y}$, we get
\begin{align}
\Braket{x,y}{\psi} =& \left[\Bra{x}\exp\left(-it\hat{A} \right) \otimes \Bra{y}\exp\left(-it\hat{B} \right)\right] \Ket{\psi(0)}
\tag{6}
\end{align}
which we can write using the expansion of the identity as
\begin{align}
\Braket{x,y}{\psi} =& \left[\sum_n\Bra{x}\exp\left(-it\hat{A} \right) \Ket{a_n}\Bra{a_n} \right. \\
&\otimes \left. \sum_m\Bra{y}\exp\left(-it\hat{B} \right)\Ket{b_m}\Bra{b_m}\right]\psi(0)
\tag{7}
\end{align}
where the operators can now act on their respective eigenstates, giving
\begin{align}
\Braket{x,y}{\psi(t)} =& \sum_n\Braket{x}{a_n}\exp\left(-ita_n \right)\sum_m\Braket{y}{b_m}\exp\left(-itb_m \right) \\ &\int\int dx^\prime dy^\prime\Braket{a_n}{x^\prime}\Braket{b_m}{y^\prime} \Braket{x^\prime,y^\prime}{\psi(0)}
\tag{8}\label{FINALEQ}
\end{align}
where I've simplified the expression using $$\left(\Bra{x^\prime}\otimes\Bra{y^\prime}\right)\Ket{\psi(0)} = \Braket{x^\prime,y^\prime}{\psi(0)} \tag{9}$$
Note that $\Braket{x}{a_n}$ are the eigenfunctions of $\hat{A}$ and $\Braket{y}{a_m}$ are the eigenfunctions of $\hat{A}$ in position representation.
So, we can see that $\ref{FINALEQ}$ is simply a linear combination of factors of eigenstates of the individual operators $\hat{A}$ and $\hat{B}$:
\begin{align}
\Braket{x,y}{\psi(t)} =& \sum_{m,n}c_{m,n}f_n(x,t)g_m(y,t) \tag{10}
\end{align}
This is a general solution.
Ignoring the time-dependence, we can see that just one of those solutions (not constrained by any initial conditions) is $f_n(x,t)g_m(y,t) = \Braket{x}{a}\Braket{y}{b}$ i.e. the multiplication of the eigenstates of the operators separately. Generally, we can consider a linear combination of such and it's still a solution, and in this case, we've worked backwards from that.
I think this is also shows (maybe a bit strong given the lack of rigour... "suggests"?) that the independence of the variables leads us to consider states in different subspaces of $\mathcal{H}$, and hence operators in those different spaces commute, and therefore the general solution in position representation is formed of products of the separate solutions to those subspaces.
It doesn't matter that we're talking about $r$, $\phi$, and $\theta$, or something simpler like $x$ and $y$ - they're just specific representations of the fundamental states. Therefore, we can propose solutions of the form $\psi(x,y) = f(x)g(y)$.
Now, knowing this, whenever we see a Hamiltonian where $H=H_1 + H_2$, and the parts commute, we can jump straight to the conclusion and state $\psi(x,y) = f(x)g(y)$, where $f(x)$ and $g(y)$ are the solutions, and then immediately go about calculating $f(x)$ and $g(y)$.
There are several possible more or less elaborated answers, I will discuss here a version which may be of interest, I hope, because it can be extended to other cases. It happens because this approach relies upon general (though elementary) abstract results of Hilbert space theory.
Let $d\Omega$ be the unique rotationally invariant (Borel) measure on the two-dimensional sphere of unit radius $S^2$ such that $\int_{S^2} 1 d\Omega = 4\pi$.
We have an Hilbert space isomorphism,
$$L^2(\mathbb{R}^3, dx^3) \equiv L^2(S^2, d\Omega)\otimes L^2([0,+\infty), r^2dr)\tag{1}$$ where the isomorphism is the unique linear and bounded extension of the map
$$L^2(S^2, d\Omega)\otimes L^2([0,+\infty), r^2dr) \ni Y\otimes f \mapsto Y \cdot f \in L^2(\mathbb{R}^3, d^3x)$$
where
$$(Y \cdot f)(\theta, \phi, r) := Y(\theta,\phi) f(r)\:.$$
That is a special case of the general result
$$ L^2(X, dx) \otimes L^2(Y, dy)\equiv L^2(X\times Y, dx\otimes dy) \tag{2}$$
where the isomorphism
is again the unique linear and bounded extension of the map
$$L^2(X,dx)\otimes L^2(Y,dy) \ni f\otimes g \mapsto f \cdot g \in L^2(X\times Y, dx\otimes dy)$$
In the case (1), $\mathbb{R}^3 = S^2 \times [0,+\infty)$ and $d^3x= d\Omega \otimes r^2 dr$ as is well known passing from Cartesian to polar coordinates.
Let us focus attention on (1). If one writes down the explicit expression of the three selfadjoint angular momentum operators $L_x,L_y,L_z$ in $L^2(\mathbb{R}^3, dx^3)\equiv L^2(S^2, d\Omega)\otimes L^2([0,+\infty), r^2dr)$ he/she discovers the following interesting fact.
$$L_k = {\cal L}_k \otimes I_{L^2([0,+\infty), r^2dr)}\:,\tag{3}$$
for some operators ${\cal L}_k$ which are differential operators on the dense subspace of smooth (complex valued) functions in $L^2(S^2, d\Omega)$.
For instance
$${\cal L}_z = - i \hbar \frac{\partial}{\partial \phi}$$ and similar identities are valid for the other components, where only derivatives with respect to the angles $\phi$ and $\theta$ take place: these are the variables used to define wavefunctions in $L^2(S^2, d\Omega)$.
From the general theory of strongly continuous unitary representations of compact topological groups, in this case $SU(2)$, as the three $-i{\cal L}_k$ define a representation of its Lie algebra, one can construct a Hilbert basis of
simultaneous eigenvectors of ${\cal L}^2$ and ${\cal L}_z$ in $ L^2(S^2, d\Omega)$. This basis is the very well known family of spherical harmonics $$Y^\ell_m(\theta,\phi).$$
Notice that the coordinate $r$ does not appear just because we are working in a Hilbert space where it does not exist!
Let us conclude. As a general result related with (2), if $L^2(X\times Y, dx\otimes dy)$ is a Hilbert space and $\{Y_a\}_{a\in A} \subset L^2(X, dx)$ is a Hilbert basis
whereas $\{R_b\}_{b\in B} \subset L^2(Y, dy)$ is another Hilbert basis, then
$$\{Y_a \otimes R_b\}_{(a,b) \in A\times B} \subset L^2(X, dx) \otimes L^2(Y, dy)\equiv L^2(X\times Y, dx\otimes dy)$$
is a Hilbert basis as well for $ L^2(X\times Y, dx\otimes dy)$.
Specializing the result to (1), we see that a Hilbert basis of $L^2(\mathbb{R}^3, d^3x)$ is made of the products $$Y^\ell_m(\theta, \phi) f_n(r)$$ where $\{f_n\}_{n \in N}$ is any given Hilbert basis of $L^2([0,+\infty). r^2dr)$.
Notice that, in view of (3),
$$L^2 (Y^\ell_m f_n) = ({\cal L}^2 \otimes I) Y^\ell_m \otimes f_n = \hbar^2 \ell(\ell+1) (Y^\ell_m f_n)$$
and
$$L_z (Y^\ell_m f_n) = ({\cal L}_z \otimes I) Y^\ell_m \otimes f_n = \hbar m (Y^\ell_m f_n)$$
More generally, with the same argument,
$$L^2 (Y^\ell_m f) = \hbar^2 \ell(\ell+1) (Y^\ell_m f)$$
and
$$L_z (Y^\ell_m f) = \hbar m (Y^\ell_m f)$$
for every $f\in L^2([0,+\infty), r^2dr)$.
In summary, the functions $Y^\ell_m(\theta,\phi)f_n(r)$ form a symultaneous Hilbert basis of eigenvectors of $L^2$ and $L_z$ nomatter the choice of the Hilbert basis $\{f_n\}_{n\in N}$ in $L^2([0,+\infty), r^2 dr)$.
With a slightly more elaborated argument, it is possible to choice the functions $f_n$ also depending on $\ell$ and possibly $m$, changing the original functions $f_n$ in every eigenspace of $L^2$ and $L_z$.
Best Answer
In short: In the same way that you think about plane waves.
Spherical harmonics (just like plane waves) are basic, essential tools. As such, they are used over a very wide variety of contexts, and each of those contexts will end up using a different "way to think" about the tools: using different properties of the tool, putting it in different frameworks, and so on.
In that regard, "how we should think" about the tool depends on what we want to do with it and what we already know about the context. This is ultimately a personal question that is up to you, and you haven't really given us much to go on to say anything useful.
That said, you seem to be confused about the relationship between the statements
and
As I said above, you should think of these in the same way that you think of plane waves, for which the two statements above have direct analogues:
and
in the more specific senses that
and
So: which of these two properties is more important? it depends! it depends on what you're doing and what you care about. Maybe you need to combine the two on equal footings, maybe you need to put more weight on one or the other.
Can you forget about the second statement and just focus on the eigenfunction properties? maybe! it depends on what you're doing and what you care about. If you continue analyzing the problem, both properties will come into play eventually, but where and when depends on the path you take.
(As a rule of thumb, if a statement is unclear and it is not being actively used in the context you're in, then: yes, it is probably safe to put a pin on it and move on, and to return to understanding what it means and why it holds only when you run into it being actively used. In many cases, actually seeing it used in practice is a massive help into understanding what it's for!)
In any case, though, the completeness properties can indeed seem a little mysterious. They've been handled quite well by the other answers, so I won't belabour that point. Instead, I will fill you in on a secret that very few textbooks will tell you: the spherical harmonics are all polynomials!
More specifically, they are polynomials of the Cartesian coordinates when the position $\mathbf r$ is restricted to the unit sphere. Once you take away all of the trigonometric complexity and so on, the $Y_l^m(\theta,\phi)$ become simple polynomials: \begin{align} Y_0^0(\mathbf r) & = 1 \\ Y_1^{\pm 1}(\mathbf r) & = x\pm i y \\ Y_1^0(\mathbf r) & = z \\ Y_2^{\pm 2}(\mathbf r) & = (x\pm i y)^2 \\ Y_2^{\pm 1}(\mathbf r) & = z (x\pm i y) \\ Y_2^0(\mathbf r) & = x^2+y^2-2 z^2 \end{align} (where I've removed the pesky normalization constants).
The basic concept of building this family of polynomials is quite simple:
That last part might sound mysterious, but it should be relatively easy to see why it forces a preference for the combinations $x\pm iy$: if you rotate by an angle $\alpha$ in the $x,y$ plane, $(x\pm iy)$ transforms simply, to $e^{\pm i\alpha}(x\pm iy)$. If you're wondering, this is the feature that connects to the $Y_l^m$ being eigenfunctions of $\hat L_z$.
With that in place, it is fairly easy to see how the progression goes $$ 1,x\pm iy, z, (x\pm iy)^2, z(x\pm iy),\ldots $$ ... but then what's with the $x^2+y^2-2 z^2$ combination? The answer to that is that, in general, there are six separate quadratic monomials we need to include: $$ x^2,y^2,z^2,xy,xz,yz. \tag{*} $$ We have already covered some of these in $(x\pm i y)^2 = x^2-y^2 \pm 2ixy$, and we've fully covered the mixed terms $xy,xz,yz$, so now we need to include two more: to keep things symmetric, let's say $x^2+y^2$ and $z^2$.
Except, here's the thing: the pure-square monomials $x^2,y^2,z^2$ in $(*)$ are not linearly independent! Why is this? well, because we're on the unit sphere, which means that these terms satisfy the identity $$ x^2+y^2+z^2 = 1, $$ and the constant term $1$ is already in our basis set. So, from the combinations $x^2+y^2$ and $z^2$ we can only form a single polynomial, and the correct choice turns out to be $x^2+y^2-2z^2$, to make the set not only linearly independent but also orthogonal (with respect to a straightforward inner product).
As for the rest of the spherical harmonics $-$ the ladder keeps climbing, but the steps are all basically the same as the ones I've outlined already.
Anyways, I hope this is sufficient to explain why the spherical harmonics are
This is just a fancy way of saying that, if you have a function $f(\mathbf r)$ that's defined for $|\mathbf r|=1$, then you can expand $f(\mathbf r)$ as a "Taylor series" of suitable polynomials in $x$, $y$ and $z$, with a slightly reduced set of polynomials to account for the restrictions on $|\mathbf r|$.