Why was Ramanujan interested in his tau function before the advent of modular forms? The machinery of modular forms used by Mordell to prove the multiplicative property seems out of context until I know the function's use and value to Ramanujan at the time.
[Math] Ramanujan’s tau function
ho.history-overview modular-forms nt.number-theory
Related Solutions
There's a beautiful history behind this.
Basically, Artin and Hecke were working on different sides of this "dihedral modularity conjecture" at the same time (the 20s) and at the same place (Hamburg), but apparently they never discussed this aspect of their research.
So they had the tools to prove this instance of what would become the Langlands program by 1927, but they didn't know it!
There is a brief account of this in Tate's paper "The general reciprocity law" (note that he was Artin's doctoral student), and a more extended historical survey of Artin and Hecke's work during that time in Cogdell's article "On Artin L-functions".
I think this is what the proof would have looked like back in 1927 (although in modern notation, and not in German!):
Arithmetic side (Artin)
Let $\rho:\mathrm{Gal}(L/K)\to \mathrm{GL}_2(\mathbb{C})$ be a 2-dimensional dihedral complex representation. From representation theory we know that $\rho$ is monomial, that is, induced from a 1-dimensional representation.
Artin had proved in 1923 that his L-functions behave well under representation theoretic operations and, in particular, induction. Therefore, there is an L-function $L(\varrho,s)=L(\rho,s)$, with $\varrho$ one-dimensional (abelian).
From Artin reciprocity (1927) we have that $L(\varrho,s)=L(\chi,s)$, with $L(\chi,s)$ a Hecke L-function.
The last step is Hecke's proof from 1917 that abelian L-functions are entire for non-trivial characters. Since $\varrho \neq 1$, the original L-function $L(\rho,s)$ is entire, and we have proved the Artin conjecture for dihedral representations.
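Schematically, the steps above chain together as follows. This is only a summary in the notation already introduced, writing $\rho=\mathrm{Ind}\,\varrho$ for the monomial structure without naming the subgroup from which $\varrho$ is induced:

$$L(\rho,s)=L(\mathrm{Ind}\,\varrho,s)=L(\varrho,s)=L(\chi,s)$$

The first equality is the monomial property of $\rho$, the second is Artin's 1923 compatibility with induction, and the third is Artin reciprocity (1927). Hecke's 1917 theorem then applies to the right-hand side.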
Automorphic side (Hecke)
Hecke had been studying theta series, and in particular in 1927 he constructed a cusp form $f_\theta$ of weight $1$ as a linear combination of $\theta$-series of binary quadratic forms attached to $K$.
He had already proved the basic properties of the L-functions of arbitrary modular forms and Hecke characters, so he knew their functional equation.
In the case of his $f_\theta$ the gamma-factor was very simple, just $\Gamma(s)$.
So, according to Tate, he listed all the Hecke L-functions that shared that same gamma-factor. After weeding out the ones coming from Eisenstein series (which in turn correspond to cyclic (reducible) two-dimensional representations), he was left with a correspondence $L(\chi,s)=L(f_\theta,s)$.
Arithmetic side revisited
This would have been an easy step for either one of them, if they had known what the other one was up to.
A quick inspection of the gamma-factor of the Artin L-function shows that the only representations for which it equals $\Gamma(s)$ are the odd two-dimensional ones. Since the other two-dimensional odd representations are irreducible (except the cyclic ones, which as already mentioned correspond to Eisenstein series), we have shown:
$$L(\rho,s)=L(f_\theta,s)$$
Jacquet-Langlands proof
The first actual proof of the result follows from the converse theorem for $\mathrm{GL}_2$ in "Automorphic forms on GL(2)" (1971). But I don't think they mention the dihedral case in particular. Langlands does, saying that it is implicit in the works of Hecke and Maass, in his 1975 book "Base change for GL(2)".
A different proof follows from the results by Deligne and Serre in "Formes modulaires de poids 1" (1974).
I'm not sure of what relevance Maass' work has in this case. The same goes for some attributions to Brauer, since his induction theorem isn't really needed here.
To answer the actual question, no, there's no direct reference for this result before 1971. That said, technically Artin's 1927 paper implies this case of the (weak) Artin conjecture, and we now know (by a result of Booker, 2003) that this "weak" case implies the strong Artin conjecture.
By "level $\ell$" I assume you mean "level $\Gamma_1(\ell)$".
Here's a proof. By the Eichler-Shimura theorem, the system of eigenvalues associated to the modular form shows up in $H^1(SL(2,\mathbf{Z}),Symm^{k-2}(\mathbf{C}))$. Hence (by some easy commutative algebra) the mod $\ell$ reduction of the system of eigenvalues shows up in $H^1(SL(2,\mathbf{Z}),Symm^{k-2}(\mathbf{F}_\ell))$ and hence, by a standard diagram chase, in $H^1(SL(2,\mathbf{Z}),M)$ for $M$ an irreducible module for $GL(2,\mathbf{F}_\ell)$ (EDIT: here $M$ is a finite-dimensional vector space over $\mathbf{F}_\ell$, so it's just a twist of $Symm^n$ for some small $n$). But any such $M$ is a subquotient of $I:=Ind_{(* *;0 1)}^{GL(2,\mathbf{F}_\ell)}(1)$ so the system of eigenvalues shows up in $H^1(SL(2,\mathbf{Z}),I)$ and hence, by Shapiro, in $H^1(\Gamma_1(\ell),1)$. (EDIT: here $1$ means the trivial 1-d vector space over $\mathbf{F}_\ell$: one now deduces that the system of eigenvalues lifts to a system of evals showing up in $H^1(\Gamma_1(\ell),\mathbf{C})$).
Now using Eichler-Shimura again, this time at level $\ell$, shows that there's a weight 2 level $\Gamma_1(\ell)$ modular form giving rise to the same mod $\ell$ system of Hecke eigenvalues. This last statement is a little disingenuous because Eichler-Shimura only tells you about parabolic cohomology which isn't quite the same as group cohomology. But the extra stuff is all associated to reducible Galois representations so can be dealt with by hand using Eisenstein series.
You'll find these sorts of arguments in papers of Ribet from around 1987-1990. Another great place to look is papers of Ash and Stevens from slightly earlier -- I learnt the argument above from an Ash-Stevens paper.
Best Answer
All questions of the form "Why was such a mathematician interested in such a subject?" are difficult, and have a tendency to become metaphysical ("why are we doing mathematics in general?", and then "why are we here, anyway?"), but they are even harder when they concern Ramanujan, who had a nonstandard mathematical education and a very original mind.
You are right that Ramanujan could not have been influenced in his interest in the tau sequence by our modern vision of this function as the prototype of the general sequence of coefficients of modular forms, with all the connections to algebraic geometry and number theory that are now familiar, since on the contrary the modern theory of modular forms was developed by Mordell and Hecke after and motivated by Ramanujan's results and questions about the $\tau$ function.
So how could Ramanujan have been interested in the $\tau$-function? Well, all his life, and well before he came to England and met Hardy, Ramanujan was interested in $q$-series, roughly the study of certain formal power series in one variable ($q$), and he greatly valued those of his results that took the form of non-trivial identities between two $q$-series. It is an old subject, which begins with Euler (for example his "pentagonal number theorem" for $\prod_n (1-q^n)$), and is deeply connected to combinatorics, yet it was at the time of Ramanujan (and to some extent still is) a little bit outside of mainstream research. But from this point of view, the study of the $\tau$-function, defined as the coefficients of $q \prod_n (1-q^n)^{24}$, fits well into Ramanujan's lifelong interests. And if you're worried about the exponent $24$, remember that Ramanujan dealt with much more baroque formulas.
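To make the definition concrete, here is a minimal sketch that expands $q \prod_n (1-q^n)^{24}$ as a truncated power series and reads off the first values of $\tau$. The function name `tau_coefficients` is purely illustrative and not from any library. The last lines check, on small cases only, the multiplicative relations that Ramanujan conjectured and Mordell proved.

```python
def tau_coefficients(n_max):
    """Return {m: tau(m)} for 1 <= m <= n_max, where tau(m) is the
    coefficient of q^m in q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[k] is the coefficient of q^k in the product, truncated
    # at degree n_max - 1 (factors with n >= n_max cannot contribute).
    coeffs = [0] * n_max
    coeffs[0] = 1
    for n in range(1, n_max):
        # Multiply by (1 - q^n) twenty-four times, in place.
        for _ in range(24):
            # Go downwards so coeffs[k - n] is still the old value.
            for k in range(n_max - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # The leading factor q shifts every exponent up by one.
    return {m: coeffs[m - 1] for m in range(1, n_max + 1)}


tau = tau_coefficients(12)
print(tau[2], tau[3], tau[6])  # -24 252 -6048

# Small-case checks of the relations proved by Mordell:
assert tau[6] == tau[2] * tau[3]        # coprime arguments
assert tau[4] == tau[2] ** 2 - 2 ** 11  # prime powers, p = 2
```

Repeated in-place multiplication by $(1-q^n)$ is chosen here for transparency rather than speed; it uses exact integer arithmetic, so the values are not subject to rounding.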
There is another reason to be interested in the $\tau$-function, namely that it is the sequence of Fourier coefficients of the Weierstrass Delta-function $\Delta(z)$. Now the function $\Delta(z)$ was extremely important (and mainstream) in the mathematics of Ramanujan's time, being central in the theory of elliptic functions (or integrals or curves) and interconnected with the work of many mathematicians of the nineteenth century on complex analysis, Riemann surfaces, and algebraic geometry. Ramanujan was not aware of all these connections before he met Hardy (at least according to the latter, who said Ramanujan had almost no knowledge of complex analysis and the theory of elliptic functions) but after that he became very interested in the subject.