Respectfully, I disagree with Tony's answer. The infinitesimal Torelli problem fails for $g>2$ at the points of $M_g$ corresponding to the hyperelliptic curves. And in general the situation is trickier than one would expect.
The tangent space to the deformation space of a curve $C$ is $H^1(T_C)$, and the tangent space to the deformation space of its Jacobian, as a principally polarized abelian variety, is $Sym^2(H^1(\mathcal O_C))$. The infinitesimal Torelli map is an immersion iff the map of these tangent spaces
$$ H^1(T_C) \to Sym^2( H^1(\mathcal O_C) )$$
is an injection. Dually (by Serre duality, $H^1(T_C)^{\vee}\cong H^0(2K_C)$ and $H^1(\mathcal O_C)^{\vee}\cong H^0(K_C)$), the following map should be a surjection:
$$ Sym^2 ( H^0(K_C) ) \to H^0( 2K_C ), $$
where $K_C$ denotes the canonical class of the curve $C$. By a result of Max Noether, this is a surjection iff $g=1,2$, or $g\ge 3$ and $C$ is not hyperelliptic.
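A quick dimension count shows where surjectivity breaks. This is only a sketch for $g\ge 2$, using Riemann-Roch and, in the hyperelliptic case, the standard fact that the canonical map factors through the degree 2 map $\pi: C\to\mathbb P^1$ followed by the rational normal curve of degree $g-1$ in $\mathbb P^{g-1}$:
$$ \dim Sym^2( H^0(K_C) ) = \frac{g(g+1)}{2}, \qquad \dim H^0( 2K_C ) = 3g-3. $$
For hyperelliptic $C$ one has $H^0(K_C)=\pi^*H^0(\mathbb P^1,\mathcal O(g-1))$, so the image of the multiplication map is $\pi^*H^0(\mathbb P^1,\mathcal O(2g-2))$, and
$$ \dim(\text{image}) = 2g-1, \qquad \dim(\text{cokernel}) = (3g-3)-(2g-1) = g-2. $$
The cokernel vanishes for $g=2$ and is nonzero for $g\ge 3$. Its dimension $g-2$ is exactly the codimension of the hyperelliptic locus (of dimension $2g-1$) in $M_g$ (of dimension $3g-3$).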
Therefore, for $g\ge 3$ the Torelli map OF STACKS $\tau:M_g\to A_g$ is not an immersion. It is an immersion outside of the hyperelliptic locus $H_g$. Also, the restriction $\tau_{H_g}:H_g\to A_g$ is an immersion.
On the other hand, the Torelli map between the coarse moduli spaces IS an immersion in char 0. This is a result of Oort and Steenbrink "The local Torelli problem for algebraic curves" (1979).
F. Catanese gave a nice overview of the various flavors of Torelli maps (infinitesimal, local, global, generic) in "Infinitesimal Torelli problems and counterexamples to Torelli problems" (chapter 8 in "Topics in transcendental algebraic geometry" edited by Griffiths).
P.S. "Stacks" can be replaced everywhere by "moduli spaces with level structure of level $l\ge3$" (which are fine moduli spaces).
P.P.S. The space of the first-order deformations of an abelian variety $A$ is $H^1(T_A)$. Since $T_A$ is a trivial vector bundle of rank $g$, and the cotangent space at the origin is $H^0(\Omega^1_A)$, this space equals $H^1(\mathcal O_A) \otimes H^0(\Omega^1_A)^{\vee}$ and has dimension $g^2$.
A polarization is a homomorphism $\lambda:A\to A^t$ from $A$ to the dual abelian variety $A^t$. It induces an isomorphism (in char 0, or for a principal polarization) from the tangent space at the origin $T_{A,0}=H^0(\Omega_A^1)^{\vee}$ to the tangent space at the origin $T_{A^t,0}=H^1(\mathcal O_A)$. This gives an isomorphism
$$ H^1(\mathcal O_A) \otimes H^0(\Omega^1_A)^{\vee} \to
H^1(\mathcal O_A) \otimes H^1(\mathcal O_A). $$
The subspace of first-order deformations which preserve the polarization $\lambda$ can be identified with the tensors mapping to zero in $\wedge^2 H^1(\mathcal O_A)$, and so is isomorphic to $Sym^2 H^1(\mathcal O_A)$, outside of characteristic 2.
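As a dimension check (a sketch only; the shorthand $V=H^1(\mathcal O_A)$, with $\dim V=g$, is introduced here just for brevity), outside characteristic 2 one has
$$ V\otimes V = Sym^2 V \oplus \wedge^2 V, \qquad g^2 = \frac{g(g+1)}{2} + \frac{g(g-1)}{2}. $$
So the polarization-preserving first-order deformations form a space of dimension $g(g+1)/2$, which agrees with the dimension of $A_g$.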
Best Answer
There's a more down-to-earth way to deal with this, which is already explained in Mumford's GIT: make an fppf (or even étale) surjective base change to acquire a section, use that to define the principal polarization, and then show it is independent of the choice. (Short reason: varying the choice amounts to a morphism from the smooth proper curve to a Hom or Isom scheme that is unramified over the base, hence constant.) Thus, by descent theory one gets the polarization over the original base.
This is related to the same issue which comes up in explaining why a polarization of an abelian scheme need not arise as the "Mumford construction" $\phi_{\mathcal{L}}$ even though it automatically does so on geometric fibers (due to the special nature of $k$-simple finite commutative $k$-groups when $k = \overline{k}$). That is, a definition of "polarization" which is better-suited to the relative case is not to mimic what one traditionally does over an algebraically closed field (the Mumford construction) but rather something which makes more effective use of the Poincaré bundle. The possible lack of $\mathcal{L}$ over the base is analogous to the possible lack of a section of the curve to define the principal polarization. See the Wikipedia page on "abelian varieties" for more on this. :)