This follows from Zariski's main theorem if the characteristic is zero, and it is false in positive characteristic: consider the morphism $\mathbb{A}^1 \to \mathbb{A}^1$ given by $x \mapsto x^p$, where $p$ is the characteristic. The statement would also be true in characteristic $p$ if you assume that the general fibre is reduced.
(Note that it suffices to assume that $X$ is integral and $Y$ is normal; $f$ should of course also be surjective.)
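To make the Frobenius counterexample concrete, here is a short computation of its closed fibres, assuming for simplicity that $k$ is algebraically closed of characteristic $p$ (the coordinate $t$ on the target and the point $a$ are notation introduced only for this sketch). The morphism corresponds to the ring map $k[t] \to k[x]$, $t \mapsto x^p$, so the fibre over $t = a$ is $$ \operatorname{Spec} k[x]/(x^p - a) = \operatorname{Spec} k[x]/\bigl((x - b)^p\bigr), \qquad b^p = a. $$ Each closed fibre is thus a single point with a non-reduced structure of length $p$: the map is finite and bijective on points, yet it has degree $p$ and is not an isomorphism.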
The Torelli theorem holds for curves over an arbitrary ground field $k$ (in particular, $k$ need not be perfect). A very nice treatment of the "strong" Torelli theorem may be found in the appendix by J.-P. Serre to Kristin Lauter's 2001 Journal of Algebraic Geometry paper "Geometric methods for improving the upper bounds on the number of rational points on algebraic curves over finite fields". It is available on the arXiv:
http://arxiv.org/abs/math/0104247
Here are the statements (translated into English):
Let $k$ be a field, and let $X_{/k}$ be a nice (= smooth, projective and geometrically integral) curve over $k$ of genus $g > 1$. Let $(\operatorname{Jac}(X),\theta_X)$ denote the Jacobian of $X$ together with its canonical principal polarization. Let $X'_{/k}$ be another nice curve.
Theorem 1: Suppose $X$ is hyperelliptic. Then for every isomorphism of polarized abelian varieties $F: (\operatorname{Jac}(X),\theta_X) \stackrel{\sim}{\rightarrow} (\operatorname{Jac}(X'),\theta_{X'})$, there exists a unique isomorphism $f: X \stackrel{\sim}{\rightarrow} X'$ such that $F = \operatorname{Jac} f$.
Theorem 2: Suppose $X$ is not hyperelliptic. Then, for every isomorphism $F: (\operatorname{Jac}(X),\theta_X) \stackrel{\sim}{\rightarrow} (\operatorname{Jac}(X'),\theta_{X'})$, there exist an isomorphism $f: X \stackrel{\sim}{\rightarrow} X'$ and $e \in \{ \pm 1\}$ such that $F = e \cdot \operatorname{Jac} f$. Moreover, the pair $(f,e)$ is uniquely determined by $F$.
I don't think that map is always surjective. For example, suppose that $Y = X$ is a supersingular elliptic curve. Then the Frobenius map $H^1(X, O_X) \to H^1(Y, O_Y)$ is the zero map (and both are 1-dimensional vector spaces). See for example Hartshorne's chapter on elliptic curves.
Alternately, suppose that $X$ is a scheme over $\mathbb{R}$, and $Y = X \times_{\mathbb{R}} \mathbb{C}$ is the base change. I don't think one should expect that $H^n(X, O_X) \to H^n(Y, O_Y)$ is basically ever surjective unless they are both zero...
With regard to your question though, here's one answer:
Suppose that $X$ is a normal integral scheme of characteristic zero and $Y$ is also integral. Then the natural map $O_X \to f_* O_Y$ splits as a map of $O_X$-modules, say with splitting map $\phi : f_* O_Y \to O_X$ (use the trace map on the function fields $K(Y) \to K(X)$, divided by the degree $[K(Y):K(X)]$, and restrict to the structure sheaves). Now apply the functor $H^n(X, \bullet)$ to the composition (which is an isomorphism): $$ O_X \to f_* O_Y \xrightarrow{\phi} O_X. $$ One gets that $$ H^n(X, O_X) \to H^n(X, f_* O_Y) = H^n(Y, O_Y) \to H^n(X, O_X) $$ is also an isomorphism, and thus $$ H^n(X, O_X) \hookrightarrow H^n(Y, O_Y) $$ is injective, as desired.
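The splitting can be written out as follows, as a sketch under the assumption that $f$ is finite and surjective, so that $K(Y)/K(X)$ is a finite extension of some degree $d$ (the symbols $d$ and $f^\#$, the latter denoting the natural map $O_X \to f_* O_Y$, are notation introduced here). For a local section $s$ of $O_X$, viewed in $K(Y)$ via $f^\#$, $$ \operatorname{Tr}_{K(Y)/K(X)}(s) = d \cdot s, \qquad\text{so}\qquad \phi := \tfrac{1}{d}\operatorname{Tr}_{K(Y)/K(X)} \ \text{ satisfies } \ \phi \circ f^\# = \operatorname{id}_{O_X}. $$ Normality of $X$ guarantees that the trace of a section of $f_* O_Y$, which is integral over $O_X$, lands in $O_X$; characteristic zero guarantees that $d$ is invertible.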
EDIT
Since the author of the question is particularly interested in the case when $f : Y \to X$ is the normalization of an $n$-dimensional $X$, let me try to say a couple things about that case.
Since we have a short exact sequence $0 \to O_X \to f_* O_Y \to C \to 0$, and since $f$ is birational, the support of $C$ has dimension $< \dim X = n$, so $H^n(X, C) = 0$ by Grothendieck vanishing. Therefore, $H^n(X, O_X) \to H^n(Y, O_Y)$ is surjective, as the original question states.
To show injectivity, it is sufficient to show that $H^{n-1}(X, C) = 0$. This certainly happens if the non-normal locus of $X$ has codimension $\geq 2$. Otherwise, it generally won't happen.
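For reference, the relevant piece of the long exact cohomology sequence can be sketched as follows, assuming $X$ is noetherian and the normalization $f$ is finite, so that $H^i(X, f_* O_Y) = H^i(Y, O_Y)$: $$ H^{n-1}(Y, O_Y) \to H^{n-1}(X, C) \to H^n(X, O_X) \to H^n(Y, O_Y) \to H^n(X, C) = 0. $$ The kernel of $H^n(X, O_X) \to H^n(Y, O_Y)$ is the image of $H^{n-1}(X, C)$. If the non-normal locus has codimension $\geq 2$, then $\dim \operatorname{Supp} C \leq n - 2$, and the same vanishing theorem gives $H^{n-1}(X, C) = 0$.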
A simple example with curves
Suppose that $X$ is a curve with exactly one node and $Y$ is its normalization (although any singular curve will work). Then $C$ is a skyscraper sheaf supported at the node. In particular $\dim H^0(X, C) = 1$. On the other hand, $\dim H^0(Y, O_Y) = 1 = \dim H^0(X, O_X)$, so the first map in the exact sequence $$ 0 \to H^0(X, O_X) \to H^0(Y, O_Y) \to H^0(X, C) \to H^1(X, O_X) \to H^1(Y, O_Y) \to 0 $$ is an isomorphism, which immediately implies that $H^0(X, C) \cong \ker\bigl(H^1(X, O_X) \to H^1(Y, O_Y)\bigr)$. In particular, the latter map is not injective.
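As a sanity check, one can count dimensions in this exact sequence, assuming $X$ is an integral projective curve over an algebraically closed field and writing $g$ for the genus of $Y$ (the symbol $g$ is introduced here): $$ \dim H^1(X, O_X) = \dim H^1(Y, O_Y) + \dim H^0(X, C) = g + 1. $$ For instance, if $X$ is a nodal plane cubic, then $Y \cong \mathbb{P}^1$, so $g = 0$ and $H^1(X, O_X)$ is $1$-dimensional while $H^1(Y, O_Y) = 0$.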