You can indeed store the image with less data. Truncating the SVD gives the best rank-$k$ approximation of the image. If we take the SVD of the image $I$, then $I = USV^*$, where the columns of $U$ and $V$ are the left and right singular vectors and $S$ is the diagonal matrix of singular values. We can then form the rank-$k$ approximation
$I_k = \sum\limits_{n=1}^k u_n \sigma_n v_n^T.$
We can then make a good approximation with a fraction of the data. For example, here's a $512 \times 512$ image of Lena.
And here are a few low-rank approximations to the image. You can see that by $k=50$ we get something visually very similar, which takes only $50(512+512+1) = 51250$ numbers compared to $512^2 = 262144$ for the full image, i.e. less than a fifth of the data.
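A minimal sketch of this bookkeeping, using a random matrix as a stand-in for the Lena image (any real-valued matrix behaves the same way):

```python
import numpy as np

# Random 512x512 stand-in for the image.
rng = np.random.default_rng(0)
I = rng.standard_normal((512, 512))

U, s, Vt = np.linalg.svd(I, full_matrices=False)

def rank_k(k):
    """Sum of the first k rank-one terms u_n * sigma_n * v_n^T."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

I50 = rank_k(50)

# Storage cost: k triples (u_n, sigma_n, v_n) versus the full image.
stored = 50 * (512 + 512 + 1)   # 51250 numbers
full = 512 ** 2                 # 262144 numbers
```

The approximation error only decreases as $k$ grows, since each extra term adds the next-largest singular direction.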
The topic of low rank approximation is sprinkled throughout Math SE:
Low-rank Approximation with SVD on a Kernel Matrix
Matrix values increasing after SVD, singular value decomposition
The singular value spectrum may span several orders of magnitude, and it seems natural that the contributions from the larger values are more important. Numerically, it is difficult to tell whether a small singular value is genuine or simply machine noise from computing a singular value that is exactly $0$. This requires a threshold to determine which singular values are discarded.
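A sketch of such a cutoff, using a tolerance in the style of NumPy's default for `matrix_rank` (largest dimension times machine epsilon times the largest singular value):

```python
import numpy as np

# An exactly rank-1 matrix: the trailing singular values returned by
# svd are pure machine noise around 0.
A = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
s = np.linalg.svd(A, compute_uv=False)

# Discard singular values below the threshold.
tol = max(A.shape) * np.finfo(A.dtype).eps * s[0]
numerical_rank = int(np.sum(s > tol))
```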
Let's look at the SVD in detail.
Singular Value Decomposition
Every matrix
$$
\mathbf{A} \in \mathbb{C}^{m\times n}_{\rho}
$$
has a singular value decomposition of the form
$$
\begin{align}
\mathbf{A} &=
\mathbf{U} \, \Sigma \, \mathbf{V}^{*} \\
%
&=
% U
\left[ \begin{array}{cc}
\color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}}
\end{array} \right]
% Sigma
\left[ \begin{array}{cccc|cc}
\sigma_{1} & 0 & \dots & & & \dots & 0 \\
0 & \sigma_{2} \\
\vdots && \ddots \\
& & & \sigma_{\rho} \\\hline
& & & & 0 & \\
\vdots &&&&&\ddots \\
0 & & & & & & 0 \\
\end{array} \right]
% V
\left[ \begin{array}{c}
\color{blue}{\mathbf{V}_{\mathcal{R}}}^{*} \\
\color{red}{\mathbf{V}_{\mathcal{N}}}^{*}
\end{array} \right] \\
%
& =
% U
\left[ \begin{array}{cccccccc}
\color{blue}{u_{1}} & \dots & \color{blue}{u_{\rho}} & \color{red}{u_{\rho+1}} & \dots & \color{red}{u_{m}}
\end{array} \right]
% Sigma
\left[ \begin{array}{cc}
\mathbf{S}_{\rho\times \rho} & \mathbf{0} \\
\mathbf{0} & \mathbf{0}
\end{array} \right]
% V
\left[ \begin{array}{c}
\color{blue}{v_{1}^{*}} \\
\vdots \\
\color{blue}{v_{\rho}^{*}} \\
\color{red}{v_{\rho+1}^{*}} \\
\vdots \\
\color{red}{v_{n}^{*}}
\end{array} \right]
%
\end{align}
$$
The connection to the row and column spaces follows:
$$
\begin{align}
% R A
\color{blue}{\mathcal{R} \left( \mathbf{A} \right)} &=
\text{span} \left\{
\color{blue}{u_{1}}, \dots , \color{blue}{u_{\rho}}
\right\} \\
% R A*
\color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)} &=
\text{span} \left\{
\color{blue}{v_{1}}, \dots , \color{blue}{v_{\rho}}
\right\} \\
% N A*
\color{red}{\mathcal{N} \left( \mathbf{A}^{*} \right)} &=
\text{span} \left\{
\color{red}{u_{\rho+1}}, \dots , \color{red}{u_{m}}
\right\} \\
% N A
\color{red}{\mathcal{N} \left( \mathbf{A} \right)} &=
\text{span} \left\{
\color{red}{v_{\rho+1}}, \dots , \color{red}{v_{n}}
\right\} \\
%
\end{align}
$$
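These span statements can be checked numerically. A sketch with a deliberately rank-deficient $A$ (rank $\rho = 2$ by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # rank 2

U, s, Vt = np.linalg.svd(A)
rho = int(np.sum(s > 1e-10 * s[0]))

Un = U[:, rho:]    # basis for N(A*)
Vn = Vt[rho:].T    # basis for N(A)

# A annihilates the null-space basis; the left null space annihilates A.
null_right = np.linalg.norm(A @ Vn)
null_left = np.linalg.norm(Un.T @ A)
```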
What you are using is $\mathbf{S} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*}$, which ignores the null space contributions in red.
A rank $\rho = 3$ approximation would look like this:
$$
\mathbf{S}_{3} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*} =
\left[ \begin{array}{ccc}
\sigma_{1} & 0 & 0 \\
0 & \sigma_{2} & 0 \\
0 & 0 & \sigma_{3} \\
\end{array} \right]
%
% V
\left[ \begin{array}{c}
\color{blue}{v_{1}^{*}} \\
\color{blue}{v_{2}^{*}} \\
\color{blue}{v_{3}^{*}} \\
\end{array} \right]
%
\in \mathbb{C}^{\rho \times n}
$$
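In code, the product $\mathbf{S}_3 \mathbf{V}_{\mathcal{R}}^{*}$ is a $3 \times n$ block, and multiplying by the first three columns of $U$ yields the rank-3 approximation. A sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

SV = np.diag(s[:3]) @ Vt[:3]   # S_3 V_R^*, shape (3, n)
A3 = U[:, :3] @ SV             # the rank-3 approximation of A
```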
The following sequence shows the Koch snowflake fractals and their singular value spectra. As the object becomes more detailed, the spectrum becomes richer.
Best Answer
If the SVD of $X$ is $X=USV^\top$, then the SVD of $X^\top$ is just the transpose of the prior factorization, $X^\top=VSU^\top$ or $U_1=V$, $S_1=S$ and $V_1=U$.
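A quick numerical check of this identity, with a random $X$:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 6))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The SVD of X^T reuses the same factors with U and V swapped.
Xt_rebuilt = Vt.T @ np.diag(s) @ U.T
```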
The principal components in this approach are the singular vectors with the largest singular values. In most implementations, the diagonal matrix $S$ contains the singular values sorted from largest to smallest, so you only have to take the first two components. If $X$ has shape $25\times 2000$, then the columns of the $25\times 25$ matrix $U$ contain the singular vectors you are interested in.
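A sketch of extracting those components, with a random matrix standing in for the $25\times 2000$ data:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((25, 2000))  # stand-in for the real data matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Singular values come back sorted largest to smallest, so the first
# two columns of U are the two leading components.
components = U[:, :2]
```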
Update
PCA was originally invented in mechanics to study the kinematics of rigid bodies, for instance the rotation, nutation, and oscillations of planets. The idea there is that these kinematics are the same as those of an ellipsoid that is aligned and shaped according to the principal components of the mass distribution. Any movement of a rigid body can be described as the movement of its center of mass plus a rotation around that center of mass.
If the data is not shifted so that the center of mass is the origin, for instance if in 2D all points are clustered around $(1,1)$, then the principal component of the data set will be close to this point $(1,1)$. But to get that point, one could just as well only have computed the center of mass or mean value of all data points. To get the information about the shape of the cluster out of the SVD, you have to subtract the center of mass.
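A sketch of that effect with 2D points clustered around $(1,1)$: without centering, the leading right singular vector points at the cluster mean rather than along the cluster's shape.

```python
import numpy as np

rng = np.random.default_rng(5)
pts = np.array([1.0, 1.0]) + 0.01 * rng.standard_normal((200, 2))

# Uncentered: the top singular vector is dominated by the mean direction.
_, _, Vt = np.linalg.svd(pts)
v1 = Vt[0] * np.sign(Vt[0, 0])  # fix the sign for comparison

# Centered: subtract the mean first, then the SVD describes the spread.
centered = pts - pts.mean(axis=0)
_, s_c, Vt_c = np.linalg.svd(centered)
```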
If that is what you mean by 'subtracting the baseline', then all is well in that regard. But still, applying the SVD makes the most sense if you can say that flipping the sign of an input vector gives something that could reasonably have come from a measurement in the same experiment.
The result of the SVD can be written as $$ X=\sum_{k=1}^r u_k\sigma_k v_k^\top. $$ If a pair $(u_k,v_k)$ is replaced by $(-u_k,-v_k)$, then nothing changes in the sum: the sign change cancels between the two factors.
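This cancellation is easy to verify numerically; a sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((5, 8))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Flip the sign of the pair (u_1, v_1): the reconstruction is unchanged.
U2, Vt2 = U.copy(), Vt.copy()
U2[:, 0] *= -1
Vt2[0, :] *= -1

X1 = U @ np.diag(s) @ Vt
X2 = U2 @ np.diag(s) @ Vt2
```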
To get the data set of person $j$ out of the matrix $X$ one has to select row $j$ of $X$ as $e_j^\top X$. Now if $X$ gets compressed by using only the terms for the first or first two singular values in the SVD, the approximation of data set $j$ will be $$ e_j^\top X=\sum_{k=1}^2 (e_j^\top u_k)(\sigma_kv_k)^\top =\sum_{k=1}^2 U_{jk}(\sigma_kv_k)^\top. $$ Again, any sign changes in $v_k$ in the computation of the SVD are balanced by sign changes in the coefficients $e_j^\top u_k=U_{jk}$.
One heuristic to make the sign definitive could be to make sure that the entry with largest absolute value in every vector $u_k$ is positive.
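That heuristic might be sketched as follows (the function name `fix_signs` is my own; no standard SVD routine applies this convention by default):

```python
import numpy as np

def fix_signs(U, Vt):
    """Flip each pair (u_k, v_k) so that the entry of u_k with the
    largest absolute value is positive. Flipping both factors leaves
    the product U S Vt unchanged."""
    U, Vt = U.copy(), Vt.copy()
    for k in range(U.shape[1]):
        i = np.argmax(np.abs(U[:, k]))
        if U[i, k] < 0:
            U[:, k] *= -1
            Vt[k, :] *= -1
    return U, Vt

rng = np.random.default_rng(7)
X = rng.standard_normal((5, 8))
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uf, Vtf = fix_signs(U, Vt)
```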