Consider a family $\{U_i\}_{i\in I}$ of subsets of a set $U$. We can identify this with a family of injections $\{U_i\hookrightarrow U\}_{i\in I}$. When we say that this family covers $U$, we mean that $\bigcup_{i\in I}U_i = U$. This union glues all the $U_i$ together where they overlap. We can calculate the overlap of each pair of $U_i$ by computing their intersection, which is just the pullback of the corresponding injections. We can then compute the union by taking the colimit of the diagram consisting of all the wedges $U_i\leftarrow U_i\cap U_j\to U_j$, i.e. the wedges produced by the pullback projections. We would then say that this family covers $U$ if the induced map from this colimit to $U$ is an isomorphism, i.e. if it is surjective (since it will already be injective).
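To make the gluing concrete, here is a minimal Python sketch for finite sets. It builds the colimit of the wedges as a quotient of the disjoint union and then checks whether the induced map to $U$ is surjective. The names `glue` and `covers` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

def glue(family):
    """Colimit of the wedges U_i <- (U_i & U_j) -> U_j, computed as a quotient
    of the disjoint union: (i, x) ~ (j, x) whenever x lies in both U_i and U_j."""
    # Elements of the disjoint union are tagged pairs (index, element).
    points = [(i, x) for i, U_i in enumerate(family) for x in U_i]
    parent = {p: p for p in points}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Identify the two copies of every point in every pairwise intersection.
    for (i, U_i), (j, U_j) in product(enumerate(family), repeat=2):
        for x in U_i & U_j:
            parent[find((i, x))] = find((j, x))

    classes = {}
    for p in points:
        classes.setdefault(find(p), set()).add(p)
    return [frozenset(c) for c in classes.values()]

def covers(family, U):
    """The induced map sends each class to its underlying element. It is
    injective by construction, so the family covers U iff it is surjective."""
    image = {next(iter(c))[1] for c in glue(family)}
    return image == set(U)
```

For example, `covers([{1, 2}, {2, 3}], {1, 2, 3})` holds, while the same family does not cover `{1, 2, 3, 4}`.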
To generalize, we consider an arbitrary family of arrows $\{A_i\to A\}_{i\in I}$, called a sink or a covering family. We can (assuming sufficient limits) compute the pullback for each pair of $A_i$. The resulting collection of pullback wedges is called the kernel of the sink, and the sink itself is a cocone under this kernel. A sink is called (universally) effectively-epic if it exhibits its target as the (pullback-stable) colimit of its kernel. This generalizes the notion of covering.
A coverage on a category $\mathcal C$ (from which we can generate a Grothendieck topology if we want) is a pullback stable choice of a collection of covering families for each object of $\mathcal C$. A pair of a category and a coverage (or Grothendieck topology) on it is called a site. The idea here is that we want each covering family to "cover" its target in the above sense. $\mathcal C$, however, is an arbitrary category so there's no guarantee that we have the pullbacks or colimits that we need.
Both of these problems can be solved by embedding $\mathcal C$ into its category of presheaves (of sets), which is complete and cocomplete. We now run into another problem. We can choose the covering families very freely, so there's no reason for a covering family to be (universally) effectively-epic, and even if it were, this will rarely be preserved by the Yoneda embedding. The reason is that the category of presheaves on $\mathcal C$ is the free cocompletion of $\mathcal C$. This means any colimit of representables in the category of presheaves will be a "freely added" colimit and will generally have nothing to do with the colimit of the representing objects, if it exists. As an analogy, if we consider the free commutative group generated by the underlying set of $\mathbb Z$ and write $\underline n$ for the generator corresponding to $n\in\mathbb Z$, it won't be the case that $\underline 1+\underline 1 = \underline{1+1}$.
In the free commutative group case, we can force the equalities to hold by quotienting with respect to some relators, e.g. we can add the relators $\underline m+\underline n - \underline{m+n}$, and recover the group of integers. This is one perspective on what's happening with sheaves on a site. A site is a presentation of a category of sheaves. We freely add colimits to $\mathcal C$ by moving to its category of presheaves, and the covering families specify the cocones that we want to be colimiting, i.e. to be universally effectively-epic.
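The analogy can be played out in a few lines of Python. This is only a sketch: elements of the free commutative group are modeled as dictionaries from integers to nonzero coefficients, and `gen`, `add`, and `quotient` are hypothetical names chosen for the illustration.

```python
def gen(n):
    """The generator underline{n} of the free commutative group on the set Z."""
    return {n: 1}

def add(a, b):
    """Add two formal Z-linear combinations, dropping zero coefficients."""
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def quotient(a):
    """Image in Z after imposing underline{m} + underline{n} = underline{m+n}."""
    return sum(k * c for k, c in a.items())

# In the free group the two sides are different formal sums ...
free_lhs = add(gen(1), gen(1))   # 2 * underline{1}
free_rhs = gen(2)                # underline{2}
# ... but they have the same image once the relators are imposed.
same_after_quotient = quotient(free_lhs) == quotient(free_rhs)
```

Here `free_lhs` and `free_rhs` differ as dictionaries, while `same_after_quotient` is true, mirroring how a covering family only becomes a colimiting cocone after passing from presheaves to sheaves.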
To finally get around to addressing your question, the way this is done is that we restrict the category of presheaves to those presheaves for which the covering families, which are cocones for their kernels, look like colimiting cocones. A sheaf is a presheaf that can't tell the difference between a covering family and the actual colimiting cocone of its kernel. Given a covering family $\{A_i\to A\}_{i\in I}$, this means that if $\mathscr F$ is a sheaf, then $$\mathsf{Nat}(\mathsf{Hom}(-,A),\mathscr F)\cong\mathsf{Nat}(\mathsf{Colim}_{i\in I}\mathsf{Hom}(-,A_i),\mathscr F)$$ where, to simplify notation, I'm writing $\mathsf{Colim}_{i\in I}\mathsf{Hom}(-,A_i)$ for the colimit of the image under the Yoneda embedding of the diagram corresponding to the kernel, i.e. a colimit of a bunch of wedges that look like $$\mathsf{Hom}(-,A_i)\leftarrow\mathsf{Hom}(-,A_i)\times_{\mathsf{Hom}(-,A)}\mathsf{Hom}(-,A_j)\to\mathsf{Hom}(-,A_j)$$
We have $$\begin{align}
\mathscr F(A)&\cong\mathsf{Nat}(\mathsf{Hom}(-,A),\mathscr F)\tag{Yoneda}\\
&\cong\mathsf{Nat}(\mathsf{Colim}_{i\in I}\mathsf{Hom}(-,A_i),\mathscr F)\tag{sheaf property}\\
&\cong\mathsf{Lim}_{i\in I}\mathsf{Nat}(\mathsf{Hom}(-,A_i),\mathscr F)\tag{continuity}\\
&\cong\mathsf{Lim}_{i\in I}\mathscr F(A_i)\tag{Yoneda}
\end{align}$$
If you spell out the limit, you'll see that it corresponds exactly to the equalizer of products in $(1)$ only with the intersection being the pullback.
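As a sanity check on a small case, the following Python sketch spells out that limit for one concrete presheaf: $\mathscr F(U)$ is the set of functions $U\to\{0,1\}$ with the usual restriction maps. This choice of presheaf and the helper names `sections`, `restrict`, and `matching_families` are assumptions made for the illustration.

```python
from itertools import product

def sections(U, values=(0, 1)):
    """F(U): all functions U -> values, stored as frozensets of (point, value) pairs."""
    U = sorted(U)
    return {frozenset(zip(U, vs)) for vs in product(values, repeat=len(U))}

def restrict(s, V):
    """Restriction map F(U) -> F(V) for V a subset of U."""
    return frozenset((x, v) for x, v in s if x in V)

def matching_families(cover):
    """The equalizer: tuples (s_i) in the product of the F(W_i) whose
    restrictions agree on every pairwise intersection of W_i and W_j."""
    n = len(cover)
    out = []
    for fam in product(*(sections(W) for W in cover)):
        if all(restrict(fam[i], cover[i] & cover[j]) == restrict(fam[j], cover[i] & cover[j])
               for i in range(n) for j in range(n)):
            out.append(fam)
    return out
```

For the cover `[frozenset({1, 2}), frozenset({2, 3})]` of `{1, 2, 3}`, sending a section `s` to the tuple of its restrictions gives a bijection between `sections({1, 2, 3})` and `matching_families(cover)`, which is the sheaf condition for this cover.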
This isn't a preservation of limits because the covering family for $A$ is not necessarily effectively-epic, so $A$ might not be a colimit of the kernel of the covering family at all. There may not even exist a colimit of the kernel of the covering family, or we may not even be able to talk about the kernel at all in $\mathcal C$. Therefore we cannot define sheaves by this limit preservation property, since it would become vacuous if none of the colimits existed in $\mathcal C$. Going the other way, there may be a sink with target $A$ which is effectively-epic, but that may not have been the covering family we chose for $A$. This would be like us saying that $\{U_i\}_{i\in I}$ is a cover for $U$ when $\bigcup_{i\in I}U_i\subsetneq U$.
When the coverage is subcanonical, it will be the case that the covering families will be effectively-epic in $\mathcal C$ if the colimit of the kernel of the covering family exists at all. Thus, sheaves will take those colimits to limits. In the case where all the covering families are effectively-epic in $\mathcal C$, then the coverage will be subcanonical. The case of sheaves on a topological space uses a subcanonical coverage, so you will have this preservation of limits (viewing the colimits in $\mathcal C$ as limits in $\mathcal C^{op}$).
Let $S$ be a sieve on $U$ in $\newcommand\calO{\mathcal{O}}\calO(X)$.
We want to show
$S$ is principal if and only if $S$ is a sheaf on $\calO(X)$.
Principal implies sheaf
First, suppose $S$ is principal, i.e., generated by $V_0\subseteq U$ for some $V_0$.
Let $W_i$, $i\in I$ be a cover of $W$.
We need to show that
$$ SW \to \prod_i SW_i \rightrightarrows \prod_{i,j} S(W_i\cap W_j) $$
is an equalizer diagram. Now for any open set $V$, $SV$ is either empty (if $V\not\subseteq V_0$) or the one-element set containing the inclusion $V\subseteq U$ (if $V\subseteq V_0$).
Then if $SW_i$ is empty for some $i$, both products are empty (the second has the factor $S(W_i\cap W_i)=SW_i$), and $SW$ is empty as well,
since there is $x\in W_i\setminus V_0\subseteq W\setminus V_0$, and the diagram becomes
$$\varnothing\to\varnothing \rightrightarrows \varnothing,$$
which is immediately an equalizer.
On the other hand, if $SW_i$ is nonempty for all $i$, then $W_i\subseteq V_0$ for all $i$, and thus, since $W=\bigcup_i W_i$, $W\subseteq V_0$. Thus the diagram becomes
$$\{*\}\to \{*\} \rightrightarrows \{*\},$$
which is again immediately an equalizer.
Thus principal sieves are sheaves.
Sheaf implies principal
Now suppose $S$ is a sheaf on $\calO(X)$.
Consider the collection $$\mathcal{W} = \{W : S(W) \ne\varnothing \}$$
Clearly $\mathcal{W}$ covers $V:=\bigcup \mathcal{W}$.
Then since
$$
SV \to \prod_{W\in\mathcal{W}} SW \rightrightarrows \prod_{W,W'\in\mathcal{W}} S(W\cap W')
$$
is an equalizer, and since the sets $S(W)$ and $S(W\cap W')$ are all nonempty (the latter because $S$ is a sieve and $W\cap W'\subseteq W$), and thus one-element sets,
we have that
$$
SV\to \{*\} \rightrightarrows \{*\}
$$
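is an equalizer, so $SV$ is a one-element set containing $V\subseteq U$.
Then $SW\ne\varnothing$ if and only if $W\subseteq V$: if $SW\ne\varnothing$ then $W\in\mathcal{W}$, so $W\subseteq V$ by construction, and conversely if $W\subseteq V$ then $SW\ne\varnothing$ because $SV\ne\varnothing$ and $S$ is a sieve. So $S$ is the principal sieve generated by $V$. $\blacksquare$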
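The equivalence can also be checked by brute force on a small example. The sketch below assumes $X=\{0,1,2\}$ with the discrete topology and $U=X$ (both choices are mine, not part of the statement), enumerates all sieves on $U$, and compares the sheaf condition with principality. The function names are hypothetical.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

X = frozenset({0, 1, 2})
OPENS = [frozenset(s) for s in powerset(X)]  # discrete topology (an assumption)

def is_sieve(S):
    """Downward closed: V in S and V2 an open subset of V implies V2 in S."""
    return all(V2 in S for V in S for V2 in OPENS if V2 <= V)

def is_sheaf(S):
    """S(W) is a point if W is in S and empty otherwise. For a sieve S and a
    cover {W_i} of W, the equalizer is a point iff every W_i is in S, so the
    sheaf condition reads: W in S iff all W_i in S. The empty cover of the
    empty set is included, which forces the empty set to lie in S."""
    for cover in powerset(OPENS):
        W = frozenset().union(*cover)
        if (W in S) != all(Wi in S for Wi in cover):
            return False
    return True

def is_principal(S):
    """S consists of exactly the opens contained in some fixed open V0."""
    return any(S == {V for V in OPENS if V <= V0} for V0 in OPENS)

SIEVES = [set(S) for S in powerset(OPENS) if is_sieve(set(S))]
AGREE = all(is_sheaf(S) == is_principal(S) for S in SIEVES)
```

In this example `AGREE` evaluates to true. Note that the empty sieve is neither principal nor a sheaf, since the empty cover of $\varnothing$ forces $S(\varnothing)$ to be a one-element set.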
Here are some hints. First, for a preliminary observation: given that we already have functions $P(U) \to P(V)$ for each $V \in S$, and this forms a cone from $P(U)$ to the diagram $P(S)$, I would expect that the statement $$PU = \varprojlim_{V \in S} PV$$ implicitly means that this cone is a limit.
($\Rightarrow$) Given any other cone $f : X \to P(S)$, for each $x \in X$, consider that $S$ is a cover of $U$, and we also have $(f_V(x)) \in \prod_{V\in S} P(V)$.
($\Leftarrow$) Given a covering set $\{ V_i \mid i \in I \}$, the set of $V$ such that $V \subseteq V_i$ for some $i\in I$ will form a sieve; and in particular, for each $i, j \in I$ we have morphisms $V_i \cap V_j \to V_i$, $V_i \cap V_j \to V_j$ which are both in that sieve.
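For the ($\Leftarrow$) hint, the sieve generated by a covering set is easy to compute in a finite example. The sketch below again assumes the discrete topology on $\{0,1,2\}$, and `generated_sieve` is a hypothetical helper name.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# Opens of the discrete topology on {0, 1, 2} (an assumption for the example).
OPENS = [frozenset(s) for s in powerset({0, 1, 2})]

def generated_sieve(cover, opens):
    """All opens V with V contained in V_i for some member V_i of the covering set."""
    return {V for V in opens if any(V <= Vi for Vi in cover)}

cover = [frozenset({0, 1}), frozenset({1, 2})]
S = generated_sieve(cover, OPENS)

# Every pairwise intersection V_i & V_j lands in the sieve, as the hint says.
intersections_in_sieve = all((Vi & Vj) in S for Vi in cover for Vj in cover)
```

The sieve `S` contains each `V_i`, each intersection `V_i & V_j`, and everything below them, but not the union `{0, 1, 2}` itself.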