This is the difference between ordered pairs and $2$-tuples, really.
Thought of abstractly, ordered pairs satisfy the property that $(a,b)=(c,d)$ if and only if $a=c$ and $b=d$. But in the universe of set theory everything is a set, so we have to implement this notion using sets.
This means that we need to choose one implementation, and then call it "ordered pairs". The common choice is the Kuratowski definition, although there are certainly others.
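To make the Kuratowski choice concrete, here is a small sketch in Python. It assumes that `frozenset` is an acceptable stand-in for a set-theoretic set, and the helper name `kuratowski_pair` is made up for illustration. It checks the characteristic property of ordered pairs by brute force over a few values, which illustrates the property but is of course not a proof.

```python
def kuratowski_pair(a, b):
    """Encode the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Characteristic property: (a, b) = (c, d) if and only if a = c and b = d.
values = range(4)
for a in values:
    for b in values:
        for c in values:
            for d in values:
                same_set = kuratowski_pair(a, b) == kuratowski_pair(c, d)
                same_coords = (a == c and b == d)
                assert same_set == same_coords
```

Note that when $a=b$ the encoding collapses to $\{\{a\}\}$, and the property still holds.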
After having the notion of an ordered pair, we can define a function to be a particular set of ordered pairs, and then we can define tuples.
Given a set $I$, an $I$-tuple is a function whose domain is $I$. By the axioms of replacement, power set, and separation, we know that any function whose domain is $I$ is also a set itself. So $I$-tuples are in fact sets.
Now a $2$-tuple would be a tuple where $I$ is a set of two elements. Again, we might resort to a canonical choice of representatives, since if $I$ and $J$ have the same cardinality there is an easy way to transform an $I$-tuple into a $J$-tuple, and vice versa (by precomposing with a bijection between the two sets).
So $Z$ is the set of $\{a,b\}$-tuples $z$ such that $z_a\in X$ and $z_b\in Y$. It is not the same set as $X\times Y$, since one of them is a set of ordered pairs and the other is a set of functions whose domain is $\{a,b\}$. But there is a simple bijection between $Z$ and $X\times Y$, as suggested.
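Spelled out, and assuming $a\neq b$ so that the index set really has two elements, one such bijection (the name $\varphi$ is introduced here only for illustration) is
$$ \varphi\colon Z\to X\times Y,\qquad \varphi(z)=(z_a,z_b), $$
with inverse
$$ \varphi^{-1}(x,y)=\{(a,x),(b,y)\}, $$
that is, the function on $\{a,b\}$ sending $a$ to $x$ and $b$ to $y$.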
Why would we want to do that, you may ask. The answer is easy: tuples generalize much more easily. You can define ordered triplets as $(x,y,z)=(x,(y,z))$ or $((x,y),z)$, and you can continue by induction to define ordered tuples of longer and longer finite lengths. But this induction does not extend nicely to infinite tuples, which are useful to us (what is a sequence if not an infinite tuple?). Working instead with the notion of $I$-tuples, we can easily generalize to the infinite case: just pick an infinite $I$!
So if we have infinitely many $X_i$'s, indexed by a set $I$, we can define their "Cartesian product", whose elements are the $I$-tuples $x$ with $x(i)\in X_i$ for every $i\in I$. This is something that repeated ordered pairs cannot quite handle.
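Written as a formula, this is the usual definition of the product of an indexed family:
$$ \prod_{i\in I}X_i=\Bigl\{\,x\colon I\to\bigcup_{i\in I}X_i \;\Bigm|\; x(i)\in X_i\text{ for all }i\in I\,\Bigr\}. $$
For example, taking $I=\mathbb N$ and every $X_i=\mathbb R$ gives exactly the set of real sequences.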
Best Answer
A general element of $X_1\times X_2$ is an ordered pair $(x_1,x_2)$, whereas a general element of $\prod_{i=1}^2 X_i$ is a function $f$ from $\{1,2\}$ to $X_1\cup X_2$ where $f(1)\in X_1$ and $f(2)\in X_2$. That is, the former contains $$ (x_1,x_2) $$ while the latter contains $$ \{(1,x_1),(2,x_2)\} $$ Therefore, the elements of $X_1\times X_2$ and $\prod_{i=1}^2 X_i$ are "cosmetically" different, but there is a bijective correspondence between these two sets, under which each of the two objects above is mapped to the other. This is the reason that the point being made should be "promptly forgotten": the two constructions capture the same concept in different ways, so they can be used interchangeably in any application.
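The correspondence can be checked on a small example. The following is only a sketch under some assumptions: Python tuples stand in for ordered pairs, a function on $\{1,2\}$ is stored as its graph (a `frozenset` of index-value pairs), and the sets `X1`, `X2` and the helper names `to_function` and `to_pair` are made up for illustration.

```python
from itertools import product

X1 = {"p", "q"}
X2 = {0, 1, 2}

# X1 x X2: a set of ordered pairs (Python tuples play that role here).
pairs = set(product(X1, X2))

# The product over i in {1, 2}: functions f on {1, 2} with f(i) in X_i,
# each stored as its graph {(1, x1), (2, x2)}.
functions = {frozenset({(1, x1), (2, x2)}) for x1 in X1 for x2 in X2}

def to_function(pair):
    """Send the ordered pair (x1, x2) to the function 1 -> x1, 2 -> x2."""
    x1, x2 = pair
    return frozenset({(1, x1), (2, x2)})

def to_pair(f):
    """Send a function f on {1, 2} back to the ordered pair (f(1), f(2))."""
    g = dict(f)
    return (g[1], g[2])

# The two maps are mutually inverse, so they form a bijection.
assert {to_function(p) for p in pairs} == functions
assert all(to_pair(to_function(p)) == p for p in pairs)
assert all(to_function(to_pair(f)) == f for f in functions)

# Yet the two sets share no elements: the difference is only "cosmetic".
assert pairs.isdisjoint(functions)
```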
It should be noted, then promptly forgotten, that the ordered pair notation $(a,b)$ is actually shorthand for a particular set, usually chosen to be $\{\{a\},\{a,b\}\}$.