$\require{AMScd}$
I think you are confusing two distinct notions (see also the comments).
The first notion is the arrow category, defined as follows. Let $\mathscr C$ be a category. The category $\operatorname{Arr}(\mathscr C)$ (also denoted $\mathscr C^{\mathbf 2}$ or $\mathscr C^\rightarrow$) is the category whose
- objects are the arrows $f$ of $\mathscr C$,
- morphisms $(f\colon a \to b) \to (g \colon c \to d)$ are the commutative squares
$$ \begin{CD}
a @>f>> b \\
@VVV @VVV \\
c @>>g> d ,
\end{CD}$$
- composition is the concatenation of such squares.
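
Spelled out, a morphism from $f$ to $g$ is a pair of arrows, and concatenating squares composes these pairs componentwise. The names $u$, $v$, $u'$, $v'$ and the third arrow $h\colon e\to k$ below are only illustrative labels for the unlabelled vertical arrows:
$$ (u,v)\colon f\to g \quad\text{means}\quad u\colon a\to c,\quad v\colon b\to d,\quad v\circ f = g\circ u, $$
$$ (u',v')\circ(u,v) = (u'\circ u,\ v'\circ v), \qquad (v'\circ v)\circ f = v'\circ g\circ u = h\circ(u'\circ u). $$
The last chain of equalities, which uses $v'\circ g = h\circ u'$, checks that the outer rectangle of two pasted commutative squares commutes again.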
You can of course apply that definition with $\mathscr C = \mathsf{Cat}$.
The second notion is the enrichment of $\mathsf {Cat}$ over itself. That is, the category $\mathsf{Cat}$ has the property that, for any two objects $A$ and $B$, the hom-set $\hom_{\mathsf{Cat}}(A,B)$ actually carries a category structure in such a way that the composition
$$ \hom_{\mathsf{Cat}}(B,C) \times \hom_{\mathsf{Cat}}(A,B) \to \hom_{\mathsf{Cat}}(A,C) $$
is a functor. The short way to say it is: $\mathsf{Cat}$ is enriched over the (cartesian closed) monoidal category $(\mathsf{Cat},\times,\mathbf 1)$ (where $\mathbf 1$ is the final category).
The two notions are quite distinct and should not be confused!
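Concretely, assuming the usual choice of natural transformations as the morphisms of $\hom_{\mathsf{Cat}}(A,B)$, the composition functor acts on morphisms by horizontal composition. The names below are illustrative: for $\alpha\colon F\Rightarrow F'$ in $\hom_{\mathsf{Cat}}(A,B)$ and $\beta\colon G\Rightarrow G'$ in $\hom_{\mathsf{Cat}}(B,C)$,
$$ (\beta * \alpha)_a = \beta_{F'a}\circ G(\alpha_a) = G'(\alpha_a)\circ\beta_{Fa} \colon GFa \to G'F'a \qquad (a\in A). $$
The two expressions agree by naturality of $\beta$ at the morphism $\alpha_a$, and functoriality of the composition is the interchange law
$$ (\beta'\circ\beta) * (\alpha'\circ\alpha) = (\beta' * \alpha')\circ(\beta*\alpha). $$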
It seems like your question is more about what the 2-morphisms in $\newcommand\Cat{\mathbf{Cat}}[\Cat,\Cat]\newcommand\C{\mathcal{C}}\newcommand\D{\mathcal{D}}$ are, rather than what the data of $Y(\eta)$ is specifically.
Let's do this a little more generally. Let $\C$, $\D$ be (strict) 2-categories. Then $[\C,\D]$ should also be a (strict) 2-category, and we want to understand the 0, 1, and 2-cells.
0-cells:
The objects are strict 2-functors, i.e., functors $F:\C\to \D$ which act on objects, morphisms, and 2-morphisms subject to compatibility criteria. More concretely, once we've decided where $F$ sends objects, then the maps on hom categories
$$F_{X,Y} : \C(X,Y)\to \D(FX,FY)$$
should all be functors, and moreover,
$$
\require{AMScd}
\begin{CD}
\C(Y,Z)\times \C(X,Y) @>\circ_{\C,X,Y,Z}>>\C(X,Z)\\
@VF_{Y,Z}\times F_{X,Y}VV @VVF_{X,Z}V\\
\D(FY,FZ)\times \D(FX,FY) @>\circ_{\D,FX,FY,FZ}>>\D(FX,FZ)\\
\end{CD}
$$
should strictly commute.
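Written as equations rather than as a diagram, the square says that $F$ preserves composition of 1-cells and horizontal composition of 2-cells. As a sketch, with $f,f'\colon X\to Y$, $g,g'\colon Y\to Z$, $\alpha\colon f\Rightarrow f'$ and $\beta\colon g\Rightarrow g'$ as illustrative names, and $*$ denoting horizontal composition:
$$ F(g\circ f) = F(g)\circ F(f), \qquad F(\beta * \alpha) = F(\beta) * F(\alpha). $$
Functoriality of each $F_{X,Y}$ adds preservation of vertical composition and of identity 2-cells. A strict 2-functor is usually also required to preserve identity 1-cells, $F(1_X) = 1_{FX}$, which the square alone does not express.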
1-cells:
The morphisms are (strictly) natural families of 1-cells. I.e., given $F,G:\C\to \D$,
a 1-cell from $F$ to $G$ is a family $T_X : FX\to GX$ of 1-cells in $\D$, subject to the requirement that the usual diagram commute strictly
for each 1-cell $f:X\to Y$ in $\C$:
$$
\begin{CD}
FX @>Ff>> FY\\
@VT_X VV @VVT_Y V \\
GX @>Gf>> GY. \\
\end{CD}
$$
2-cells:
Let $F,G :\C \to \D$ be 2-functors, $T,S : F\to G$ be 1-cells between them.
A 2-cell $\alpha : T \to S$ is a natural family of 2-cells. More concretely, it is a choice, for every object $X\in \C$, of a 2-cell $\alpha_X : T_X\to S_X$ in $\D$, natural in the sense that for every 1-cell $f:X\to Y$ of $\C$, the following two 2-cells from $G(f)\circ T_X = T_Y\circ F(f)$ to $G(f)\circ S_X = S_Y\circ F(f)$ are equal.
The two cells are the whiskered composites $G(f).\alpha_X$ and $\alpha_Y.F(f)$.
Applying this to $\C=\D=\Cat$
Given a 2-cell $\eta : F\to G$ in $\Cat$, we need to produce for each category $C$ a
2-cell $Y(\eta)_C : Y(F)_C\to Y(G)_C$.
If $X$ and $Y$ are the categories such that $F,G:X\to Y$, then
$Y(F)_C: [Y,C]\to [X,C]$ is the functor $-\circ F$, and similarly for $G$.
Then $Y(\eta)_C$ should be the whiskered composite $-.\eta$.
In other words, for any functor $K:Y\to C$ and any object $x\in X$, the component of $\eta$ at $x$ is a morphism $\eta_x : Fx\to Gx$ in $Y$, so $(K.\eta)_x = K(\eta_x) : KFx\to KGx$, and these are the components of a natural transformation $K.\eta : KF\to KG$.
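As a check that the whiskered composites assemble into a natural transformation $Y(\eta)_C$ between the functors $-\circ F$ and $-\circ G$, take a morphism of $[Y,C]$, i.e. a natural transformation $\sigma\colon K\Rightarrow K'$ (the names $\sigma$ and $K'$ are illustrative). Writing $x$ for an object of $X$, naturality of $Y(\eta)_C$ at $\sigma$ amounts to
$$ \sigma_{Gx}\circ K(\eta_x) = K'(\eta_x)\circ \sigma_{Fx} \colon KFx\to K'Gx, $$
which is exactly naturality of $\sigma$ at the morphism $\eta_x\colon Fx\to Gx$ of $Y$.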
I want to point out something potentially misleading about Marek's answer. The n-categories he mentions are not categories, but generalizations of them, so the question still remains, why do categories only form a 2-category, that is, why do people stop after categories, functors, and natural transformations? Why don't people define modifications of natural transformations?
I think it is good to realize that categories really are, in an essential way, only 2-categorical: if you want interesting higher morphisms, you do need to define something like a higher category. One way to think about it is this: natural transformations are basically homotopies. To make this precise, take I to be the category with two objects, 0 and 1, one morphism from 0 to 1, and the identity morphisms. Then it is easy to check that specifying a natural transformation between two functors F and G (both functors C → D) is the same as specifying a functor H : C × I → D which agrees with F on C × {0} and with G on C × {1}.
So then we could get higher morphisms by saying they are homotopies of homotopies, i.e., functors C × I × I → D with appropriate restrictions. This works, and we indeed get some definition of modification, but it is not interesting, as it reduces to just a commuting square of natural transformations, i.e., it can be described simply in terms of the structure we already had.
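A sketch of this correspondence, writing $\alpha$ for the natural transformation and $\iota\colon 0\to 1$ for the non-identity morphism of I (both names are illustrative). Given $H$, set $\alpha_c = H(1_c,\iota)$. Conversely, given $\alpha$, define $H$ on a morphism $u\colon c\to c'$ of C by
$$ H(u,1_0) = F(u), \qquad H(u,1_1) = G(u), \qquad H(u,\iota) = \alpha_{c'}\circ F(u) = G(u)\circ\alpha_c. $$
The last equality is the naturality square. It is forced by functoriality of $H$, since $(u,\iota) = (1_{c'},\iota)\circ(u,1_0) = (u,1_1)\circ(1_c,\iota)$ in C × I.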
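To see the reduction explicitly, here is a sketch with illustrative names. A functor $H$ on C × I × I restricts to four functors $F_{ij}$ on the corners $(i,j)$ of I × I and to four natural transformations along the edges:
$$ \alpha\colon F_{00}\Rightarrow F_{10},\quad \beta\colon F_{00}\Rightarrow F_{01},\quad \gamma\colon F_{01}\Rightarrow F_{11},\quad \delta\colon F_{10}\Rightarrow F_{11}, \qquad \delta\circ\alpha = \gamma\circ\beta. $$
The equation holds because the two paths from $(0,0)$ to $(1,1)$ in I × I are the same morphism, so $H$ carries no data beyond this commuting square.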
This is similar to what happens for, say, groups: you can think of a group as a category with a single object where all the morphisms are invertible (the morphisms are the group elements and the composition law is the group product). Then group homomorphisms are simply functors. This makes it sound as if groups now magically have a higher sort of morphism: natural transformations between functors! And indeed they do, and they are even useful in certain contexts, but they're not terribly interesting: a natural transformation between two group homomorphisms f and g is simply an element y of the target group such that f(x) = y g(x) y⁻¹ for all x. Again, this is described in terms of things we already knew about (the group element y and conjugation), and is not really a brand new concept.
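The conjugation formula comes straight from the naturality square. As a sketch, assume the homomorphisms are $f, g\colon G\to H$ (the names $G$ and $H$ for the groups are illustrative) and that the natural transformation goes from $g$ to $f$. It has a single component $y\in H$, and naturality at $x\in G$ reads
$$ f(x)\,y = y\,g(x) \quad\text{for all } x\in G, \qquad\text{i.e.}\qquad f(x) = y\,g(x)\,y^{-1}. $$
A transformation in the other direction gives the same formula with $y$ replaced by $y^{-1}$.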