I will try to give a more geometric explanation. First note that the product $||T||\,||T^{-1}||$ does not change when $T$ is multiplied by a nonzero scalar, so you can always scale $T$ so that it has norm one. Hence, the Banach-Mazur distance can be rewritten as:
$$
d(X,Y)=\inf\{||T^{-1}||: T\in GL(X,Y), ||T||=1\}
$$
Geometrically, $||T||=1$ means that $T(B_X)\subseteq B_Y$ and no enlargement of $T(B_X)$ will still fit inside $B_Y$ ($B_X$ and $B_Y$ represent the unit balls of the two spaces). On the other hand, we have that
$$
T^{-1}(B_Y)\subseteq||T^{-1}||B_X
$$
or equivalently
$$
B_Y\subseteq||T^{-1}||T(B_X)
$$
Therefore:
$$
T(B_X)\subseteq B_Y\subseteq||T^{-1}||T(B_X)
$$
Thus, geometrically, $||T^{-1}||$ represents the smallest factor by which you must enlarge $T(B_X)$ so that it contains $B_Y$. The Banach-Mazur distance is the infimum of such enlargements, taken over all linear isomorphisms that send $B_X$ inside $B_Y$ with $||T||=1$.
For perhaps a better intuitive understanding, take $X$ to be a Euclidean space, so that $B_X$ is the Euclidean unit ball. Then for any isomorphism $T$, $T(B_X)$ is an ellipsoid. For the Banach-Mazur distance, you are looking for the "best fit" ellipsoid. That is, you are looking for the ellipsoid that fits inside $B_Y$ (touching the boundary), such that the enlargement required for this ellipsoid to contain $B_Y$ is as small as possible.
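To make the enlargement factor concrete, here is a minimal numerical sketch. It assumes a specific pair of spaces that the answer does not mention, $X=\ell_1^2$ and $Y=\ell_2^2$, and the helper names (`norm_l1_to_l2`, `norm_l2_to_l1`, `distortion`) are purely illustrative. It computes $||T||\,||T^{-1}||$ for a given $2\times 2$ matrix $T$, which equals $||T^{-1}||$ after rescaling $T$ to norm one.

```python
import math
from itertools import product


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse_2x2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[s / det, -q / det], [-r / det, p / det]]


def norm_l1_to_l2(a):
    # The l1 unit ball is the convex hull of the vectors +/- e_i, so the
    # supremum of ||A x||_2 over it is the largest Euclidean column norm.
    return max(math.hypot(*col) for col in transpose(a))


def norm_l2_to_l1(a):
    # ||A x||_1 is the maximum of <s, A x> over sign vectors s, so the
    # supremum over the Euclidean unit ball is max_s ||A^T s||_2.
    at = transpose(a)
    n_rows = len(a)
    best = 0.0
    for signs in product((-1.0, 1.0), repeat=n_rows):
        v = [sum(row[j] * signs[j] for j in range(n_rows)) for row in at]
        best = max(best, math.hypot(*v))
    return best


def distortion(t):
    # ||T|| * ||T^{-1}|| for T : l1^2 -> l2^2 (assumed example spaces).
    return norm_l1_to_l2(t) * norm_l2_to_l1(inverse_2x2(t))


print(distortion([[1, 0], [0, 1]]))  # identity: about 1.414 (sqrt(2))
print(distortion([[1, 0], [0, 3]]))  # a worse fit: about 3.162 (sqrt(10))
```

Each value is only an upper bound for $d(\ell_1^2,\ell_2^2)$, since the distance is an infimum over all isomorphisms; the sketch evaluates candidates, it does not search for the best one.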
Yes, there is a similar notion for infinite dimensional Banach spaces, with the convention that when $X$ and $Y$ are not isomorphic, $d(X,Y)=\infty$.
To begin with, the theorem should be read as follows:
Theorem. For every normed space $X$ there is an isometric embedding into $C(K)$ for some compact space $K$.
Proof. Consider $K:=\operatorname{Ball}_{X^*}(0,1)$, the closed unit ball of the dual space $X^*$, with the weak-$^*$ topology. By the Banach-Alaoglu theorem it is compact. By definition of the weak-$^*$ topology the map
$$
J(x): K\to\mathbb{C}:f\mapsto f(x)
$$
is continuous for each $x\in X$. So we have a well-defined map
$$
J:X\to C(K):x\mapsto J(x)
$$
Note that
$$
\Vert J(x)\Vert=\sup\{|J(x)(f)|:f\in K\}=\sup\{|f(x)|:f\in \operatorname{Ball}_{X^*}(0,1)\}=\Vert x\Vert
$$
for each $x\in X$. In the last step we use a corollary of the Hahn-Banach theorem. Since $J$ is clearly linear, it is an isometric embedding.
Note that the corollary of the Hahn-Banach theorem does not require completeness, so its usage was valid in the proof above.
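The key identity, that the supremum of $|f(x)|$ over the dual unit ball equals $\Vert x\Vert$, can be checked by hand in a small case. The sketch below assumes $X=(\mathbb{R}^n,\Vert\cdot\Vert_1)$, whose dual unit ball is the cube $[-1,1]^n$; this choice of space and the function names are illustrative and not part of the answer.

```python
from itertools import product


def l1_norm(x):
    return sum(abs(c) for c in x)


def sup_over_dual_ball(x):
    # The dual of (R^n, l1) is (R^n, l_inf), whose unit ball is the cube
    # [-1, 1]^n. The map f -> |f(x)| is convex, so its supremum over the
    # cube is attained at a vertex, i.e. at a sign vector.
    return max(
        abs(sum(s * c for s, c in zip(signs, x)))
        for signs in product((-1, 1), repeat=len(x))
    )


x = (3, -4, 2)
print(l1_norm(x), sup_over_dual_ball(x))  # both are 9
```

In this finite-dimensional case the norming functional is found by inspection (take the signs of the coordinates of $x$); in a general normed space its existence is what the Hahn-Banach corollary supplies.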
The midpoint-affine property $$f\left(\dfrac{x+y}{2}\right) = \dfrac{f(x)+f(y)}{2} \tag{1}$$ implies being affine under the assumption that $f$ is continuous (which it is, being an isometry). As stated, $(1)$ amounts to the case $t=1/2$ of $$f[(1-t)x+ty] = (1-t)f(x) + t f(y)\tag{2}$$ But applying $(1)$ again, the second time to $x$ and $(x+y)/2$, yields $(2)$ for $t=1/4$. Similarly, applying $(1)$ to $y$ and $(x+y)/2$ yields $(2)$ for $t=3/4$.
Continuing this process, we obtain $(2)$ for all dyadic rationals in $(0,1)$: numbers of the form $k/2^m$, $0<k<2^m$. These are dense in $[0,1]$ and since both sides of $(2)$ are continuous with respect to $t$, equality $(2)$ holds for all $t\in [0,1]$.
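The induction over dyadic rationals can be sketched in code. The function below uses only the midpoint rule $(1)$ to determine what $f$ must be at $(1-t)x+ty$ for $t=k/2^m$, starting from $f(x)$ and $f(y)$, and the result can be compared with the affine formula $(2)$. The names `forced_value` and `affine_value` are hypothetical, and values are represented as tuples of exact fractions.

```python
from fractions import Fraction


def midpoint(u, v):
    half = Fraction(1, 2)
    return tuple(half * (a + b) for a, b in zip(u, v))


def forced_value(fx, fy, k, m):
    """Value that rule (1) forces on f at t = k / 2**m, given f(x) and f(y)."""
    if m == 0:
        return fx if k == 0 else fy  # t = 0 or t = 1
    if k % 2 == 0:
        return forced_value(fx, fy, k // 2, m - 1)  # fraction reduces
    # For odd k, the point at t = k / 2**m is the midpoint of the points at
    # (k - 1) / 2**m and (k + 1) / 2**m, both handled at level m - 1.
    lo = forced_value(fx, fy, (k - 1) // 2, m - 1)
    hi = forced_value(fx, fy, (k + 1) // 2, m - 1)
    return midpoint(lo, hi)


def affine_value(fx, fy, k, m):
    t = Fraction(k, 2 ** m)
    return tuple((1 - t) * a + t * b for a, b in zip(fx, fy))


fx, fy = (0, 1), (8, -3)
print(forced_value(fx, fy, 3, 3) == affine_value(fx, fy, 3, 3))  # True
```

This covers only the dyadic values of $t$; the passage to all $t\in[0,1]$ still relies on continuity, exactly as in the argument above.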