These days I've been looking for a rigorous proof of the multivariable chain rule, and I've finally found one that I think is very easy to understand. I'll leave it here (if nobody minds) for anyone searching for this who isn't familiar with little-o notation, Jacobians and the like. To understand this proof, all you need to know is the mean value theorem.
Let's say we have a function $f(x,y)$ with $x = x(t)$ and $y = y(t)$, and let $z(t) = f(x(t), y(t))$. By definition, the derivative $z'(t)$ is
$$ z'(t) = \lim_{\Delta t \to 0}{\frac {f(x(t+\Delta t),y(t+\Delta t)) - f(x,y)}{\Delta t}}.$$
Let $$\Delta x = x(t+\Delta t)-x(t), \qquad \Delta y = y(t+\Delta t)-y(t).$$
Now I'll take the numerator of the fraction in the limit, and make a small change.
$$ f(x(t+\Delta t), y(t+\Delta t)) - f(x,y) = f(x+\Delta x, y+\Delta y) - f(x,y)$$
$$ = \left[f(x+\Delta x, y+\Delta y) - f(x+\Delta x, y)\right] + \left[f(x+\Delta x, y) - f(x, y)\right]$$
I have just added and subtracted $f(x+\Delta x, y)$. For convenience later, I will swap the two brackets:
$$ = \left[f(x+\Delta x, y) - f(x, y)\right] + \left[f(x+\Delta x, y+\Delta y) - f(x+\Delta x, y)\right].$$
Now let's define two functions, which I will name $g$ and $h$. First, let
$$ g(x) = f(x, y) \implies g'(x) = \frac {\partial f} {\partial x}. $$
Please note that $y$ is constant here, since $g$ is a function of a single variable. Now, by the mean value theorem, we have
$$ \exists\, c_1 \in (x, x+\Delta x) \text{ such that} $$
$$\frac {g(x+\Delta x) - g(x)} {\Delta x} = g'(c_1) $$
$$ \Longleftrightarrow $$
$$ f(x+\Delta x, y) - f(x, y) = f_x(c_1, y)\Delta x$$
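As a small sanity check of this MVT step (not part of the proof), here is a numerical illustration with a function of my own choosing: for $f(x,y) = x^2 y$ the mean value point $c_1$ can be solved for explicitly, and it does land strictly inside $(x, x+\Delta x)$.

```python
# Illustration of the MVT step: for g(x) = f(x, y) with y frozen,
# (g(x+Δx) - g(x))/Δx = g'(c1) for some c1 in (x, x+Δx).
# With f(x, y) = x²·y (my example), g'(x) = 2xy, so g'(c1) = quotient
# gives c1 = quotient / (2y), which here equals x + Δx/2.

x, y, dx = 1.0, 3.0, 0.5

g  = lambda s: s * s * y      # g(x) = f(x, y), y held constant
gp = lambda s: 2 * s * y      # g'(x) = ∂f/∂x

quotient = (g(x + dx) - g(x)) / dx
c1 = quotient / (2 * y)       # solve g'(c1) = quotient for this f
print(c1, x < c1 < x + dx)    # 1.25 True
```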
Similarly, using the function
$$ h(y) = f(x + \Delta x, y) \implies h'(y) = \frac {\partial} {\partial y}f(x+\Delta x, y)$$
We will have by the same logic that
$$ f(x+\Delta x, y + \Delta y) - f(x+\Delta x, y) = f_y(x + \Delta x, c_2)\Delta y, \qquad c_2 \in (y, y+\Delta y). $$
Notice that $c_1$ is squeezed between $x$ and $x+\Delta x$, and $c_2$ between $y$ and $y+\Delta y$. So as $\Delta x \to 0$, $c_1 \to x$, and as $\Delta y \to 0$, $c_2 \to y$. By our definition of $\Delta x$ and $\Delta y$ (and the continuity of $x(t)$ and $y(t)$, which follows from their differentiability), as $\Delta t \to 0$ both $\Delta x \to 0$ and $\Delta y \to 0$. So, as $\Delta t \to 0$, $c_1 \to x$ and $c_2 \to y$.
The last step of the proof is to sum the two pieces back up, divide by $\Delta t$, and take the limit as $\Delta t \to 0$:
$$ f(x(t+\Delta t), y(t+\Delta t)) - f(x, y) = f_x(c_1, y)\Delta x + f_y(x+\Delta x, c_2)\Delta y $$
$$ \lim_{\Delta t \to 0} \frac {f(x(t+\Delta t), y(t+\Delta t)) - f(x,y)}{\Delta t} = \lim_{\Delta t \to 0} \left[ f_x(c_1, y)\frac {\Delta x}{\Delta t} + f_y(x+\Delta x, c_2)\frac {\Delta y}{\Delta t} \right] = f_x(x, y)x'(t) + f_y(x, y)y'(t) \quad QED $$
(The last equality uses the continuity of $f_x$ and $f_y$ at $(x, y)$, together with $c_1 \to x$ and $c_2 \to y$.)
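The final identity is easy to test numerically. The sketch below, with a concrete $f$, $x(t)$, $y(t)$ of my own choosing, compares a central-difference approximation of $z'(t)$ against the chain-rule formula.

```python
# Numerical sanity check of z'(t) = f_x·x'(t) + f_y·y'(t)
# for a concrete choice (mine): f(x, y) = x²·y, x(t) = cos t, y(t) = sin t.

import math

t, h = 0.7, 1e-6

x, y = math.cos(t), math.sin(t)
z = lambda s: math.cos(s)**2 * math.sin(s)   # z(t) = f(x(t), y(t))

# left side: central-difference approximation of z'(t)
lhs = (z(t + h) - z(t - h)) / (2 * h)

# right side: chain rule with f_x = 2xy, f_y = x², x' = -sin t, y' = cos t
rhs = 2*x*y * (-math.sin(t)) + x**2 * math.cos(t)

print(abs(lhs - rhs))  # ≈ 0 (on the order of h²)
```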
Edit: After a long time I've realised that this proof assumes that $f$ has partial derivatives defined on intervals around the point $(x, y)$ and that they are continuous at the point. This is a sufficient condition for the function to be ($\mathbb{R}^2$-)differentiable at $(x, y)$, but it is not equivalent to it; the multivariable chain rule holds when $f$ is merely differentiable at that point. So for a fully general proof, one should first understand little-o notation, as in the other answers.
Think of $f\colon \mathbb{C} \longrightarrow \mathbb{C}$ as a map $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}^2$. The Jacobian matrix $Jf(p)$ is just the matrix of $Df_p \colon T_p\mathbb{R}^2 \longrightarrow T_{f(p)}\mathbb{R}^2$ with respect to the frame $\frac{\partial}{\partial x}, \frac{\partial}{\partial y}$. Complexify the tangent bundle and change the frame to $\frac{\partial}{\partial z}, \frac{\partial}{\partial \bar z}$. In this new frame we will have
$$ { \large
[Df_p] = \begin{pmatrix}
\frac{\partial f}{\partial z}(p) & \frac{\partial f}{\partial \bar z}(p)\\
\frac{\partial \bar f}{\partial z}(p)& \frac{\partial\bar f}{\partial \bar z}(p)
\end{pmatrix}}
$$
Then, by the chain rule, $ [D(f\circ g)_p] =[Df_{g(p)}][Dg_p]$, which reads
$$
\begin{pmatrix}
\frac{\partial f\circ g}{\partial z}(p) & \frac{\partial f\circ g}{\partial \bar z}(p)\\
\frac{\partial \overline{f\circ g}}{\partial z}(p) & \frac{\partial \overline{f\circ g} }{\partial \overline z}(p)
\end{pmatrix}
= \begin{pmatrix}
\frac{\partial f}{\partial z}(g(p)) & \frac{\partial f}{\partial \bar z}(g(p))\\
\frac{\partial \bar f}{\partial z}(g(p)) & \frac{\partial\bar f}{\partial \bar z}(g(p))
\end{pmatrix}
\begin{pmatrix}
\frac{\partial g}{\partial z}(p) & \frac{\partial g}{\partial \bar z}(p)\\
\frac{\partial \bar g}{\partial z}(p) & \frac{\partial\bar g}{\partial \bar z}(p)
\end{pmatrix}
$$
And the result follows by comparing the entries of the matrices.
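For instance, the $(1,1)$ entry gives $\frac{\partial (f\circ g)}{\partial z} = \frac{\partial f}{\partial z}\frac{\partial g}{\partial z} + \frac{\partial f}{\partial \bar z}\frac{\partial \bar g}{\partial z}$, which can be checked numerically. The sketch below uses finite differences and two non-holomorphic test functions of my own choosing, together with the identity $\frac{\partial \bar g}{\partial z} = \overline{\partial g/\partial \bar z}$.

```python
# Finite-difference check of the Wirtinger chain rule
# d(f∘g)/dz = f_z(g(z))·g_z(z) + f_zbar(g(z))·conj(g_zbar(z)),
# i.e. the (1,1) entry of [D(f∘g)] = [Df][Dg].

h = 1e-6

def d_z(F, z):      # ∂F/∂z = ½(∂x − i·∂y)F
    dx = (F(z + h) - F(z - h)) / (2 * h)
    dy = (F(z + 1j*h) - F(z - 1j*h)) / (2 * h)
    return 0.5 * (dx - 1j * dy)

def d_zbar(F, z):   # ∂F/∂z̄ = ½(∂x + i·∂y)F
    dx = (F(z + h) - F(z - h)) / (2 * h)
    dy = (F(z + 1j*h) - F(z - 1j*h)) / (2 * h)
    return 0.5 * (dx + 1j * dy)

f = lambda w: w**2 + w.conjugate()     # not holomorphic (my test choice)
g = lambda z: z * z.conjugate() + z    # not holomorphic (my test choice)

z0 = 0.3 + 0.4j
lhs = d_z(lambda z: f(g(z)), z0)
rhs = d_z(f, g(z0)) * d_z(g, z0) + d_zbar(f, g(z0)) * d_zbar(g, z0).conjugate()
print(abs(lhs - rhs))  # ≈ 0
```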
Added: In the frame $\frac{\partial}{\partial x}, \frac{\partial}{\partial y}$ we have for $f(x,y) = (u(x,y) , v(x,y))$
$$
Jf = \begin{pmatrix}
\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\
\frac{\partial v}{\partial x} &\frac{\partial v}{\partial y}
\end{pmatrix}
$$
And, from the relations $\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)$ and $\frac{\partial}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)$, the change-of-basis matrix whose columns express the frame $\frac{\partial}{\partial z}, \frac{\partial}{\partial \bar z}$ in terms of $\frac{\partial}{\partial x}, \frac{\partial}{\partial y}$ is
$$
C = \frac{1}{2}\begin{pmatrix}
1 & 1 \\
-i & i
\end{pmatrix},
\qquad
C^{-1} = \begin{pmatrix}
1 & i \\
1 & -i
\end{pmatrix}
$$
Hence the matrix for $Df$ in the $\frac{\partial}{\partial z}, \frac{\partial}{\partial \bar z}$ frame is $C^{-1} \cdot Jf\cdot C$,
$$
\begin{pmatrix}
1 & i \\
1 & -i
\end{pmatrix}
\begin{pmatrix}
\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\
\frac{\partial v}{\partial x} &\frac{\partial v}{\partial y}
\end{pmatrix}
\frac{1}{2}\begin{pmatrix}
1 & 1 \\
-i & i
\end{pmatrix} =
\begin{pmatrix}
1 & i \\
1 & -i
\end{pmatrix}
\frac{1}{2}\begin{pmatrix}
\frac{\partial u}{\partial x} - i\frac{\partial u}{\partial y} & \frac{\partial u}{\partial x} + i\frac{\partial u}{\partial y}\\
\frac{\partial v}{\partial x} - i\frac{\partial v}{\partial y} & \frac{\partial v}{\partial x} + i\frac{\partial v}{\partial y}
\end{pmatrix} =
\frac{1}{2}\begin{pmatrix}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + i\left( \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \right) & \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} + i\left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right) \\
\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} - i\left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right) & \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} - i\left( \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \right)
\end{pmatrix}
$$
which agrees with the matrix stated above once we write $f = u+iv$ and $z=x+iy$: for instance, the $(1,1)$ entry is exactly $\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right)$.
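Here is a finite-difference cross-check of the frame formula, under the convention (my choice of bookkeeping) that the columns of the change-of-basis matrix $C$ express the new frame vectors $\frac{\partial}{\partial z}, \frac{\partial}{\partial \bar z}$ in the old frame: the first row of $C^{-1}\, Jf\, C$ should reproduce $\frac{\partial f}{\partial z}$ and $\frac{\partial f}{\partial \bar z}$ for a test function of my own choosing.

```python
# Check: the (1,1) and (1,2) entries of C⁻¹·Jf·C equal ∂f/∂z and ∂f/∂z̄.
# Test function (my choice): f(z) = z² + 3·z̄, not holomorphic,
# with ∂f/∂z = 2z and ∂f/∂z̄ = 3.

h = 1e-6

def f(z):
    return z**2 + 3 * z.conjugate()

def jacobian(z):
    # real Jacobian [[u_x, u_y], [v_x, v_y]] by central differences
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j*h) - f(z - 1j*h)) / (2 * h)
    return [[fx.real, fy.real], [fx.imag, fy.imag]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z0 = 0.5 + 0.2j
J = jacobian(z0)
C     = [[0.5, 0.5], [-0.5j, 0.5j]]   # columns: ∂/∂z, ∂/∂z̄ in (∂x, ∂y)
C_inv = [[1, 1j], [1, -1j]]
M = matmul(C_inv, matmul(J, C))

df_dz    = 2 * z0   # ∂f/∂z of z² + 3z̄
df_dzbar = 3        # ∂f/∂z̄
print(abs(M[0][0] - df_dz), abs(M[0][1] - df_dzbar))  # both ≈ 0
```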
Best Answer
The Wirtinger derivatives ${\partial\over\partial z}$, ${\partial\over\partial\bar z}$ have been around since 1926; some people even trace them back to Poincaré (1899). But to this day there is no commonly accepted paradigm for explaining what they should mean.
A proof of the chain rule in a multivariate environment would require some limiting argument, and I don't see one here. This means that the author takes the chain rule for maps $f:\>{\mathbb R}^n\to{\mathbb R}^m$ for granted. His fiddling around on the shown page is then only meant as an exercise in linear algebra.
In my opinion you can talk at the same time about ${\partial f\over\partial z}$ and ${\partial f\over\partial\bar z}$ when $f$ is a function defined for $z=x+iy$, but you cannot talk at the same time about ${\partial g\over\partial w}$ and ${\partial g\over\partial z}$ when $w=f(z,\bar z)$. To sum up: I cannot take the shown page seriously.