It seems to me that before much more progress can be made in the calculus of ${}^xy$, more fundamental questions have to be answered, such as: how do we define ${}^xy$ for rational $x$? It's clear how the OP's definition works if $x$ is a non-negative integer; but how do we define ${}^xy$ if, say, $x = 7/2$? What then is "one-half" of an occurrence of $y$ in the exponential "tower" which is supposed to be ${}^xy$?
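To make the obstacle concrete, here is a minimal sketch of the integer-height case. It assumes the usual convention that ${}^0y = 1$ and ${}^{n+1}y = y^{({}^ny)}$; the OP's definition is not reproduced here, so treat that convention and the name `tetration` as my assumptions.

```python
def tetration(base: float, height: int) -> float:
    """Right-associated power tower made of `height` copies of `base`.

    Assumed convention (not taken from the original question):
    height 0 gives 1, and each extra level computes base ** (previous tower).
    """
    if not isinstance(height, int) or height < 0:
        # The recursion below has no step for a fractional height such as 7/2.
        raise ValueError("height must be a non-negative integer")
    result = 1.0
    for _ in range(height):
        result = base ** result
    return result


print(tetration(2, 3))  # 2 ** (2 ** 2) = 16.0
```

The loop runs a whole number of times, which is exactly why a height of $7/2$ has no obvious meaning under this definition.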
I am reminded here of the way $x^y$ is extended from integers through the reals, by starting with a careful, consistent and believable definition of $(p / q)^{(r / s)}$ for integral $p, q, r, s$; once we have that, a simple, consistent and believable continuity argument allows us to accept a definition of $x^y$ for real $x, y > 0$. We know what $(p / q)^r = (p^r / q^r)$ means; we know what it means for a positive real $z$ to satisfy $z^s = (p / q)^r$, so we can get a handle on $(p / q)^{(r / s)}$ from which, by continuity, we can generalize to $x^y$. I think an analogous method is needed here, but I don't know what it is. But I think my question of the preceding paragraph might be worth considering early on in this game.
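The rational-exponent step described above can be sketched numerically. This is only an illustration under my own assumptions: the function name `rational_power` and the choice of bisection are mine, and the base is taken to be a positive rational.

```python
from fractions import Fraction


def rational_power(base: Fraction, exponent: Fraction, iterations: int = 200) -> float:
    """Approximate base ** (r/s) as the positive z solving z ** s == base ** r."""
    if base <= 0:
        raise ValueError("base must be positive")
    # Fraction stores the exponent in lowest terms with a positive denominator.
    r, s = exponent.numerator, exponent.denominator
    # (p/q) ** r is exact for an integer r, so only the root needs approximating.
    target = float(base ** r)
    # z -> z ** s is increasing on z > 0, so the root lies in [0, max(1, target)].
    lo, hi = 0.0, max(1.0, target)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** s < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# (9/4) ** (3/2) should come out close to (3/2) ** 3 = 27/8.
print(rational_power(Fraction(9, 4), Fraction(3, 2)))
```

Nothing comparable is available for ${}^xy$ yet, because there is no known equation playing the role of $z^s = (p/q)^r$.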
Of course, perhaps there is a (reasonably) simple, consistent and believable argument to construct ${}^xy$ using $\exp()$, $\log()$, etc., or some sort of differential or similar equation ${}^xy$ must satisfy, or perhaps one could learn something from the $\Gamma$ function and factorials here which would bypass, at least temporarily, the need to address how ${}^{(p / q)}(r / s)$ is supposed to work, but sooner or later the question will have to be faced, I'll warrant.
This is an interesting, though speculative, arena and I am glad to have participated. But until I can answer my own questions to my better satisfaction, I will refrain from further remarks, except to bid those who are ready to climb such unknown heights, "Excelsior!"
Hope this helps, at least with the spirit of the adventure if not with the direction. Happy New Year,
and as always,
Fiat Lux!!!
(1) and (2) are both right, but it's just that the $v^{\phi}$ in your two formulas mean different things, and you've unknowingly abused notation by calling them both $v^{\phi}$. This issue boils down to the distinction between the tangent vectors $\frac{\partial}{\partial \phi}$ and $e_{\phi}$. The first vector has norm $r$, while the second vector has norm $1$; and it is precisely this factor of $r$ which is the "discrepancy" you observed among the components.
Note that in the formula
\begin{align}
D_vf(p) &= \sum_{i=1}^n \frac{\partial f}{\partial x^i}\bigg|_p \cdot v^i
\end{align}
we often say "$v^i$ is the component of the vector $v$", but strictly speaking, this is an incomplete sentence. Components with respect to which basis? For this formula to work, we have to write the vector $v$ as
\begin{align}
v &= \sum_{i=1}^n v^i \frac{\partial}{\partial x^i}\bigg|_p
\end{align}
In other words, the $v^i$ are the components of $v$ with respect to the basis $\left\{\frac{\partial}{\partial x^i}(p)\right\}_{i=1}^n$ of the tangent space $T_pM$. Once again, said differently, we have $v^i:= dx^i(p)[v]$ (the evaluation of a covector on a vector). In differential geometry, we often deal with such "coordinate-induced bases".
However, in vector calculus, people often work with the normalized version of these vectors:
\begin{align}
e_i := \dfrac{\frac{\partial}{\partial x^i}(p)}{\lVert \frac{\partial}{\partial x^i}(p)\rVert}
\end{align}
In the case of polar coordinates in the plane, we have the following vectors: $\frac{\partial}{\partial r}, \frac{\partial}{\partial \phi}$ and their normalized counterparts $e_r, e_{\phi}$. The relation between them is:
\begin{align}
\frac{\partial}{\partial r} &= e_r \quad \text{and} \quad \frac{\partial}{\partial \phi} = re_{\phi} \tag{$*$}
\end{align}
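As a quick sanity check of $(*)$, one can write both coordinate vectors in Cartesian components and compute their norms. This is a minimal sketch assuming the standard embedding $x = r\cos\phi$, $y = r\sin\phi$; the function names are mine.

```python
import math


def polar_coordinate_basis(r: float, phi: float):
    """Cartesian components of d/dr and d/dphi at the point (r, phi).

    Obtained by differentiating (x, y) = (r cos(phi), r sin(phi)).
    """
    d_r = (math.cos(phi), math.sin(phi))
    d_phi = (-r * math.sin(phi), r * math.cos(phi))
    return d_r, d_phi


def norm(v):
    """Euclidean norm of a 2D vector."""
    return math.hypot(v[0], v[1])


d_r, d_phi = polar_coordinate_basis(2.5, 0.7)
# Up to rounding, the first norm is 1 and the second is r (here 2.5).
print(norm(d_r), norm(d_phi))
```

Dividing `d_phi` by its norm gives $e_{\phi}$, which is where the factor of $r$ in $(*)$ comes from.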
So, now given a vector $v$, we can write it as
\begin{align}
v &= v^r \frac{\partial}{\partial r} + v^{\phi} \frac{\partial}{\partial \phi}
\end{align}
for some numbers $v^r, v^{\phi}\in \Bbb{R}$, OR, we can also write it as
\begin{align}
v &= \xi^r e_r + \xi^{\phi} e_{\phi}
\end{align}
for some OTHER numbers $\xi^r, \xi^{\phi}\in \Bbb{R}$. Now, based on $(*)$, we can deduce that
\begin{align}
\begin{cases}
\xi^r &= v^r \\
\xi^{\phi} &= r v^{\phi}
\end{cases}
\tag{$**$}
\end{align}
One last thing: when Wikipedia says $\nabla f = \left( \frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \phi}\right)$, it should really specify the basis being used. The explicit expression is:
\begin{align}
\nabla f &= \frac{\partial f}{\partial r} e_r + \frac{1}{r}\frac{\partial f}{\partial \phi} e_{\phi} \\
&= \frac{\partial f}{\partial r}\frac{\partial }{\partial r} + \frac{1}{r^2} \frac{\partial f}{\partial \phi}\frac{\partial }{\partial \phi} \tag{$\ddot{\frown}$}
\end{align}
Now, we are finally ready to resolve the issue. Starting from your equation $(1)$, we have
\begin{align}
D_vf &= \frac{\partial f}{\partial r}v^r + \frac{\partial f}{\partial \phi}v^{\phi}
\end{align}
Next, if we do this from $(2)$, then we have
\begin{align}
\langle \nabla f, v\rangle &= \left\langle\frac{\partial f}{\partial r} e_r + \frac{1}{r}\frac{\partial f}{\partial \phi} e_{\phi},\,\,\, \xi^r e_r + \xi^{\phi} e_{\phi} \right\rangle \\
&= \frac{\partial f}{\partial r} \xi^r + \frac{1}{r}\frac{\partial f}{\partial \phi} \xi^{\phi}
\end{align}
where I used the fact that $\{e_r,e_{\phi}\}$ is an orthonormal basis, so the inner product is just the sum of the products of the coefficients. Finally, if we plug in $(**)$ above, we find that
\begin{align}
\langle \nabla f, v\rangle &=
\frac{\partial f}{\partial r} \xi^r + \frac{1}{r}\frac{\partial f}{\partial \phi} \xi^{\phi}
=\frac{\partial f}{\partial r}v^r + \frac{\partial f}{\partial \phi}v^{\phi}
= D_vf
\end{align}
which is of course what we expect, since $\nabla f$ is DEFINED so as to make the equation $\langle \nabla f(p), v\rangle = D_vf(p) = df_p(v)$ work out.
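If you want to see the agreement numerically, here is a small sketch. The test function $f(r,\phi) = r^2\sin\phi$, the sample point and the sample components are arbitrary choices of mine, not anything from your question; the finite-difference check in Cartesian coordinates is included as an independent comparison.

```python
import math


def f_polar(r, phi):
    # Hypothetical test function f(r, phi) = r^2 sin(phi), for illustration only.
    return r**2 * math.sin(phi)


def df_dr(r, phi):
    return 2 * r * math.sin(phi)


def df_dphi(r, phi):
    return r**2 * math.cos(phi)


def dvf_coordinate_basis(r, phi, v_r, v_phi):
    # Equation (1): components taken relative to d/dr and d/dphi.
    return df_dr(r, phi) * v_r + df_dphi(r, phi) * v_phi


def dvf_orthonormal_basis(r, phi, xi_r, xi_phi):
    # Equation (2): <grad f, v> with components taken relative to e_r and e_phi.
    return df_dr(r, phi) * xi_r + (1 / r) * df_dphi(r, phi) * xi_phi


def dvf_finite_difference(r, phi, v_r, v_phi, h=1e-6):
    # Central difference along v, carried out entirely in Cartesian coordinates.
    x, y = r * math.cos(phi), r * math.sin(phi)
    vx = v_r * math.cos(phi) - v_phi * r * math.sin(phi)
    vy = v_r * math.sin(phi) + v_phi * r * math.cos(phi)

    def f_cart(px, py):
        return f_polar(math.hypot(px, py), math.atan2(py, px))

    return (f_cart(x + h * vx, y + h * vy) - f_cart(x - h * vx, y - h * vy)) / (2 * h)


r, phi = 2.0, 0.5
v_r, v_phi = 0.3, -1.2            # components relative to d/dr, d/dphi
xi_r, xi_phi = v_r, r * v_phi     # relation (**)

print(dvf_coordinate_basis(r, phi, v_r, v_phi))
print(dvf_orthonormal_basis(r, phi, xi_r, xi_phi))
print(dvf_finite_difference(r, phi, v_r, v_phi))
```

All three numbers should agree up to rounding and discretization error. Feeding $v^{\phi}$ instead of $\xi^{\phi}$ into the second function reproduces the mismatch you ran into.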
Summary:
Whenever you speak of "components of a vector", you MUST ALWAYS keep track of which basis you're referring to. Very often in differential geometry/Riemannian geometry, people work with the coordinate-induced basis vectors $\frac{\partial}{\partial x^i}$ (so when people write $v^i$ in this context, they mean the components relative to this basis), whereas in elementary vector calculus, people work with the normalized vectors $e_i$ (and because this is the only basis they use, when they write $v^i$, they mean the components relative to this basis).
In my experience, Wikipedia isn't too consistent regarding the usage, and I recall seeing a single article with both uses simultaneously... which is of course very confusing. My suggestion for the future is to always be cautious of this distinction (there are also several other questions on this site where the entire confusion boils down to the difference between a normalized vs unnormalized basis).
Note that if $y=f(x)$ and $x=g(y)$, then (provided $\frac{dx}{dy}\neq 0$) $$\begin{align} \frac{dy}{dx}\cdot\frac{dx}{dy}&=1\\ \frac{dy}{dx}&=\frac{1}{\frac{dx}{dy}} \\ \frac{df(x)}{dx}&=\frac{1}{\frac{dg(y)}{dy}} \\ f'(x)&=\frac{1}{g'(f(x))} \end{align}$$
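A quick numerical check of the last line, as a sketch only: I am picking $f = \exp$ and $g = \log$ as a convenient inverse pair, and approximating both derivatives with central differences.

```python
import math


def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


f, g = math.exp, math.log   # example inverse pair: g(f(x)) = x

x = 0.8
lhs = derivative(f, x)          # f'(x)
rhs = 1 / derivative(g, f(x))   # 1 / g'(f(x))
print(lhs, rhs)
```

The two printed values should agree to several digits, and both should be close to $e^{0.8}$.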