(1) and (2) are both right; the catch is that the $v^{\phi}$ in your two formulas mean different things, and you've unknowingly abused notation by calling them both $v^{\phi}$. The issue boils down to the distinction between the tangent vectors $\frac{\partial}{\partial \phi}$ and $e_{\phi}$. The first vector has norm $r$, while the second has norm $1$, and it is precisely this factor of $r$ that accounts for the "discrepancy" you observed between the components.
Note that in the formula
\begin{align}
D_vf(p) &= \sum_{i=1}^n \frac{\partial f}{\partial x^i}\bigg|_p \cdot v^i
\end{align}
we often say "$v^i$ is the component of the vector $v$", but strictly speaking, this is an incomplete sentence. Components with respect to which basis? For this formula to work, we have to write the vector $v$ as
\begin{align}
v &= \sum_{i=1}^n v^i \frac{\partial}{\partial x^i}\bigg|_p
\end{align}
In other words, the $v^i$ are the components of $v$ with respect to the basis $\left\{\frac{\partial}{\partial x^i}(p)\right\}_{i=1}^n$ of the tangent space $T_pM$. Said differently once more, we have $v^i:= dx^i(p)[v]$ (the evaluation of a covector on a vector). In differential geometry, we often deal with such "coordinate-induced" bases.
However, in vector calculus, people often work with the normalized version of these vectors:
\begin{align}
e_i := \dfrac{\frac{\partial}{\partial x^i}(p)}{\lVert \frac{\partial}{\partial x^i}(p)\rVert}
\end{align}
In the case of polar coordinates in the plane, we have the following vectors: $\frac{\partial}{\partial r}, \frac{\partial}{\partial \phi}$ and their normalized counterparts $e_r, e_{\phi}$. The relation between them is:
\begin{align}
\frac{\partial}{\partial r} &= e_r \quad \text{and} \quad \frac{\partial}{\partial \phi} = re_{\phi} \tag{$*$}
\end{align}
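As a quick check of $(*)$, here is a sketch assuming the standard relations $x = r\cos\phi$, $y = r\sin\phi$ and the Euclidean inner product on the plane (neither is spelled out above). The chain rule gives
\begin{align}
\frac{\partial}{\partial r} &= \cos\phi\,\frac{\partial}{\partial x} + \sin\phi\,\frac{\partial}{\partial y}, & \left\lVert \frac{\partial}{\partial r}\right\rVert &= \sqrt{\cos^2\phi + \sin^2\phi} = 1 \\
\frac{\partial}{\partial \phi} &= -r\sin\phi\,\frac{\partial}{\partial x} + r\cos\phi\,\frac{\partial}{\partial y}, & \left\lVert \frac{\partial}{\partial \phi}\right\rVert &= \sqrt{r^2\sin^2\phi + r^2\cos^2\phi} = r
\end{align}
so dividing each vector by its norm yields $e_r = \frac{\partial}{\partial r}$ and $e_{\phi} = \frac{1}{r}\frac{\partial}{\partial \phi}$, which is exactly $(*)$.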
So, now given a vector $v$, we can write it as
\begin{align}
v &= v^r \frac{\partial}{\partial r} + v^{\phi} \frac{\partial}{\partial \phi}
\end{align}
for some numbers $v^r, v^{\phi}\in \Bbb{R}$, OR, we can also write it as
\begin{align}
v &= \xi^r e_r + \xi^{\phi} e_{\phi}
\end{align}
for some OTHER numbers $\xi^r, \xi^{\phi}\in \Bbb{R}$. Now, based on $(*)$, we can deduce that
\begin{align}
\begin{cases}
\xi^r = v^r \\
\xi^{\phi} = r v^{\phi}
\end{cases} \tag{$**$}
\end{align}
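To see where $(**)$ comes from, substitute $(*)$ into the first expansion of $v$ and compare coefficients with the second:
\begin{align}
v &= v^r \frac{\partial}{\partial r} + v^{\phi} \frac{\partial}{\partial \phi} = v^r e_r + \left(r v^{\phi}\right) e_{\phi}
\end{align}
For instance (an illustrative choice of point and vector, not one taken from your question), take $v = \frac{\partial}{\partial \phi}$ at a point with $r = 2$. Then $(v^r, v^{\phi}) = (0, 1)$, while $(\xi^r, \xi^{\phi}) = (0, 2)$: the same vector, but different components.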
One last thing: when Wikipedia says $\nabla f = \left( \frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \phi}\right)$, it should really specify the basis being used. The explicit expression is:
\begin{align}
\nabla f &= \frac{\partial f}{\partial r} e_r + \frac{1}{r}\frac{\partial f}{\partial \phi} e_{\phi} \\
&= \frac{\partial f}{\partial r}\frac{\partial }{\partial r} + \frac{1}{r^2} \frac{\partial f}{\partial \phi}\frac{\partial }{\partial \phi} \tag{$\ddot{\frown}$}
\end{align}
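The second line follows from the first by substituting $e_r = \frac{\partial}{\partial r}$ and $e_{\phi} = \frac{1}{r}\frac{\partial}{\partial \phi}$ from $(*)$. Spelled out for the $\phi$-term:
\begin{align}
\frac{1}{r}\frac{\partial f}{\partial \phi}\, e_{\phi} &= \frac{1}{r}\frac{\partial f}{\partial \phi}\cdot \frac{1}{r}\frac{\partial}{\partial \phi} = \frac{1}{r^2}\frac{\partial f}{\partial \phi}\frac{\partial}{\partial \phi}
\end{align}
So the $\phi$-component of $\nabla f$ is $\frac{1}{r}\frac{\partial f}{\partial \phi}$ relative to $e_{\phi}$, but $\frac{1}{r^2}\frac{\partial f}{\partial \phi}$ relative to $\frac{\partial}{\partial \phi}$.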
Now, we are finally ready to resolve the issue. Starting from your equation $(1)$, we have
\begin{align}
D_vf &= \frac{\partial f}{\partial r}v^r + \frac{\partial f}{\partial \phi}v^{\phi}
\end{align}
Next, if we do this from $(2)$, then we have
\begin{align}
\langle \nabla f, v\rangle &= \left\langle\frac{\partial f}{\partial r} e_r + \frac{1}{r}\frac{\partial f}{\partial \phi} e_{\phi},\,\,\, \xi^r e_r + \xi^{\phi} e_{\phi} \right\rangle \\
&= \frac{\partial f}{\partial r} \xi^r + \frac{1}{r}\frac{\partial f}{\partial \phi} \xi^{\phi}
\end{align}
where I used the fact that $\{e_r,e_{\phi}\}$ is an orthonormal basis, so the inner product is just the sum of the products of the coefficients. Finally, if we plug in $(**)$ above, we find that
\begin{align}
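\langle \nabla f, v\rangle &= \frac{\partial f}{\partial r} \xi^r + \frac{1}{r}\frac{\partial f}{\partial \phi} \xi^{\phi} \\
&= \frac{\partial f}{\partial r}v^r + \frac{1}{r}\frac{\partial f}{\partial \phi}\left(r v^{\phi}\right) \\
&= \frac{\partial f}{\partial r}v^r + \frac{\partial f}{\partial \phi}v^{\phi} \\
&= D_vf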
\end{align}
which is of course what we expect, since $\nabla f$ is DEFINED so as to make the equation $\langle \nabla f(p), v\rangle = D_vf(p) = df_p(v)$ work out.
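As a sanity check on a concrete function (my own illustrative choice, not one from your question), take $f(r,\phi) = r\sin\phi$ and $v = \frac{\partial}{\partial \phi}$, so that $(v^r, v^{\phi}) = (0,1)$ and, by $(**)$, $(\xi^r, \xi^{\phi}) = (0, r)$. Then
\begin{align}
D_vf &= \frac{\partial f}{\partial r}\cdot 0 + \frac{\partial f}{\partial \phi}\cdot 1 = r\cos\phi \\
\langle \nabla f, v\rangle &= \frac{\partial f}{\partial r}\cdot 0 + \frac{1}{r}\frac{\partial f}{\partial \phi}\cdot r = r\cos\phi
\end{align}
Both computations agree. Mixing the conventions, i.e. pairing the orthonormal components $\left(\frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \phi}\right)$ of $\nabla f$ with the coordinate components $(v^r, v^{\phi}) = (0,1)$, would instead give $\cos\phi$, which is off by exactly the factor $r$.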
Summary:
Whenever you speak of "components of a vector", you MUST ALWAYS keep track of which basis you're referring to. Very often in differential geometry/Riemannian geometry, people work with the coordinate-induced basis vectors $\frac{\partial}{\partial x^i}$ (so when people write $v^i$ in this context, they mean the components relative to this basis), whereas in elementary vector calculus, people work with the normalized vectors $e_i$ (and because this is the only basis they use, when they write $v^i$, they mean the components relative to this basis).
In my experience, Wikipedia isn't too consistent about which convention it uses, and I recall seeing a single article with both uses simultaneously... which is of course very confusing. My suggestion for the future is to always be cautious of this distinction (there are also several other questions on this site where the entire confusion boils down to the difference between a normalized and an unnormalized basis).
Best Answer
As long as your function $f$ is a real-valued function of a vector variable, you can apply your favorite remainder form of Taylor's theorem from calculus 101 to the auxiliary function $$\phi(t):=f\bigl({\bf x}+t{\bf p}\bigr)\ .$$ E.g., if all the necessary partial derivatives of $f$ are continuous, you have $$f\bigl({\bf x}+{\bf p}\bigr)=\phi(1)=\sum_{j=0}^r {\phi^{(j)}(0)\over j!}+{\phi^{(r+1)}(\tau)\over(r+1)!}$$ for some $\tau\in\>]0,1[\>$. Now express the derivatives of $\phi$ in terms of the partial derivatives of $f$ by repeatedly applying the chain rule. In the case $r=0$ you obtain $$f\bigl({\bf x}+{\bf p}\bigr)=f({\bf x})+\nabla f({\bf x}+\tau{\bf p})\cdot{\bf p}\ ,$$ and when $r=1$ you have $$f\bigl({\bf x}+{\bf p}\bigr)=f({\bf x})+\nabla f({\bf x})\cdot{\bf p}+{1\over2}\sum_{i,k=1}^n f_{.ik}({\bf x}+\tau{\bf p})\> p_ip_k\ .$$ Here the second partials $f_{.ik}({\bf x}+\tau{\bf p}):={\partial^2 f\over\partial x_i\,\partial x_k}({\bf x}+\tau{\bf p})$ arise from the chain rule when you compute $\phi''(\tau)$.
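For reference, here is a sketch of the chain-rule computation behind these two cases, assuming $f$ is twice continuously differentiable and writing $f_{.i}$ for the first partial ${\partial f\over\partial x_i}$ (a notation introduced here by analogy with $f_{.ik}$): $$\phi'(t)=\sum_{i=1}^n f_{.i}({\bf x}+t{\bf p})\,p_i=\nabla f({\bf x}+t{\bf p})\cdot{\bf p}\ ,\qquad \phi''(t)=\sum_{i,k=1}^n f_{.ik}({\bf x}+t{\bf p})\,p_ip_k\ .$$ Evaluating $\phi'$ at $t=\tau$ gives the remainder term for $r=0$, while evaluating $\phi'$ at $t=0$ and $\phi''$ at $t=\tau$ gives the linear term and the remainder term for $r=1$.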