I'll say a few words about how I think about covariant derivatives, which is really just expanding on janmarqz's comment (hopefully others will contribute their own viewpoints as well):
For me, the most important geometric idea behind a covariant derivative $\nabla$ is that given a curve $\gamma$ in a manifold $M$, $\nabla$ gives you an isomorphism between the tangent spaces $T_{\gamma(t_1)}M$ and $T_{\gamma(t_2)}M$ for any two points on the curve. Mathematically, this isomorphism
$$
P : T_{\gamma(t_1)}M \to T_{\gamma(t_2)}M
$$
is the unique isomorphism with the property that for any $v \in T_{\gamma(t_1)}M$, there exists a vector field (which I'll call $v(t)$) along $\gamma$ such that $v(t_1) = v, v(t_2) = P(v)$, and $\nabla_{\gamma'(t)} v(t) = 0$ for all $t \in [t_1, t_2]$.
This isomorphism is called "parallel transport"; I like to picture a surface embedded in $\mathbb{R}^3$, such as the 2-sphere, and think of parallel transport along a curve $\gamma$ as "dragging" vectors along that curve. (Important remark: the isomorphism obtained depends on the choice of curve $\gamma$ in general.)
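To make the "dragging" picture and the curve dependence concrete, here is a minimal numerical sketch. It assumes the unit 2-sphere with the round metric in coordinates $(\theta,\phi)$, whose nonzero Christoffel symbols are $\Gamma^\theta_{\phi\phi}=-\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi}=\cot\theta$; the function name `transport_along_latitude` is made up for this illustration. It integrates the coordinate form of the parallel transport condition, $\dot v^k+\Gamma^k_{ij}\dot\gamma^i v^j=0$, once around a circle of latitude.

```python
import math

def transport_along_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transport the vector (v^theta, v^phi) once around the
    latitude circle theta = theta0 on the unit sphere (hypothetical helper).

    The curve is gamma(t) = (theta0, t) for t in [0, 2*pi], so gamma' = (0, 1).
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        vt, vp = v
        # dv^theta/dt = -Gamma^theta_{phi phi} v^phi =  sin*cos * v^phi
        # dv^phi/dt   = -Gamma^phi_{phi theta} v^theta = -cot * v^theta
        return (s * c * vp, -(c / s) * vt)

    h = 2 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v

# Around the latitude theta0 = pi/3 the vector comes back rotated by
# 2*pi*cos(theta0) = pi, i.e. reversed.
vt, vp = transport_along_latitude(math.pi / 3, 1.0, 0.0)
# Around the equator (a geodesic) it comes back unchanged.
vt_eq, vp_eq = transport_along_latitude(math.pi / 2, 1.0, 0.0)
print(vt, vp, vt_eq, vp_eq)
```

Both loops start and end at the same point, yet they induce different maps on the tangent space there, which is exactly the dependence on $\gamma$ mentioned in the remark.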
Of course, once you have an isomorphism of vector spaces, you get an isomorphism of any of the associated tensor spaces as well. So if $T$ is a $(k,l)$-tensor on $T_{\gamma(t_1)}M$, then we get a $(k,l)$-tensor $PT$ on $T_{\gamma(t_2)}M$.
Now the point is that once you have this "parallel transport" isomorphism, the covariant derivative $\nabla_X \mathcal{T}$ is a literal derivative in the following precise sense: Given a vector $X \in T_pM$, let $\gamma$ be any curve with $\gamma'(0) = X$, and let $P_t$ be the "parallel transport along $\gamma$" isomorphism
$$
P_t : T_{\gamma(t)}M \to T_{\gamma(0)}M \quad (= T_pM).
$$
Then for any tensor field $\mathcal{T}$ on $M$,
$$
\nabla_X \mathcal{T} = \frac{d}{dt}\Big|_{t=0} \Big( P_t \big( \mathcal{T}(\gamma(t)) \big) \Big).
$$
This is a very precise interpretation of the idea that $\nabla_X \mathcal{T}$ gives you the derivative of $\mathcal{T}$ in the direction of $X$.
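As a sketch of how this recovers the familiar coordinate formula, take $\mathcal{T}=V$ to be a vector field and write $\Gamma^k_{ij}$ for the Christoffel symbols of $\nabla$ in local coordinates $x^i$ (notation not used above, introduced only for this check). A vector field $w(t)$ along $\gamma$ is parallel exactly when $\dot w^k+\Gamma^k_{ij}\dot\gamma^i w^j=0$. Transporting $V(\gamma(t))$ back to $p$ therefore gives, to first order in $t$,
$$
\big(P_t V(\gamma(t))\big)^k = V^k(\gamma(t)) + t\,\Gamma^k_{ij}(p)\,X^i V^j(p) + O(t^2),
$$
and differentiating at $t=0$ yields
$$
(\nabla_X V)^k = X^i\,\partial_i V^k + \Gamma^k_{ij}\,X^i V^j .
$$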
It all comes down to the coordinates being $x^i$. The outermost operator uses $\frac{\partial}{\partial x^i}=\partial_i$, so this needs to be contracted with an inner $\nabla^i$. (I'm putting aside the $|g|^{\pm\frac12}$ factors for the moment, but these result from the identity $\nabla_iV^i=|g|^{-\frac12}\partial_i(|g|^{\frac12}V^i)$.) Since $\phi$ has no spacetime indices, its covariant derivative is just the partial derivative, so $\nabla^i\phi=g^{ij}\nabla_j\phi=g^{ij}\partial_j\phi=g^{ij}\frac{\partial\phi}{\partial x^j}$.
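To see the recipe in action, here is a sketch in polar coordinates $(r,\theta)$ on the flat plane, an example chosen for illustration rather than taken from the question. There $g_{ij}=\mathrm{diag}(1,r^2)$, so $g^{ij}=\mathrm{diag}(1,r^{-2})$ and $|g|^{\frac12}=r$. Applying $\nabla_iV^i=|g|^{-\frac12}\partial_i(|g|^{\frac12}V^i)$ to $V^i=\nabla^i\phi=g^{ij}\partial_j\phi$ gives
$$
\nabla_i\nabla^i\phi=\frac1r\,\partial_r\big(r\,\partial_r\phi\big)+\frac1r\,\partial_\theta\Big(r\cdot\frac{1}{r^2}\,\partial_\theta\phi\Big)=\partial_r^2\phi+\frac1r\,\partial_r\phi+\frac{1}{r^2}\,\partial_\theta^2\phi,
$$
which is the usual Laplacian in polar coordinates.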
Best Answer
You have already correctly calculated the components $g_{11}$ and $g_{22}$ of the metric tensor $g_{ik}$. The off-diagonal component is $g_{12}=g_{21}=2xy$ (to save time I omit the bars over $x$ and $y$). Then $G=\det(g_{ik})=(x^2-y^2)^2$ and $A^1=A^2=x^2+y^2-2xy=(x-y)^2$, where $A^1=g^{11}A_1+g^{12}A_2$ and similarly for $A^2$. You could now calculate the Christoffel symbols (for learning purposes this is a good approach), but let me go directly to the goal. For the divergence of a vector we may use the formula for $A_{i;j}g^{ij}$, where the semicolon denotes covariant differentiation and repeated indices are summed:
$$
A_{i;j}g^{ij}=\frac{1}{\sqrt{G}}\frac{\partial}{\partial x^i}\left(\sqrt{G}A^i\right)=\frac{1}{\sqrt{G}}\frac{\partial}{\partial x}\left(\sqrt{G}A^1\right)+\frac{1}{\sqrt{G}}\frac{\partial}{\partial y}\left(\sqrt{G}A^2\right).
$$
With $\sqrt{G}=x^2-y^2$ and $\sqrt{G}A^1=\sqrt{G}A^2=(x^2-y^2)(x-y)^2=(x+y)(x-y)^3$, this becomes
$$
\frac{1}{x^2-y^2}\frac{\partial}{\partial x}\left((x+y)(x-y)^3\right)+\frac{1}{x^2-y^2}\frac{\partial}{\partial y}\left((x+y)(x-y)^3\right)=\frac{2(x-y)^3}{x^2-y^2}=\frac{2(x-y)^2}{x+y}.
$$
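A quick numerical check of the last step can be sketched as follows. It uses only the quantities quoted in this answer, $\sqrt{G}=x^2-y^2$ and $A^1=A^2=(x-y)^2$, evaluates $\frac{1}{\sqrt{G}}\partial_i(\sqrt{G}A^i)$ by central differences, and compares it with the closed form $2(x-y)^2/(x+y)$ that the two derivative terms combine to. The function names and the sample point $(3,1)$ are arbitrary choices for this illustration.

```python
def sqrt_G(x, y):
    # Square root of the metric determinant, taken in the region x^2 > y^2
    return x**2 - y**2

def A_up(x, y):
    # Contravariant components A^1 = A^2 = (x - y)^2
    a = (x - y) ** 2
    return (a, a)

def divergence(x, y, h=1e-5):
    """(1/sqrt(G)) * d_i (sqrt(G) * A^i), with central differences."""
    def flux(i, px, py):
        return sqrt_G(px, py) * A_up(px, py)[i]

    d_dx = (flux(0, x + h, y) - flux(0, x - h, y)) / (2 * h)
    d_dy = (flux(1, x, y + h) - flux(1, x, y - h)) / (2 * h)
    return (d_dx + d_dy) / sqrt_G(x, y)

def closed_form(x, y):
    # Result of carrying out the two derivatives by hand
    return 2 * (x - y) ** 2 / (x + y)

x0, y0 = 3.0, 1.0
numeric = divergence(x0, y0)
exact = closed_form(x0, y0)
print(numeric, exact)
```

A point with $x-y\neq 1$ is used on purpose, so that different powers of $(x-y)$ in the numerator give different values and a slip in the exponent would show up.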
Hopefully this will be helpful.