I don't think that "contravariant transformation" is established terminology in physics.
The problem with "covariant" is that in physics, this has a wide range of meanings, starting with "involving no unatural choices" up to the definition one sees in differential geometry motivated by general relativity, which is:
For a smooth real Riemannian manifold $M$, a tensor $T$ of rank $(n, m)$ is a multilinear function which takes $n$ 1-forms and $m$ tangent vectors as input. When you choose a coordinate chart and the dual bases $d x^{\mu}$ on the cotangent space and $\partial_{\mu}$ on the tangent space with respect to this chart, then the tensor has coordinate functions of the form
$$
T^{\alpha \beta \ldots}{}_{\gamma \delta \ldots} = T(d x^{\alpha}, d x^{\beta}, \ldots, \partial_{\gamma}, \partial_{\delta}, \ldots)
$$
With respect to these bases, a downstairs index is called covariant and an upstairs index is called contravariant. Now, a "covariant equation" or "covariant operation" is one that does not change its form under a coordinate change: if you change coordinates and apply the coordinate change to all covariant and contravariant indices of every tensor in your equation, then you have to get the same equation, only with indices taken with respect to the new coordinates.
A simple example would be:
$$
T^{\alpha}_{\alpha} = 0
$$
with the Einstein summation convention: when the same symbol is used for a covariant and a contravariant index, it is understood that one sums that index over a pair of dual bases.
Physicists would say that this equation is "covariant" because it has the same form in every coordinate chart, i.e. when I apply a diffeomorphism I get
$$
T^{\alpha'}_{\alpha'} = 0
$$
with respect to the new coordinates. Note that since we are talking about general relativity, the kind of transformations is implicitly fixed to be changes of charts on a smooth real manifold. As I said before, when physicists talk about different theories, they may implicitly mean other kinds of transformations. (Maybe you ran into some physicists who said "covariant transformation" when they meant "coordinate change", but I personally have not encountered this use of language.)
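To see this invariance concretely, here is a minimal numerical sketch in Python/NumPy (my own illustration, not part of the original argument): a rank-$(1,1)$ tensor is stored as a matrix, a chart change acts through an arbitrary invertible matrix `L` standing in for the Jacobian, and the contraction $T^{\alpha}{}_{\alpha}$, i.e. the matrix trace, comes out the same in both charts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Components T^a_b of a rank-(1,1) tensor in some chart, stored as a matrix.
T = rng.standard_normal((4, 4))

# A change of chart enters through an invertible matrix; here a random one
# stands in for the Jacobian of the coordinate change.
L = rng.standard_normal((4, 4))
L_inv = np.linalg.inv(L)

# Mixed transformation law: the contravariant index picks up L^{-1},
# the covariant index picks up L.
T_new = L_inv @ T @ L

# The contraction T^a_a (Einstein summation) is the matrix trace,
# and it agrees in both charts up to floating-point error.
assert np.isclose(np.trace(T), np.trace(T_new))
```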
You can perfectly well represent "contravariant vectors" as rows and "covariant vectors" as columns if you want.
It's just a convention. The dual space of the space of column vectors can be naturally identified with the space of row vectors, because matrix multiplication then corresponds to the "pairing" between a "covariant vector" and a "contravariant vector".
Remember that "covariant vectors" are defined as scalar-valued linear maps on the space of "contravariant vectors", so if $\omega$ is a covariant vector and $v$ is a contravariant vector, then $\omega(v)$ is a real number that depends linearly on both $v$ and $\omega$. If you make $v$ correspond to a column vector, and make $\omega$ correspond to a row vector then $$ \omega(v)=\omega v=(\omega_1,...,\omega_n)\left(\begin{matrix}v^1 \\ \vdots \\ v^n\end{matrix}\right)=\omega_1v^1+...+\omega_nv^n. $$
If $\omega$ were the column instead, then the above matrix multiplication would read $\omega(v)=v\omega$, which is less aesthetically pleasing, since we are used to writing the argument of a function to the right of the function, and here $v$ is the argument.
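As a quick sanity check of this convention, here is a small NumPy snippet (my own addition) that realizes the pairing above literally as "row times column":

```python
import numpy as np

v = np.array([[1.0], [2.0], [3.0]])   # contravariant vector as a column
omega = np.array([[4.0, 5.0, 6.0]])   # covariant vector as a row

# The pairing omega(v) is exactly the matrix product "row times column":
print(omega @ v)                      # [[32.]] = 4*1 + 5*2 + 6*3
```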
Best Answer
Okay, there are two facts to be taken into account here:
Vectors are elements of a vector space. (Let's say a real, $d$-dimensional vector space $V$ for concreteness.) If you use a basis $\lbrace e_i \rbrace \subseteq V$, you can express those vectors as linear combinations of the basis elements, i.e. for any $v \in V$: $$v = v^{i} e_{i} $$ where the $v^{i}$ are real numbers, called the components of $v$ with respect to $\lbrace e_{i} \rbrace$.
Covectors are elements of the dual space $V^{*}$ of $V$. You can likewise choose a basis $\lbrace \epsilon^{i} \rbrace \subseteq V^{*}$ to express these objects as linear combinations, i.e. for any $\omega \in V^{*}$: $$\omega = \omega_{i}\epsilon^{i}$$ where the $\omega_{i}$ are also real numbers, called the components of $\omega$ with respect to $\lbrace \epsilon^{i} \rbrace$.
The first choice:
In principle, the bases $\lbrace e_{i} \rbrace$ and $\lbrace \epsilon^{i} \rbrace$ are not related in any way. However, in order to simplify calculations, we often choose a very special basis for the dual space: the dual basis, the unique basis of the dual space such that: $$\epsilon^{i}(e_{j}) = \delta^{i}_{j}$$
(Note the two different uses of the word "dual": the dual space is the space $V^{*} := \operatorname{Hom}(V,\mathbb{R})$, while the dual basis is the uniquely defined basis of $V^{*}$ such that $\epsilon^{i}(e_{j}) = \delta^{i}_{j}$.)
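If it helps, the dual basis is easy to construct explicitly in coordinates. A small NumPy sketch (my own illustration, with an arbitrarily chosen basis): if the basis vectors $e_{j}$ are the columns of a matrix $E$, then the dual basis covectors $\epsilon^{i}$ are the rows of $E^{-1}$, and the defining relation $\epsilon^{i}(e_{j}) = \delta^{i}_{j}$ is just $E^{-1}E = I$.

```python
import numpy as np

# An arbitrary (invertible) basis {e_j} of R^3, stored as the columns of E.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

# The dual basis {eps^i}: row i of E^{-1} is the covector eps^i,
# acting on a vector x as x |-> (E^{-1} x)_i.
eps = np.linalg.inv(E)

# The defining relation eps^i(e_j) = delta^i_j is the identity matrix:
assert np.allclose(eps @ E, np.eye(3))
```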
Now suppose we want to make a change of basis (i.e. apply an invertible linear transformation) in $V$. Let's say our vector $v = v^{i}e_{i}$ can be written in terms of the new basis $\lbrace a_{j} \rbrace$ as: $$v = v^{i} e_{i} = w^{j}a_{j}$$ While the basis transforms with a certain matrix, $e_{i} = \Lambda^{j}_{i} a_{j}$, the components with respect to that basis transform with the inverse of that matrix: $v^{i} = w^{j} (\Lambda^{-1})^{i}_{j}$
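Here is that transformation law checked numerically (a sketch of my own; `Lam` and the new basis `A` are arbitrary invertible matrices chosen just for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)

# New basis {a_j}: the columns of A.  Old basis {e_i}: e_i = Lambda^j_i a_j,
# so the columns of E = A @ Lam.  Both are (almost surely) invertible.
A = rng.standard_normal((3, 3))
Lam = rng.standard_normal((3, 3))
E = A @ Lam

v = np.array([1.0, 2.0, 3.0])   # components v^i with respect to {e_i}
w = Lam @ v                     # w^j = Lambda^j_i v^i, i.e. v = Lam^{-1} w

# Both component sets describe the same abstract vector:
assert np.allclose(E @ v, A @ w)
```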
Now, when we pair up the elements of the basis $\lbrace e_{i}\rbrace$ and the elements of the basis $\lbrace \epsilon^{i}\rbrace$, we would like our dual-basis convention to remain true, so any change of basis $\Lambda$ on $V$ will induce a change of basis on $V^{*}$: $$\delta^{i}_{j} = \epsilon^{i}(e_{j}) = \epsilon^{i}(\Lambda^{k}_{j}a_{k}) = \Lambda^{k}_{j}\epsilon^{i}(a_{k}) = ...$$
The matrix that relates $\lbrace \epsilon^{i} \rbrace$ with the new basis (let's call it $\lbrace \alpha^{i} \rbrace$) on the dual space needs to be the inverse $\Lambda^{-1}$ of $\Lambda$ in order to satisfy the relation $\alpha^{i}(a_{j}) = \delta^{i}_{j}$. $$...=\Lambda^{k}_{j}\epsilon^{i}(a_{k}) = \Lambda^{k}_{j}(\Lambda^{-1})^{i}_{l}\alpha^{l}(a_{k}) = \Lambda^{k}_{j}(\Lambda^{-1})^{i}_{l} \delta^{l}_{k}$$ $$=\Lambda^{k}_{j}(\Lambda^{-1})^{i}_{k} = \delta^{i}_{j}$$
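And the induced transformation on the dual side, again as a NumPy check (same hypothetical setup as above): with the dual bases realized as rows of inverse matrices, the relation between $\lbrace \epsilon^{i} \rbrace$ and $\lbrace \alpha^{i} \rbrace$ is indeed given by $\Lambda^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))   # new basis a_j as columns
Lam = rng.standard_normal((3, 3))
E = A @ Lam                       # old basis e_i = Lambda^j_i a_j as columns

# Dual bases as rows of the inverse matrices (see the sketch above):
eps = np.linalg.inv(E)            # eps^i(e_j) = delta^i_j
alpha = np.linalg.inv(A)          # alpha^i(a_j) = delta^i_j

# The dual basis transforms with the inverse: eps^i = (Lam^{-1})^i_l alpha^l.
assert np.allclose(eps, np.linalg.inv(Lam) @ alpha)
```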
The second choice:
We use the word "covariant" to describe the way the basis of $V$ transforms.
From this, we start calling "contravariant" the way the basis of $V^{*}$ transforms, because it needs to use the inverse transformation in order to preserve the duality convention.
We call the components of the elements of $V$ "contravariant" because, as we saw before, they need to transform inversely to the basis of $V$ in order to keep the vector itself invariant.
Finally, we call the components of the elements of $V^{*}$ "covariant", because they need to transform inversely to the basis of $V^{*}$, and since that basis transforms contravariantly, they end up transforming with the same matrix as the basis of $V$.
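The payoff of these conventions is that the pairing $\omega(v) = \omega_{i}v^{i}$ is basis-independent, which a short NumPy check (my own sketch, with a random invertible $\Lambda$) makes explicit:

```python
import numpy as np

rng = np.random.default_rng(2)
Lam = rng.standard_normal((3, 3))   # an arbitrary invertible basis change on V

v = np.array([1.0, 2.0, 3.0])       # contravariant components v^i
omega = np.array([4.0, 5.0, 6.0])   # covariant components omega_i

# Contravariant components transform against the basis of V (with Lam here);
# covariant components transform with the opposite matrix (Lam^{-1} here),
# so the contraction omega_i v^i is unchanged:
v_new = Lam @ v
omega_new = omega @ np.linalg.inv(Lam)
assert np.isclose(omega @ v, omega_new @ v_new)
```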
So, in summary:
No. Only the names we give to the transformation behaviours would change; what matters is the relative behaviour.
No. The components of a vector will always transform with the inverse of the transformation applied to the basis, regardless of which specific basis that is.
The gradient of a function has covariant components because it is naturally a map $TM \rightarrow \mathbb{R}$: it takes a vector and gives you the directional derivative of the function in that direction. So it is an element of $T^{*}M$ (the dual space of $TM$), whose basis has a transformation behaviour (contravariant) opposite to that of the basis of $TM$ (covariant).
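To see the covariance of the gradient in coordinates, here is a final NumPy sketch (my own, with a hypothetical quadratic function and a linear chart change $x = Ay$): by the chain rule, the gradient components in the $y$-chart are $A^{T}$ times those in the $x$-chart, i.e. they pick up the Jacobian itself rather than its inverse, as vector components would.

```python
import numpy as np

def f(x):                            # a hypothetical scalar function on R^2
    return x[0]**2 + 3.0*x[0]*x[1] + x[1]**2

def grad_f(x):                       # its gradient components in x-coordinates
    return np.array([2*x[0] + 3*x[1], 3*x[0] + 2*x[1]])

A = np.array([[2.0, 1.0],            # Jacobian dx/dy of the linear chart
              [0.0, 1.0]])           # change x = A @ y

y = np.array([0.5, -1.5])
x = A @ y

# Chain rule: grad_y (f o A) = A^T grad_x f.  Compare against a central
# finite-difference derivative of f(A y) with respect to y:
h = 1e-6
grad_y_numeric = np.array([
    (f(A @ (y + h*e)) - f(A @ (y - h*e))) / (2*h)
    for e in np.eye(2)
])
assert np.allclose(grad_y_numeric, A.T @ grad_f(x), atol=1e-4)
```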