There are several misconceptions in the OP about both mathematicians' and physicists' use of the word "vector", and even about what scalars and tensors are. To keep this a concise overview I'll be linking to fuller explanations.
Firstly, anything you've heard about magnitude and direction was just an attempt to help schoolchildren avoid certain fallacies without having to explain the entire concept of a vector space to them. The aim is to make sure they understand that, for example, a particle's momentum points a certain way but its amount of energy doesn't.
In general, vectors are not tuples. Admittedly some sets of tuples satisfy the axioms of a vector space if you define arithmetic the usual way, but vectors are so much more general than that case, as examples discussed above show. What is true in general is that, if a vector space $V$ has a basis of the form $\left\{e_i|i\in I \right\}$, then each vector in $V$ is expressible as a linear combination of the $e_i$. Depending on the details, this "linear combination" might be a sum or an integral. Armed with this, the coefficients used can provide a tuple representation of vectors (although in some cases you need infinitely many numbers), but the vector is an independent object. The map is not the territory. In fact, making a terrain look different by creating a new map that's rotated relative to an old one is a special case of what you'll sometimes hear called a basis change. Since you're familiar with $\mathbb{R}^n$, I'll give a simple example. The vectors $\left(\begin{array}{c}1\\0\end{array}\right),\,\left(\begin{array}{c}0\\1\end{array}\right)$ comprise a basis of $\mathbb{R}^2$, but I can rotate a 2D map by an angle $\theta$ because $\left(\begin{array}{c}\cos\theta\\\sin\theta\end{array}\right),\,\left(\begin{array}{c}-\sin\theta\\\cos\theta\end{array}\right)$ comprise a basis too.
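To make the basis-change idea concrete, here is a minimal numerical sketch (using NumPy; the angle and the test vector are arbitrary choices for illustration) that recovers a vector's coordinates in the rotated basis and checks that recombining them reproduces the original vector:

```python
import numpy as np

theta = 0.3  # an arbitrary rotation angle, for illustration

# Rotated basis of R^2: the standard basis turned by theta.
b1 = np.array([np.cos(theta), np.sin(theta)])
b2 = np.array([-np.sin(theta), np.cos(theta)])
B = np.column_stack([b1, b2])  # change-of-basis matrix

v = np.array([2.0, 1.0])       # the same vector, in standard coordinates

# Coordinates of v relative to the rotated basis: solve B @ c = v.
c = np.linalg.solve(B, v)

# The vector itself is unchanged; only its tuple representation differs.
assert np.allclose(c[0] * b1 + c[1] * b2, v)
print("coordinates in rotated basis:", c)
```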
I should also point out in passing that, while in some contexts the word "basis" simply means a choice of $\left\{e_i|i\in I \right\}$ for which this can be done, the proper definition requires that each vector be a linear combination of only finitely many of the $e_i$. Many vector spaces of interest that are not finite-dimensional nonetheless satisfy additional technical conditions that make the less strict meaning of "basis" useful. However, the famous theorem that any two bases of a vector space have the same cardinality refers to the finite-combinations-only definition.
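A standard example of the distinction: in the space $\ell^2$ of square-summable sequences, the unit sequences $e_1=(1,0,0,\ldots),\,e_2=(0,1,0,\ldots),\,\ldots$ form a basis in the lenient sense, since every element is a convergent infinite sum $\sum_i c_i e_i$, but not in the strict sense, since finite linear combinations of them only produce sequences with finitely many nonzero entries.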
So that's what mathematicians mean by vector spaces. A vector space is always "over" a field of scalars. Just as a vector is defined as an element of a vector space which in turn has a long definition, a scalar is defined as an element of a field which in turn has a long definition.
Or is it? Let's talk about what physicists really mean when they discuss vectors. On the one hand, they know all the mathematics I mentioned above. On the other hand, they also want to describe nature in terms of quantities that transform in certain convenient ways when we switch coordinate systems, to exemplify "symmetries". This leads them to define "vector" in a stricter way. For example, one thing schoolchildren aren't told is that, although angular momentum has a magnitude and direction, it's not a vector in this strict sense because of the way it transforms under reflections. The distinction in $\mathbb{R}^3$ between vectors and axial vectors takes some explaining, and the confusion is understandable. Position and momentum are "in" $\mathbb{R}^3$ and are vectors; angular momentum is "in" $\mathbb{R}^3$ but is an axial vector. The reason is simply that none of these things are really "in" a famous set of tuples, because they're not tuples at all; they're quantities that admit a tuple representation. That's one similarity axial vectors have with "true" vectors.
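A quick numerical illustration of the axial-vector point (a minimal sketch; the particular values of $\mathbf{r}$ and $\mathbf{p}$ are arbitrary): under the parity transformation $\mathbf{x}\mapsto-\mathbf{x}$, position and momentum flip sign, but $\mathbf{L}=\mathbf{r}\times\mathbf{p}$ does not.

```python
import numpy as np

r = np.array([1.0, 2.0, 3.0])   # position (arbitrary example values)
p = np.array([-1.0, 0.5, 2.0])  # momentum

L = np.cross(r, p)              # angular momentum L = r x p

# Apply the parity transformation x -> -x to the "true" vectors.
L_reflected = np.cross(-r, -p)

# The two sign flips cancel: L is unchanged, revealing its axial nature.
assert np.allclose(L_reflected, L)
print("L before parity:", L)
print("L after parity: ", L_reflected)
```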
With the development of differential geometry, we realised there is a more elegant way to talk about all this. Instead of distinguishing between true vectors and axial vectors, we can distinguish between contravariant and covariant vectors, provided our "both types count" definition of vector means "rank one tensor". A quantity $T^{\alpha_1\cdots\alpha_p}_{\beta_1\cdots\beta_q}$ with $p,\,q$ non-negative integers is called a tensor of rank $p+q$ and order (or type) $\left(p,\,q\right)$ iff, under a coordinate transformation of spacetime from $x^\mu$ to $x^{'\nu}$, it obeys $$T^{'\alpha_1\cdots\alpha_p}_{\beta_1\cdots\beta_q}=\sum_{\gamma_1\cdots\gamma_p \delta_1\cdots\delta_q}\frac{\partial x^{'\alpha_1}}{\partial x^{\gamma_1}}\cdots\frac{\partial x^{'\alpha_p}}{\partial x^{\gamma_p}}\frac{\partial x^{\delta_1}}{\partial x^{'\beta_1}}\cdots\frac{\partial x^{\delta_q}}{\partial x^{'\beta_q}}T^{\gamma_1\cdots\gamma_p}_{\delta_1\cdots\delta_q}.$$ (We never actually write the summation sign; we take for granted that any index that appears twice, once as a subscript and once as a superscript, is summed over all possible values. In relativity, there is one such value for each spacetime dimension.) A tensor of rank $0$ is a scalar, and is unchanged under coordinate transformations. A tensor of positive rank is called covariant if $p=0$, contravariant if $q=0$ and mixed otherwise. Mixed tensors have $p\geq 1$ and $q\geq 1$, so have rank $\geq 2$.
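As a sanity check of the transformation law, here is a minimal sketch (NumPy; the coordinate change is an arbitrary invertible linear map $x' = Ax$, so the two Jacobians are just $A$ and $A^{-1}$) that transforms a type-$(1,1)$ tensor index by index and confirms the result matches the matrix formula $T' = A\,T\,A^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                # spacetime dimension, for illustration

A = rng.normal(size=(n, n))          # Jacobian dx'/dx (invertible with probability 1)
A_inv = np.linalg.inv(A)             # Jacobian dx/dx'
T = rng.normal(size=(n, n))          # a type-(1,1) tensor, components T^a_b

# Apply the transformation law: T'^a_b = (dx'^a/dx^c)(dx^d/dx'^b) T^c_d.
T_prime = np.einsum("ac,db,cd->ab", A, A_inv, T)

# For a (1,1) tensor this is exactly a similarity transform.
assert np.allclose(T_prime, A @ T @ A_inv)
```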
Something that looks like a tensor by virtue of its indices may not transform the right way to actually be a tensor. (Of course, if there are no indices at all, something would "look like a scalar", but might not be one.) Conversely, a familiar object like a matrix is really the component representation of a genuine tensor, namely an element of $V \otimes V^\ast$, as the following construction shows.
Fix a basis $\{e_1, \ldots, e_n\}$ of $V$, and consider the dual basis $\{f_1, \ldots, f_n \}$ of $V^\ast$. Then we have a basis
$$\{e_1\otimes f_1,\ldots, e_i \otimes f_j, \ldots, e_n \otimes f_n\}$$
for $V \otimes V^\ast$, and the matrix
$$A = (a_{ij})$$
is just a way of representing the element
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \; e_i \otimes f_j \in V \otimes V^\ast.$$
Of course an element of $V \otimes V^\ast$ gives a linear map $V \to V$ by
$$(w \otimes f)(v) := f(v) w$$
and extending by linearity. Given two such elements, we can compose the corresponding functions:
$$(w' \otimes f')(w \otimes f)(v) = (w' \otimes f')(f(v) w) = f(v) f'(w) w' = f'(w) \; (w' \otimes f)(v)$$
so composition of linear maps is given by
$$(w' \otimes f') \circ (w \otimes f) = f'(w) \; (w' \otimes f)$$
extended by linearity. If you write your elements in the $e_i \otimes f_j$ basis and apply this operation to them, you'll see that the usual definition of matrix multiplication pops right out.
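As a quick numerical check of that composition rule (a minimal NumPy sketch with arbitrary vectors; in the standard basis $(w \otimes f)(v) = f(v)\,w$ is just the rank-one matrix $w f^{\mathsf T}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
w, f = rng.normal(size=n), rng.normal(size=n)    # the element w ⊗ f
w2, f2 = rng.normal(size=n), rng.normal(size=n)  # the element w' ⊗ f'

# In the standard basis, (w ⊗ f)(v) = f(v) w is the rank-one matrix outer(w, f).
M1 = np.outer(w, f)
M2 = np.outer(w2, f2)

# The composition rule (w' ⊗ f') ∘ (w ⊗ f) = f'(w) (w' ⊗ f):
lhs = M2 @ M1                           # composition as a matrix product
rhs = np.dot(f2, w) * np.outer(w2, f)   # the scalar f'(w) times w' ⊗ f
assert np.allclose(lhs, rhs)
```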
Of course all the calculations with explicit tensors above can be rephrased in terms of the universal property of the tensor product if you like.
This is all assuming you want the matrix to represent an element of $V \otimes V^\ast$ rather than an element of $V \otimes V$ or $V^\ast \otimes V^\ast$. But you can work out what should happen in cases like that the same way.
$\newcommand{\Reals}{\mathbf{R}}\newcommand{\Basis}{\mathbf{e}}$"Geometric" here sounds like a breezy way of saying: tensors are usually expressed and manipulated in terms of a coordinate system, but (thanks to the way a tensor transforms under change of coordinates) a tensor has an existence independent of any coordinate system.
Mathematicians first used calculus to study physical problems in terms of coordinate systems, and only later learned how to assemble these local investigations into global, coordinate-independent (i.e., "geometric") objects and concepts.
For example, a vector field on the unit sphere $S^{2}$ in $\Reals^{3}$ might be described analytically (in a manner amenable to calculus) by covering the sphere with smoothly-overlapping coordinate systems, and writing the vector field as a linear combination of coordinate vector fields in each chart in such a way that the respective local definitions are compatible where the charts overlap.
By contrast, a geometric version of a vector field on the sphere would be to construct a space $TS^{2}$ from the union of the tangent planes of $S^{2}$ (the space $TS^{2}$ "naturally" sits in $\Reals^{3} \times \Reals^{3} \simeq \Reals^{6}$), and to view a vector field as a mapping $V:S^{2} \to TS^{2}$ that assigns a tangent vector at $p$ to each point $p$ of the sphere.
That is, instead of viewing a tensor (field) as a collection of functions associated to a coordinate system, we might ("geometrically") view a tensor as a mapping from one coordinate-independent space to another. The classical transformation rules for tensor components are built into the construction of $TS^{2}$ and its generalizations to "tensor bundles" over an arbitrary "smooth manifold".
There are substantial conceptual and technical benefits to the geometric picture. For instance, tools of topology might tell us "how many times" two surfaces in $TS^{2}$ intersect each other. Since a smooth vector field on $S^{2}$ defines a special type of surface in $TS^{2}$, knowledge about the geometry of $TS^{2}$ might translate into information about vector fields. (As a loose analogy, "any two non-parallel lines in the Euclidean plane intersect precisely once.") In this particular example, it turns out that a "generic" smooth vector field on the sphere has precisely two zeros, counting multiplicity.
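To make that final claim tangible, here is a minimal sketch (NumPy; the construction, projecting a constant ambient field onto the tangent planes, is a standard illustrative choice, not the only one) of a smooth vector field on $S^{2}$ whose only zeros are the two antipodal points $\pm\mathbf{v}/|\mathbf{v}|$:

```python
import numpy as np

v = np.array([0.0, 0.0, 1.0])  # a constant ambient field along the polar axis

def tangent_field(p):
    """Project v onto the tangent plane of S^2 at the unit vector p."""
    return v - np.dot(v, p) * p

# The field vanishes exactly where v is normal to the sphere: the two poles.
north, south = np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, -1.0])
assert np.allclose(tangent_field(north), 0)
assert np.allclose(tangent_field(south), 0)

# Elsewhere it is nonzero; e.g. on the equator it has unit length.
equator = np.array([1.0, 0.0, 0.0])
print(tangent_field(equator))  # [0. 0. 1.]
```

Each of these two zeros has index $+1$, so the total count is $2 = \chi(S^{2})$, consistent with the Poincaré-Hopf theorem.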