Many topics in linear algebra suffer from the issue in the
question. For example:
In linear algebra, one often sees the determinant of a
matrix defined by some ungodly formula, often even with
special diagrams and mnemonics given for how to compute it
in the 3x3 case, say.
$$\det(A) = \text{some horrible mess of a formula}$$
Even relatively sophisticated people will insist that
det(A) is the sum over permutations, etc. with a sign for
the parity, etc. Students trapped in this way of thinking
do not understand the determinant.
The right definition is that det(A) is the signed volume of the
image of the unit cube under the transformation determined by
A. From this alone, everything follows. One sees immediately
the significance of det(A)=0, why the elementary row
operations affect the determinant the way they do, and why
diagonal and triangular matrices have the determinants they
do.
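In two dimensions this is easy to check directly: the columns of the matrix are the images of the basis vectors, and the signed area of the parallelogram they span is exactly the determinant. A minimal sketch in plain Python (the helper names are mine, not standard):

```python
# A 2x2 matrix sends the unit square to the parallelogram spanned by
# its columns; the signed area of that parallelogram IS the determinant.

def det2(a, b, c, d):
    """Determinant of [[a, b], [c, d]] by the usual formula."""
    return a * d - c * b

def signed_area(u, v):
    """Signed area of the parallelogram spanned by 2D vectors u and v
    (the 2D cross product)."""
    return u[0] * v[1] - u[1] * v[0]

# The matrix [[2, 1], [0, 3]] maps e1 -> (2, 0) and e2 -> (1, 3).
A = [[2, 1], [0, 3]]
col1 = (A[0][0], A[1][0])   # image of e1
col2 = (A[0][1], A[1][1])   # image of e2

assert det2(A[0][0], A[0][1], A[1][0], A[1][1]) == signed_area(col1, col2)

# A reflection reverses orientation, so the signed area is negative:
# swapping e1 and e2 gives determinant -1.
assert signed_area((0, 1), (1, 0)) == -1
```

The sign records orientation: transformations that flip space (reflections) get negative determinant, which the pure "volume" picture refines.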
Even matrix multiplication, if defined by the usual
formula, seems arbitrary and even crazy, without some
background understanding of why the definition is that way.
The larger point here is that although the question asked about having a single wrong definition, really the problem is that a limiting perspective can infect one's entire approach to a subject. Theorems,
questions, exercises, and examples, as well as definitions, can all
stem from an incorrect view of a subject!
Too often, (undergraduate) linear algebra is taught as a
subject about static objects---matrices sitting there,
having complicated formulas associated with them and
complex procedures carried out with them, often for no
immediately discernible reason. From this perspective, many
matrix rules seem completely arbitrary.
The right way to teach and to understand linear algebra is as a fully dynamic
subject. The purpose is to understand transformations of
space. It is exciting! We want to stretch space, skew it,
reflect it, rotate it around. How can we represent these
transformations? If they are linear, then we are led to
consider the action on unit basis vectors, so we are led
naturally to matrices. Multiplying matrices should mean
composing the transformations, and from this one derives
the multiplication rules. All the usual topics in
elementary linear algebra are deeply connected with
essentially geometric ideas about the corresponding
transformations.
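Once multiplication is defined as composition, the usual formula can be derived rather than decreed. A small sketch in plain Python (the helper names are mine) checking that applying two transformations in sequence agrees with applying their product:

```python
# Matrix multiplication is defined so that (A times B) acts as
# "apply B first, then A" -- composition of transformations.

def apply(M, v):
    """Apply a 2x2 matrix M to a 2D vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def matmul(A, B):
    """The usual multiplication formula for 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

rotate90 = [[0, -1], [1, 0]]   # rotation by 90 degrees
stretch  = [[2, 0], [0, 1]]    # stretch the x-axis by a factor of 2

v = (1, 1)
# Composing the transformations step by step...
step_by_step = apply(rotate90, apply(stretch, v))
# ...agrees with applying the product matrix once.
at_once = apply(matmul(rotate90, stretch), v)

assert step_by_step == at_once  # both give (-1, 2)
```

The entry formula for `matmul` is not an axiom here; it is forced on us by the requirement that the product matrix represent the composite transformation.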
I would add one more observation to the other comments. Let me not worry about (2) vs (3) as the difference is only about the zero module so this is more of a philosophical question than a mathematical one.
I would just like to point out that there is a very useful characterization of the depth and dimension of a module, namely Grothendieck's vanishing (and non-vanishing) theorem: at any $x\in X$, the local cohomology of $M$ at $x$ vanishes for $i$ smaller than the depth or larger than the dimension of the module, and does not vanish for $i$ equal to either the depth or the dimension.
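Written out explicitly (for $M$ finitely generated over the local ring at $x$, with $\mathfrak m_x$ the maximal ideal and $H^i_{\mathfrak m_x}$ local cohomology), the statement reads:

$$
\begin{aligned}
H^i_{\mathfrak m_x}(M) &= 0 && \text{for } i < \operatorname{depth} M_x \text{ or } i > \dim M_x,\\
H^i_{\mathfrak m_x}(M) &\neq 0 && \text{for } i = \operatorname{depth} M_x \text{ and for } i = \dim M_x.
\end{aligned}
$$

In particular, $M_x$ is Cohen-Macaulay exactly when its local cohomology is concentrated in a single degree, and both bounds are intrinsic to the module, not to the ambient ring.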
In my eyes, this suggests that one should use the dimension of the module in the definition, i.e., use (2).
Another argument to support the use of (2) is that we like to say that CM is equivalent to "$S_n$ for all $n$". Now if you use definition (1), then only modules supported on all of $X$ even have a chance of being CM, and I don't see what one would gain from assuming that.
More specifically, a module could never satisfy $S_n$ for any $n$ that is larger than the dimension of the module but not larger than $\dim X$.
Kind of along the same lines, let $A\to B$ be a surjective morphism of rings (commutative with an identity) and $M$ a $B$-module. In other words, $\operatorname{Spec}B$ is a closed subscheme of $\operatorname{Spec}A$. Now both $\operatorname{depth}M$ and $\operatorname{supp}M$ are independent of whether one views $M$ as a $B$-module or an $A$-module. It is reasonable to expect that whether $M$ is $S_n$ should then also be independent of this choice.
The main difference between (1) and (2) is whether one wants to compare to the support of the module (i.e., view it over the ring modulo the annihilator) or to the whole ring. To me, the former seems more natural. This way a sheaf/module that is $S_n$ on a subscheme remains $S_n$ when viewed on an ambient scheme. Definition (1) seems to prefer comparing to the fixed ring. One way some people try to bridge the gap between the two definitions is to say "$M$ is $S_n$ over its support", meaning that one should mod out by the annihilator before applying (either of the) definition(s); then the two definitions are equivalent. As for (3), some people go the distance to say "a non-zero module is $S_n$ if..."
Best Answer
Perhaps the mother of all examples is "natural number". You can start an internet flame war by asking whether zero is a natural number.