I think one of the big problems is that, aside from theoretical physics (string theory, general relativity), most mathematicians aren't terribly aware of what engineers and scientists use differential geometry for. This certainly makes it difficult to write or plan a course in that regard.
It was only recently, when I heard a talk by Alain Goriely, that I found out that biologists care about differential geometry too! During the talk there were quite a few theorems about curves in three-dimensional space that I had never heard of, and I do geometric PDEs and general relativity for a living. This at least provides an isolated data point illustrating the above: mathematicians typically don't know what is or is not important for applications in other fields.
Ideally such a course/textbook should be prepared by someone with great interdisciplinary familiarity.
In terms of differential geometry "as a natural extension of calculus", I think you may have better luck going to older textbooks, where instead of calling it differential geometry, the subject is just called "advanced calculus". Quite a few books were written back then with an eye toward the applied mathematician (but of course, I am incapable of giving recommendations).
Let me add that I am currently supervising a third-year undergraduate course on differential geometry at the University of Cambridge. It fits half of your bill: it does not assume more than basic calculus and linear algebra (partly because the Cambridge maths curriculum is, funnily enough, rather scant on analysis); the current set of lecture notes was written by Gabriel Paternain (if you are interested, you can try asking him for a copy). Unfortunately, the way the degree program works, the course won't attract many non-pure-mathematicians other than future theoretical physicists. So I can't really comment on how well it works for engineers and other scientists.
The course is divided into essentially four parts:
- Definition of manifolds as submanifolds in Euclidean space, diffeomorphisms and smooth maps, Sard's theorem and degree mod 2.
- Curves and surfaces in space. Frenet frame, curvature, torsion of curves; isoperimetric inequality. First and second fundamental form, mean and Gaussian curvature.
- Calculus of variations, geodesics, minimal surfaces.
- More about curvature, leading up to Gauss-Bonnet.
One more note: I just remembered that Gary Gibbons is teaching a course titled "Applications of Differential Geometry to Physics". It is not necessarily elementary, but it certainly has a lot of applications. Being taught from the point of view of a polymath, the examples given in the notes do cover more ground than is typical.
Let me work with $n$ dimensions: you want to study the vector field
$$
X=\sum_{1\le j\le n} a_j(x)\frac{\partial}{\partial x_j},
\tag {1}$$
and in particular find the so-called first integrals of $X$, i.e. the functions $f$ such that $Xf=0$. You introduce the system of ODEs:
$$
\dot x(t,y)=a(x(t,y)),\quad x(0,y)=y.
\tag {2}$$
The solutions $t\mapsto x(t,y)$ are the integral curves of $X$.
You realize easily that a function is a first integral iff it is constant along the integral curves of $X$: just compute
$$
\frac{d}{dt}\bigl(f(x(t,y))\bigr)=\sum_{1\le j\le n} \frac{\partial f}{\partial x_j}(x(t,y))a_j(x(t,y))=(Xf)(x(t,y))
$$
This means that solving the PDE $Xf=0$ attached to the vector field (1) is essentially equivalent to solving the ODE system (2).
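The equivalence can be checked numerically. The following is a minimal sketch, assuming an illustrative rotation field in the plane ($a_1=-x_2$, $a_2=x_1$) and a hand-written Runge-Kutta integrator; the names `a`, `rk4_step`, `integral_curve`, `f` and `g` are all hypothetical choices made for this example. It compares a genuine first integral with a function that is not one.

```python
def a(x):
    """Coefficients of the illustrative field X = -x2 d/dx1 + x1 d/dx2."""
    return (-x[1], x[0])

def rk4_step(x, h):
    """One classical Runge-Kutta step for the ODE system dx/dt = a(x)."""
    k1 = a(x)
    k2 = a(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)))
    k3 = a(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)))
    k4 = a(tuple(xi + h * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + h / 6.0 * (p + 2 * q + 2 * r + s)
        for xi, p, q, r, s in zip(x, k1, k2, k3, k4)
    )

def integral_curve(y, t_end, steps):
    """Approximate points of the integral curve t -> x(t, y) through y."""
    h = t_end / steps
    points = [y]
    for _ in range(steps):
        points.append(rk4_step(points[-1], h))
    return points

def f(x):
    """First integral: f = x1^2 + x2^2, since Xf = -2 x1 x2 + 2 x1 x2 = 0."""
    return x[0] ** 2 + x[1] ** 2

def g(x):
    """Not a first integral: for g = x1 one gets Xg = -x2."""
    return x[0]

curve = integral_curve((1.0, 0.5), t_end=2.0, steps=2000)
# Largest deviation from the initial value along the computed curve.
drift_f = max(abs(f(p) - f(curve[0])) for p in curve)
drift_g = max(abs(g(p) - g(curve[0])) for p in curve)
print(drift_f, drift_g)
```

Up to discretization and rounding error, `drift_f` stays at zero while `drift_g` does not, which is the statement that $f$ is constant along integral curves exactly when $Xf=0$.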
Now the notational business. It is tempting to write (2), which is $
\frac{dx_j}{dt}=a_j(x), 1\le j\le n,
$
symbolically as
$$
\frac{dx_1}
{a_1(x)}=\dots=\frac{dx_n}
{a_n(x)}
$$
since they are all equal to $dt$! Well, just take this as a symbolic notation which eliminates the parameter $t$.
Now the Cauchy problem for this autonomous vector field $X$: find a hypersurface $\Sigma$ to which $X$ is transverse, i.e. $X$ is nowhere tangent to $\Sigma$. Then the Cauchy problem
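As a small illustration of this symbolic notation (the rotation field in the plane is chosen here purely as an example), take $n=2$, $a_1(x)=-x_2$, $a_2(x)=x_1$. Then
$$
\frac{dx_1}{-x_2}=\frac{dx_2}{x_1}
\quad\Longrightarrow\quad x_1\,dx_1+x_2\,dx_2=0
\quad\Longrightarrow\quad d\bigl(x_1^2+x_2^2\bigr)=0,
$$
so $f(x)=x_1^2+x_2^2$ is constant along the integral curves, and indeed $Xf=-x_2\cdot 2x_1+x_1\cdot 2x_2=0$.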
$$
\begin{cases}
Xu=f,\quad \\
u_{\vert \Sigma}=g
\end{cases}
$$
has locally a unique solution: this problem is equivalent to the scalar ODE
$$
\frac{d}{dt}\bigl( u(x(t,y))\bigr)=f(x(t,y)),\quad u(x(0,y))=u(y)=g(y) \text{ for $y\in \Sigma$},
$$
so that
$$
u(x(t,y))= u(y)+\int_0^tf(x(s,y)) ds\quad \text{ for $y\in \Sigma$}.
\tag{3}$$
Note that $y$ moves on $\Sigma$ ($n-1$ degrees of freedom) and $t$ in $\mathbb R$, so that $(t,y)$ with $y\in \Sigma$ and $t\in \mathbb R$ is a nice choice of coordinates.
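Formula (3) can be turned into a small computation. The sketch below assumes a concrete example that is not part of the discussion above: $X=\partial/\partial x_1+x_1\,\partial/\partial x_2$ in the plane, $\Sigma=\{x_1=0\}$ (to which $X$ is transverse), data $g(y)=\sin y$ and right-hand side $f(x)=x_2$. The integral curves are known in closed form here, $x(t,y)=(t,\,y+t^2/2)$, and all function names are hypothetical.

```python
import math

def g(y):
    """Cauchy data on Sigma = {x1 = 0} (illustrative choice)."""
    return math.sin(y)

def f(x1, x2):
    """Right-hand side of Xu = f (illustrative choice)."""
    return x2

def flow(t, y):
    """Integral curve of X = d/dx1 + x1 d/dx2 starting at (0, y) on Sigma."""
    return (t, y + 0.5 * t * t)

def u_along_characteristic(t, y, steps=1000):
    """Formula (3): g(y) + int_0^t f(x(s, y)) ds, by composite Simpson (steps even)."""
    h = t / steps
    total = f(*flow(0.0, y)) + f(*flow(t, y))
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(*flow(k * h, y))
    return g(y) + total * h / 3.0

def u(x1, x2):
    """Solution at (x1, x2): invert the flow to get the coordinates (t, y), then use (3)."""
    t = x1
    y = x2 - 0.5 * x1 * x1
    return u_along_characteristic(t, y)

def u_exact(x1, x2):
    """Closed form obtained by doing the integral in (3) by hand for this example."""
    w = x2 - 0.5 * x1 * x1
    return math.sin(w) + w * x1 + x1 ** 3 / 6.0

def X_applied_to_u(x1, x2, eps=1e-4):
    """Central finite-difference approximation of (Xu)(x1, x2)."""
    du1 = (u(x1 + eps, x2) - u(x1 - eps, x2)) / (2 * eps)
    du2 = (u(x1, x2 + eps) - u(x1, x2 - eps)) / (2 * eps)
    return du1 + x1 * du2

print(u(0.7, -0.3), u_exact(0.7, -0.3))
print(X_applied_to_u(0.7, -0.3), f(0.7, -0.3))
```

The function `u` makes the change of coordinates explicit: a point $x$ is first written as $x(t,y)$ with $y\in\Sigma$, and only then is (3) evaluated. When the flow is not available in closed form, `flow` would have to be replaced by a numerical ODE solver.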
There are variants of this when the vector field is not autonomous, i.e. is of type
$$\frac{\partial}{\partial t}+
\sum_{1\le j\le n} a_j(t,x)\frac{\partial}{\partial x_j}.
$$
More comments on the quasi-linear case and the general method of characteristics:
the quasi-linear Cauchy problem
$$
\frac{\partial u}{\partial t}+\sum_{1\le j\le n} a_j(t,x, u)\frac{\partial u}{\partial x_j}=b(t,x,u),\quad u(0,x)=u_0(x)
\tag{4}$$
has a linear companion
$$
\frac{\partial F}{\partial t}+\sum_{1\le j\le n} a_j(t,x, v)\frac{\partial F}{\partial x_j}+b(t,x,v)\frac{\partial F}{\partial v}=0,\quad F(0,x,v)=v-u_0(x)
\tag{5}$$
where $t,x,v$ are independent variables. It is not difficult to solve using the linear method of characteristics outlined above. Then since $\partial F/\partial v=1$ at $t=0$, the equation
$
F(t,x,v)=0
$
determines implicitly $v=u(t,x)$ near $t=0$ (by the implicit function theorem), and the expressions of the derivatives of $u$ in terms of the derivatives of $F$, e.g.
$
\partial u/\partial x=-\frac{\partial F/\partial x}{\partial F/\partial v},
$
imply that $u$ solves the Cauchy problem (4). Here also the notational industry is working full throttle. People would write
$$
\dot x=a(t,x,u),\quad \dot u=b(t,x,u),\quad
\text{which is }
\frac{dx_j}{a_j}=\frac{du}{b},\quad 1\le j\le n.
$$
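A worked instance may help. This is only a sketch, with data assumed here for illustration: $n=1$, $a=u$, $b=0$, $u_0(x)=x$, i.e. the inviscid Burgers equation with a linear initial datum. Problem (4) and its linear companion (5) read
$$
\frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}=0,\quad u(0,x)=x,
\qquad\qquad
\frac{\partial F}{\partial t}+v\frac{\partial F}{\partial x}=0,\quad F(0,x,v)=v-x.
$$
The integral curves of the linear field $\partial_t+v\,\partial_x$ in the variables $(t,x,v)$ are $x=x_0+vt$ with $v$ constant, and $F$ is constant along them, so
$$
F(t,x,v)=F(0,x-vt,v)=v-(x-vt)=(1+t)v-x.
$$
For $t>-1$ we have $\partial F/\partial v=1+t\neq 0$, and $F(t,x,v)=0$ gives
$$
u(t,x)=\frac{x}{1+t},
\qquad
\frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}=-\frac{x}{(1+t)^2}+\frac{x}{1+t}\cdot\frac{1}{1+t}=0 .
$$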
Although small discrete systems are easy to work with, continuum models are easier to deal with than large discrete systems. Whether or not nature is fundamentally discrete, the most useful models are often continuous because the discreteness only shows up at very small scales. Discreteness is useful to include in the model if it occurs in the situation we are interested in. I think this is to a large extent a question of scales of interest.
For example, if I have a mole of gas in a container, I could well model it as individual particles. But if I want a simpler model to work with and I am only interested in the behaviour at scales well above the atomic one, the usual "continuous" fluid mechanics is a good choice. This is because at such scales the gas is essentially scale invariant (it obeys similar laws if you zoom in) and thus calculus becomes applicable (and very powerful). This is of course not true if I go all the way to the atomic scale, but I am not interested in that scale, so it does not matter if my model treats gas in the same way at those scales as well. Large-scale continuous quantities like pressure and density give a good understanding (including the ability to make good predictions quickly) and that should not be neglected. (Of course, if I want something coarser, I can go to a thermodynamic description. Either way, modelling includes a step where the number of particles is taken to infinity to simplify the mathematics.)
The "scales of interest" phenomenon happens in both directions; we may neglect both too small and too large scales. For example, it might be a good idea to model a long rod by an infinitely long one (thus in a sense removing the finite size from the model). Then one can apply Fourier analysis or any other tool that assumes the rod is infinitely long, and the mathematics becomes easier. This is maybe more common with respect to time than length: Fourier or Laplace transforms with respect to time are used for systems that have a finite lifetime. If we are not interested in very large scales, we can assume our system to be infinitely large.
Discrete models are probably useful if nature has genuinely discrete structure (regarding the physical system in question) and we are interested in phenomena at the scale where discreteness is visible. But seen on a larger scale, a discrete model would contain something (particles or some other discrete structure) that we cannot measure and might not even be interested in. Something that cannot be measured and does not have a significant impact on the behaviour of the system should be left out of the model. This is related to the observation that continuum models often work well for large discrete systems.
Let me conclude with an observation that is easy to miss because we are so used to it: At human scales nature seems continuous.