I'm teaching such a course at the moment, and whilst my university has high aspirations, I think that the "Not Quite Oxford" appellation applies here as well.
(For those who like the answer first, I pretty much agree with Alex and Robin.)
Here's the approach I'm taking:
Matrices are great, but dull. They're fantastic for actually getting an answer, but doing all the manipulations yourself is fantastically boring and prone to error. Fortunately, computers are really good at doing such manipulations and (as yet) don't get bored. In summary, being able to use matrices is a really Good Thing but actually using them yourself isn't.
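To make "let the computer do it" concrete, here is a minimal sketch using NumPy (my choice of tool here, not something the argument depends on): you write down the matrix, and the machine does the row operations.

```python
import numpy as np

# Solve A x = b without doing any row reduction by hand.
A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

x = np.linalg.solve(A, b)   # LU (Gaussian-elimination style) factorisation under the hood
print(x)                    # [ 2.  3. -1.]
```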
A matrix represents (note the word) a linear transformation from one Euclidean space to another. There's an obvious and easy correspondence between the two. So we can use everything we know about matrices to study linear transformations between Euclidean spaces.
But not everything we want to study is a linear transformation from one Euclidean space to another. How about differentiation of polynomials (of some fixed maximum degree)? It feels like we could do this with a matrix, but can we?
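To make the example concrete, and jumping slightly ahead (it presupposes identifying a cubic with its vector of coefficients, which is exactly the move the next paragraphs formalise), differentiation on polynomials of degree at most $3$ can indeed be written as a $4 \times 4$ matrix:

$$\frac{d}{dx}\left(a_0 + a_1 x + a_2 x^2 + a_3 x^3\right) \quad\longleftrightarrow\quad \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 \\ 2a_2 \\ 3a_3 \\ 0 \end{pmatrix}.$$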
By considering this example, we decide that if we have an isomorphism from our arbitrary space to a Euclidean space, then we can use matrix methods. So we define "finite dimension" (as a whole concept) to mean "there is an isomorphism to some Euclidean space".
Using Gaussian Elimination (à la Robin), we can show that if $V \cong \mathbb{R}^n$ for some $n$ then $n$ is unique. (No hand-waving required.) This allows us to define the actual dimension.
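For what it's worth, here is one way the Gaussian-elimination argument can go (a sketch only, not necessarily the argument Robin has in mind): an isomorphism $\mathbb{R}^m \to \mathbb{R}^n$ is given by some $n \times m$ matrix $A$, and row reduction shows that

$$\operatorname{rank} A \le \min(m,n), \qquad A \text{ injective} \iff \operatorname{rank} A = m, \qquad A \text{ surjective} \iff \operatorname{rank} A = n,$$

so an isomorphism forces $m = \operatorname{rank} A = n$.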
So "finite dimension" means "can make it look like $\mathbb{R}^n$" which means "can use matrix methods". But this approach brings to the surface the isomorphism $V \cong \mathbb{R}^n$ and so it's really easy to focus our attention on the question: "How do I choose between the different such isomorphisms?" (for example, $\operatorname{Poly}_3 \cong \mathbb{R}^4$ by coefficients, or by evaluation at $0,1,2,3$ or some other means).
Finally, for those die-hard basists, I mention that an isomorphism $V \cong \mathbb{R}^n$ is the same thing as an ordered basis of $V$. This comes easily from the fact that giving an ordered list $v_1,\dots,v_k$ of elements of $V$ is the same as giving a linear transformation $\mathbb{R}^k \to V$ (send the $i$th standard basis vector to $v_i$), and then we observe the following (a concrete rank check is sketched after the list):
- $\mathbb{R}^k \to V$ injective if and only if $\{v_1,...,v_k\}$ linearly independent
- $\mathbb{R}^k \to V$ surjective if and only if $\{v_1,...,v_k\}$ spanning
- $\mathbb{R}^k \to V$ isomorphism if and only if $\{v_1,...,v_k\}$ basis
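For vectors that already sit inside some $\mathbb{R}^n$, this dictionary turns into a rank computation. Here is a small sketch (the vectors are toy data, invented purely for illustration):

```python
import numpy as np

# v_1, ..., v_k in R^n written as the columns of a matrix:
# that matrix *is* the linear map R^k -> R^n sending e_i to v_i.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])   # n = 3, k = 2
n, k = A.shape
r = np.linalg.matrix_rank(A)

print(r == k)       # True:  injective   <=> the v_i are linearly independent
print(r == n)       # False: surjective  <=> the v_i span R^3
print(k == n == r)  # False: isomorphism <=> the v_i form a basis of R^3
```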
The meta-point of it all is that the students should think of an isomorphism $V \cong \mathbb{R}^n$ as a point of view: a way of looking at $V$. So being able to change one's point of view to suit the circumstances is a very good skill. And that applies to bases as well: thinking of a basis as a set of elements is useful for actually going out and finding one, but thinking of it as an isomorphism $V \cong \mathbb{R}^n$ is very useful once you know it exists.
It's probably pretty obvious from the above, but I'll say it anyway. In this, I do consider $\mathbb{R}^n$ to have an obvious choice of basis. I mean, I'd choose it, wouldn't you? There's also a bit of category theory sitting underneath all of this which recognises that $\mathbb{R}^n$ is the free vector space on $\{1,...,n\}$. Also, later in the semester I'll talk about orthonormal bases of (separable) Hilbert spaces. Only I won't, I'll talk about isometric isomorphisms $H \cong \ell^2$.
For more on how I'm doing this, you can look at my recent lectures (here) and at some other details on our course wiki, in particular the page on dimension.
(On that, I don't think that all information has to be given in lectures. I think that students should be expected to fill in some gaps by themselves. To forestall a slurry of comments about how I'm encouraging Bad Teaching Again, the gaps should be specifically chosen by the lecturer for this purpose rather than being what the lecturer happens to forget to say.)
I suggest the article "A Circular Argument" (Fred Richman, The College Mathematics Journal, Vol. 24, No. 2, Mar. 1993, pp. 160–162); it may be relevant to your questions. It suggests that (a variant of) the limit $\lim_{x\to 0}\frac{\sin{x}}{x}=1$ is important to the area result of Archimedes which you mention, and that the reasoning may be ... circular. Here is a freely available version.
Revised version: I think that it is a bit subtle. The right question might be: who first treated this as a question that could make sense? The answer to that is probably Archimedes. Once you have that (in an acceptably defined way), the result may not be that hard.
Consider first questions simply of inequalities. If a circle is inscribed in a square, then Euclid would agree that the area of the circle is less than that of the square, because the whole is greater than the part. But Euclid never says that the perimeter of the square is greater than the circumference, because they are different kinds of things.

Mark Saphir notes that in Book VI Proposition 33, Euclid proves that in circles of equal radii the lengths of two arcs are in the same proportion as the (central) angles cutting them off. Sticking to one circle with centre $O$ for now, we understand what it would mean to say that $\angle AOB < \angle COD$, or that $\stackrel{\frown}{AB} < \stackrel{\frown}{CD}$, and also what it would mean to say that one is twice the other. Hence we have that proposition: $\frac{\angle AOB}{\angle COD}=\frac{\stackrel{\frown}{AB}}{\stackrel{\frown}{CD}}$. (But $\frac{\angle AOB}{\stackrel{\frown}{AB}}=\frac{\angle COD}{\stackrel{\frown}{CD}}$ would not make sense, since an angle and an arc are not comparable.)

Again, Euclid could describe the situation in which the radius of one circle is twice that of another, and would even agree that the area of the second is four times that of the first. However, he would not say that the circumference of the second is larger than that of the first (let alone twice as much).
Archimedes introduces the concept of concavity and the postulate:
If two plane curves $C$ and $D$ with the same endpoints are concave in the same direction, and $C$ is included between $D$ and the straight line joining the endpoints, then the length of $C$ is less than the length of $D$.
This is intuitive (as befits a postulate) but is not obvious. With this in hand he can say that, for a circle of diameter $d$, the circumference $C$ satisfies $p < C < P$, where $p$ and $P$ are the perimeters of polygons (of some number of sides; he used 96) inscribed in and circumscribed about that circle. If this is granted then $p/d < C/d < P/d$, and, because the perimeters of similar polygons scale with the diameter, the bounds $p/d$ and $P/d$ do not depend on which circle was chosen. Implicitly, letting the number of sides increase, $C/d$ must be independent of the circle as well.
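For a quick numerical illustration of those bounds (a modern sketch only: it leans on trigonometric functions, which is emphatically not how Archimedes computed; he used a side-doubling recurrence starting from a hexagon), note that for a regular $n$-gon the inscribed and circumscribed perimeters divided by the diameter are $n\sin(\pi/n)$ and $n\tan(\pi/n)$:

```python
import math

def bounds(n):
    """Perimeter/diameter for regular n-gons inscribed in and
    circumscribed about the same circle."""
    inscribed = n * math.sin(math.pi / n)       # p / d
    circumscribed = n * math.tan(math.pi / n)   # P / d
    return inscribed, circumscribed

for n in (6, 12, 24, 48, 96):
    p, P = bounds(n)
    print(f"n = {n:3d}:  {p:.6f} < C/d < {P:.6f}")

# n = 96 gives roughly 3.141032 < C/d < 3.142715, consistent with
# Archimedes' bounds 3 + 10/71 < pi < 3 + 1/7.
```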
Here we see the idea of arc length (for convex curves) as the limit of the lengths of inscribed polygonal paths (or perhaps the common limit, if it can be demonstrated, of inscribed and tangential paths).
Since this has already been bumped, I'll copy godelian's reference into an answer: