To prove L'Hôpital's rule, the standard method is to use Cauchy's Mean Value Theorem (and note that once you have Cauchy's MVT, you don't need an $\epsilon$-$\delta$ definition of limit to complete the proof of L'Hôpital). I'm assuming that Cauchy was responsible for his MVT, which means that Bernoulli didn't know about it when he gave the first proof. So what did he do instead?
[Math] How did Bernoulli prove L’Hôpital’s rule
differential-calculus, ho.history-overview, real-analysis
Related Solutions
This is a good question, given the way calculus is currently taught, which for me says more about the sad state of math education than about the material itself. All calculus textbooks and teachers claim that they are trying to teach what calculus is and how to use it. However, in the end most exams mostly test the students' ability to turn a word problem into a formula and to find the symbolic derivative of that formula. So it is not surprising that virtually all students, and not a few teachers, believe that calculus means symbolic differentiation and integration.
My view is almost exactly the opposite. I would like to see symbolic manipulation banished from, say, the first semester of calculus. Instead, I would like to see the first semester focused purely on what the derivative and definite integral (not the indefinite integral) are and what they are useful for. If you're not sure how this is possible without all the rules of differentiation and antidifferentiation, I suggest you take a look at the infamous "Harvard Calculus" textbook by Hughes-Hallett et al. This, for me, despite all the furor it created, is by far the best modern calculus textbook out there, because it actually tries to teach students calculus as a useful tool rather than as a set of mysterious rules that miraculously solve a canned set of problems.
I also dislike introducing the definition of a derivative using standard mathematical terminology such as "limit" and notation such as $h\rightarrow 0$. Another achievement of the Harvard Calculus book was to write a math textbook in plain English. Of course, this led to severe criticism that it was too "warm and fuzzy", but I totally disagree.
Perhaps the most important insight that the Harvard Calculus team had was that the key reason students don't understand calculus is because they don't really know what a function is. Most students believe a function is a formula and nothing more. I now tell my students to forget everything they were ever told about functions and tell them just to remember that a function is a box, where if you feed it an input (in calculus it will be a single number), it will spit out an output (in calculus it will be a single number).
Finally (I could write on this topic for a long time; if for some reason you want to read more, just google my name with "calculus"), I dislike the word "derivative", which provides no hint of what a derivative is. My suggested replacement name is "sensitivity". The derivative measures the sensitivity of a function: how sensitive the output is to small changes in the input. It is given by a ratio whose denominator is the change in the input and whose numerator is the induced change in the output. With this definition, it is not hard to show students why knowing the derivative can be very useful in many different contexts.
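The "sensitivity" idea above can be sketched in a few lines of Python (my own illustration, not part of the original answer): feed the box a slightly changed input and compare the induced change in the output to the change in the input.

```python
def sensitivity(f, x, dx=1e-6):
    """Approximate the sensitivity of the 'box' f at input x:
    (induced change in output) / (change in input)."""
    return (f(x + dx) - f(x)) / dx

# For the squaring box, the sensitivity at x = 3 should be close to 6,
# since a small input change dx induces an output change of about 6*dx.
print(sensitivity(lambda x: x**2, 3.0))
```

The function name `sensitivity` is of course hypothetical; the point is only that the definition can be turned directly into a computation, with no symbolic rules involved.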
Defining the definite integral is even easier. With these definitions, explaining what the Fundamental Theorem of Calculus is and why you need it is also easy.
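To make the claim about the definite integral concrete, here is a minimal sketch (again my own, under the same "function as a box" reading): the definite integral as a sum of output times input-step, checked against the value the Fundamental Theorem of Calculus predicts.

```python
def riemann_sum(f, a, b, n=10000):
    # Left Riemann sum: accumulate f(x) * dx over n equal steps from a to b.
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

# The FTC predicts the integral of x^2 over [0, 1] is 1/3; the sum
# should come out close to 0.3333 for large n.
print(riemann_sum(lambda x: x**2, 0.0, 1.0))
```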
Only after I have made sure that students really understand what functions, derivatives, and definite integrals are would I broach the subject of symbolic computation. What everybody should try to remember is that symbolic computation is only one and not necessarily the most important tool in the discipline of calculus, which itself is also merely a useful mathematical tool.
ADDED: What I think most mathematicians overlook is how large a conceptual leap it is to start studying functions (which are really processes) as mathematical objects, rather than just numbers. Until you give this leap its due respect and take the time to guide your students carefully through it, your students will never really appreciate how powerful calculus really is.
ADDED: I see that the function $\theta\mapsto \sin\theta$ is being mentioned. I would like to point out a simple question that very few calculus students and even teachers can answer correctly: Is the derivative of the sine function, where the angle is measured in degrees, the same as the derivative of the sine function, where the angle is measured in radians? In my department we audition all candidates for teaching calculus and often ask this question. So many people, including some with Ph.D.s from good schools, couldn't answer this properly that I even tried it on a few really famous mathematicians. Again, the difficulty we all have with this question is for me a sign of how badly we ourselves learn calculus. Note, however, that if you use the definitions of function and derivative I give above, the answer is rather easy.
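A quick numerical check of the degrees-versus-radians question (my sketch, not the answer's): the sine-in-degrees box is a different function from the sine-in-radians box, so their sensitivities at the same input differ, by a factor of $\pi/180$.

```python
import math

def sin_deg(theta):
    # the "sine of an angle in degrees" box: convert to radians, then apply sine
    return math.sin(math.radians(theta))

h = 1e-6
# slope of sine-in-radians at 0 is cos(0) = 1
slope_rad = (math.sin(0 + h) - math.sin(0)) / h
# slope of sine-in-degrees at 0 picks up the conversion factor pi/180
slope_deg = (sin_deg(0 + h) - sin_deg(0)) / h
print(slope_rad, slope_deg, math.pi / 180)
```

So the two derivatives are not the same: the degrees version is the radians version scaled by $\pi/180 \approx 0.01745$.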
In the manuscript "Determinationum progressio in infinitum" (pp. 668-675 of Sämtliche Schriften und Briefe, Reihe VII, Band 3, Teil C, available in pdf here), Leibniz writes on p. 673 (with "$\sqcap$" in place of "$=$"):
$$ \odot = \overline{dt}\int\frac{a^2}{a^2 + t^2}. \quad\text{Hence}\quad \overline{d\odot} = \frac{a^2}{a^2 + t^2}\overline{d\overline{dt}} $$
This amounts to asserting that $d[uv] = dv\,du$ where $u=dt$ and $v=\int\frac{a^2}{a^2+t^2}$; and thus differentiating the product wrong, as the editors comment in footnote 14. On p. 668 they take this as grounds to date the manuscript early November 1675, since by November 11 he was pointing out this error (in "Methodi tangentium inversae exempla", quoted by Edwards in KConrad's comment above).
Addendum: The first time Leibniz gets his general rule right appears to be in "Pro methodo tangentium inversa et aliis tetragonisticis specimina et inventa" (dated 27 November 1675; pp. 361-371 of the same Sämtliche Schriften, Reihe VII, Band 5, Teil B; English translation here), where he writes on p. 365:
Therefore $d\overline xy = d\overline{xy}-xd\overline y$. Now this is a really noteworthy theorem and a general one for all curves.
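In modern notation (my gloss, not the manuscript's), Leibniz's "noteworthy theorem" is just the product rule rearranged, with his overbar playing the role of parentheses:

$$ d(xy) = x\,dy + y\,dx \quad\Longrightarrow\quad y\,dx = d(xy) - x\,dy, $$

which is exactly what $d\overline xy = d\overline{xy}-xd\overline y$ expresses, and corrects the $d[uv]=dv\,du$ error of the earlier manuscript.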
Best Answer
L'Hôpital's rule was first published in Analyse des Infiniment Petits.
According to The Historical Development of The Calculus by Edwards (p. 269),
Edit. J.L. Coolidge explains in The Mathematics of Great Amateurs (see pp. 159-160 of the 2nd edition) that L'Hôpital was interested in calculating
$$\lim\limits_{x\to a}\frac{\sqrt{2a^3x-x^4}-a\sqrt[3]{a^2x}}{a-\sqrt[4]{ax^3}}=\frac{16}{9}a.$$
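As a sanity check (a sketch of mine, not from Coolidge), one can evaluate the quotient numerically just to the right of $x = a$ with $a = 1$; the ratio should approach $16/9 \approx 1.778$:

```python
# Numerical check of L'Hopital's original example with a = 1:
# (sqrt(2 a^3 x - x^4) - a (a^2 x)^(1/3)) / (a - (a x^3)^(1/4))
def quotient(x, a=1.0):
    num = (2 * a**3 * x - x**4) ** 0.5 - a * (a**2 * x) ** (1 / 3)
    den = a - (a * x**3) ** 0.25
    return num / den

# Both numerator and denominator vanish at x = a, but the ratio
# near x = 1 should be close to 16/9.
print(quotient(1 + 1e-6), 16 / 9)
```

This agrees with the stated value: the one-sided derivatives at $x=a$ are $-4/3$ for the numerator and $-3/4$ for the denominator, whose quotient is $16/9$.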