This was supposed to be a comment, but it is too long and there may be lessons in it from what I've experienced.
This is a great question (I would add a bounty from my own rep if this wasn't community wiki), and it is also a very personal issue that I struggle with myself. I have a similar style of study to the one you described when it comes to things I am really interested in, rather than things that happen to be part of the syllabus on an undergraduate/graduate course I am taking (those subjects falling into the latter category tend to all get the same treatment from me -- go to the lecture, absorb the main ideas, briefly look at my notes later to see they make sense, then ignore it until I need it for some problem set or an examination).
However, I find that your style of study (call it the hard grift method) means that I am often a little behind classes or lectures, despite the fact that I am trying to pursue a deeper understanding of the material I enjoy. Unfortunately, the style of exam questions means that this level of understanding rarely helps. One can ask oneself whether it is worth the trouble, and ultimately the answer to that question depends on what you want to get out of your learning.*
I am keenly aware of the approach of what Stefan Walter above calls the 'superficial' mathematician, which I do not see as at all disparaging (and for the record, I do not think he does either). I do not really believe in innate talent, but there are many mathematicians smarter than I who seem to pick up just as much knowledge as I might from the hard grift method, by instead coasting from an article to a textbook, to a set of exercises, to a pre-print, while making minimal notes and seemingly picking up the salient points naturally, and then having a fruitful discussion with others about their new findings almost immediately (call this the flowing method). From what I read of Terence Tao's blog this is the natural progression from an undergraduate mathematician to a so-called 'post-rigorous' mathematician.
The flowing method seems to reap more benefits, but it also doesn't seem to be a ticket you can buy. I have a few friends already on their PhDs (I am about to finish my humble MMath) and, without wanting to make this sound like a cop-out, their brains seem to work in a different way to mine. It may very well be the case that I am yet to make the transition because I have not yet put in the hours, but I believe that 'putting in the hours' boils down to passion. If you aren't passionate about what you are studying, you won't put in effective hours, and you won't make the transition to post-rigour.
(Aside: I would like to think that one day I might make that transition, but as it stands, I am not sure the life of a professional mathematician is for me!)
*To answer your question succinctly: you need to find out what it is you want out of your learning. If it is pure mastery, an effective method for you to try might be: stick to hard grift for a little while, but if you find you've reached a level where your intuition is guiding you more than rigour is, then stop and evaluate, and consider taking your learning to a higher level where the details of proofs are not the most important thing any more. In particular, re-read Terry Tao's post in the above link.
If, however, you enjoy learning for its own sake and want to pursue personal understanding (which I think the hard grift method is best suited for) then you should always keep this goal in mind. Personal understanding is more gratifying than pure mastery; it should be the goal of any true autodidact (see the last section of this great article by William Thurston).
Your question is way too many questions in one for this website. Just FYI. Anyway...
"... real mathematics is usually done where you understand 2 or 3 pages a day of a text on your first reading of it. Is this true for graduate-level work for the average student??"
- This completely depends on what the reading is and the person's background in it. Learning something completely new? Then yes, it's probably true.
"Do students who follow my immersive way of studying tend to have an advantage over those who don't when we get to grad school?"
- Sure, students who know more math upon entering graduate school have an advantage over the students who were content with just the details given in class. It's probably the self-motivation of the student more so than the actual knowledge that puts the student at an advantage to do well.
"In terms of learning theory and doing exercises, how much importance is recommended I place on each?"
- Exercises confirm that theory was actually learned and understood. If you are doing exercises and you find them "easy", then you probably have a very solid grasp of the theory. Do what feels right. Learn some theory, try some exercises, then go back and see if you understand the theory.
"Are there study techniques used in more advanced math courses (like engaging in discourse with peers, focusing more on memorization before attempting to do problem sets, taking notes in a particular way) that are more fruitful than others?"
- I've found that reading the book and taking notes (well) before class and then paying close attention in class is almost always sufficient to understand the material. I've also found that graduate students often do not keep up with this regimen of reading before class, whether due to workload or a general dislike of the material.
In summary, do what feels right, you'll learn a lot if you stay motivated, and enjoy.
Best Answer
There are lots of ways to partition one's mathematical ability into stages to gauge your progress and readiness for something deeper. The stages are not strictly partitioned, so it is possible to kind of move back and forth between different stages, and you may be at different stages with different areas of study. Here is one such way that applies in this context.
In the first stage, one learns how to compute. There is no theory, no generalities. There are numbers, equations, matrices, etc., and at this point, the student's sole job is to get good at moving them around. This is the portion of your mathematics upbringing from when you were a small child learning to count, up through whenever you started doing serious mathematics.
In the second stage, you learn what a proof is and how to construct one. You learn how to prove standard theorems, and every class you take will prove every theorem, except possibly for some very difficult ones that are outside the scope of the class. You are expected to learn how these proofs go and how they relate to one another, and you may even be asked to reproduce them. Most people do this for pretty much all of their undergraduate studies.
In the third stage, instead of learning proofs, you now learn proof techniques. By having a large swath of ideas at your disposal, you no longer need to see every detail of a proof to know how it works (unless it's really technical). When you see a theorem, you don't just rattle off the steps of the proof you have memorized. You think about what kinds of ingredients go into the proof, and you stitch them together. This is not to say that you will be able to reproduce every theorem you ever learned flawlessly - of course people forget things all the time - but it means that knowing what you know about the subject, you can rediscover and reassemble the proof on the fly, even if it means you trip from time to time.
I would say that in all probability from your question, you are in the second stage, and asking about what it means to go into the third stage. Do you have to know every single argument to every theorem to be a good mathematician? No, of course not. Learning every proof to every theorem would take years, and you would never get to advance. On the other hand, the only way to get from the second stage to the third stage is to learn as many proofs as you can, and to analyze them thoroughly. Find their crucial steps, the fundamental things that hold them together, and reflect on them. Then, advancing to the third stage is all about looking for ways to use those fundamental observations for other problems.
Hope this helps!