Textbook: Real and Complex Analysis by Walter Rudin
Explanation: Chapters 1, 2, 3, 6, 7 and 8 constitute an excellent general treatment of measure theory. Let me elaborate:
Chapter 1: The notions of an abstract measure space and an abstract topological space are introduced and studied in parallel. This treatment allows the reader to see the close connections between the two subjects that appear both in practice and in theory. Elementary examples and properties of measurable functions and measures are discussed. Furthermore, Lebesgue's monotone convergence theorem, Fatou's lemma, and Lebesgue's dominated convergence theorem are proven in this chapter. Finally, the chapter discusses consequences of these results. The elegance of the treatment allows the reader to quickly become accustomed to the basic theory of measure.
Chapter 2: This chapter delves further into the intimate connection between topological and measure theoretic notions. More specifically, the chapter begins with a treatment of some important results in general topology such as Urysohn's lemma and the construction of partitions of unity. Afterwards, these results are applied to establish the Riesz representation theorem for positive linear functionals. The proof of this result is long but is nonetheless carefully broken into small steps and the reader should find little or no difficulty in understanding each of these steps. The Riesz representation theorem is applied in a particularly elegant manner to the theory of positive Borel measures. Finally, the existence and basic properties of the Lebesgue measure are shown to be a virtually trivial consequence of the Riesz representation theorem. The chapter ends with a nice set of exercises that discusses, in particular, some interesting counterexamples in measure theory.
Chapter 3: The basic theory of $L^p$ spaces ($1\leq p\leq \infty$) is introduced. The chapter begins with an elementary treatment of convex functions. Rudin explains that many elementary inequalities in analysis may be established as easy consequences of the theory of convex functions, and evidence is provided for this claim. In particular, Hölder's and Minkowski's inequalities are proven. These results culminate in the proof that the $L^p$ spaces are indeed complex vector spaces. The completeness of the $L^p$ spaces and various important density results are also discussed.
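To give a flavour of the convexity argument, here is the standard sketch of how Hölder's inequality follows from the convexity of the exponential function (the notation is mine and the details may differ from Rudin's presentation). Let $1<p,q<\infty$ with $\frac{1}{p}+\frac{1}{q}=1$, and let $a,b>0$. Writing $a=e^{s/p}$ and $b=e^{t/q}$, convexity of $\exp$ gives

$$ab = e^{\frac{s}{p}+\frac{t}{q}} \leq \frac{1}{p}e^{s}+\frac{1}{q}e^{t} = \frac{a^p}{p}+\frac{b^q}{q}.$$

Assuming $0<\|f\|_p<\infty$ and $0<\|g\|_q<\infty$, apply this pointwise with $a=|f(x)|/\|f\|_p$ and $b=|g(x)|/\|g\|_q$ (the inequality is trivial where either vanishes) and integrate:

$$\frac{1}{\|f\|_p\|g\|_q}\int_X |fg|\,d\mu \leq \frac{1}{p}+\frac{1}{q} = 1, \qquad\text{that is,}\qquad \int_X |fg|\,d\mu \leq \|f\|_p\,\|g\|_q.$$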
Chapter 6: This chapter discusses the theory of complex measures, and in particular, the Radon-Nikodym theorem. Von Neumann's proof of the Radon-Nikodym theorem is presented and various consequences are discussed, ranging from the characterization of the dual of the $L^p$ spaces ($1\leq p< \infty$) to the Hahn decomposition theorem. These results culminate in the proof of the Riesz representation theorem for bounded linear functionals. A knowledge of chapters $4$ and $5$ is necessary in this chapter, although they do not strictly cover measure theory. Uniform integrability and the Vitali convergence theorem are treated in the exercises at the end of the chapter.
Chapter 7: The main topic of this chapter is Fubini's theorem. A wealth of nice counterexamples is discussed and an important application is presented: the result that the convolution of two functions in $L^1$ is again in $L^1$. A wonderful feature of this treatment is the generality; the result is established in one of the most general forms possible.
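For the convolution result, the computation behind it can be sketched as follows, here for $f,g\in L^1(\mathbb{R})$ with $(f*g)(x)=\int f(x-y)g(y)\,dy$. This is the usual outline rather than a reproduction of Rudin's proof, and it takes for granted that $(x,y)\mapsto f(x-y)g(y)$ is measurable on $\mathbb{R}^2$, which is the step that needs care. Since the integrand is non-negative, the order of integration may be interchanged, and translation invariance of Lebesgue measure gives

$$\int |(f*g)(x)|\,dx \leq \int\!\!\int |f(x-y)|\,|g(y)|\,dy\,dx = \int |g(y)|\left(\int |f(x-y)|\,dx\right)dy = \|f\|_1\,\|g\|_1.$$

In particular, the integral defining $(f*g)(x)$ converges absolutely for almost every $x$, and $\|f*g\|_1\leq\|f\|_1\|g\|_1$.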
Chapter 8: This chapter treats differentiation of measures and the Hardy-Littlewood maximal function, which is an important tool in analysis. A number of applications are presented, ranging from a proof of the change of variables theorem in Euclidean $n$-space (in a very general form) to a treatment of functions of bounded variation and absolute continuity. Several results from this chapter are also used later in the book; most notable is the use of the differentiation theorem of measures in the study of harmonic functions in chapter 11.
Let me summarize with some general comments regarding the book:
Prerequisites: A good knowledge of set-theoretic notions, continuity and compactness suffices for the chapters that I have described. At least a rudimentary knowledge of differentiation and uniform convergence is very helpful at times. One need not be acquainted with the theory of the Riemann integral beforehand, although one should at least be acquainted with its computation. In short, a knowledge of chapters 1, 2, 3, 4 and 7 of Rudin's earlier book Principles of Mathematical Analysis is advisable before one reads this textbook.
Exercises: The exercises in this textbook are wonderful. Many of the exercises build an intuition of the theory and applications treated in the text and therefore it is advisable to do as many exercises as possible. However, you should expect to work to solve a few of the exercises. A number of important concepts such as convergence in measure, uniform integrability, points of density, Minkowski's inequality for convolution, inclusions between $L^p$ spaces, Hardy's inequality etc. are treated in the exercises. However, if you are truly stuck you will find that many of these results are either theorems or exercises with detailed hints in other textbooks. (E.g., Folland's Real Analysis.)
Content: I have already described the content in some detail, but let me say that it is almost exactly what one needs to study branches of mathematics where measure theory is applied. Of course, this assumes that one at least attempts as many exercises as possible, since a number of important results (from probability theory, for example) are treated in the exercises.
Style: The proofs in Rudin are (with possibly minor exceptions) complete. Unlike a number of other mathematics textbooks, Rudin prefers not to leave any parts of proofs to the reader and instead focusses on giving the reader non-trivial exercises as practice at the end of each chapter. The book reads magnificently and the flow of results is excellent; almost all results are stated in context. It is fair to say that the main text of the book lacks examples, which is perhaps one of the few complaints students have, but the exercises do contain examples. Finally, the book is rigorous and is completely free of mathematical errors.
I hope this review of Rudin's Real and Complex Analysis is helpful! I have read virtually the entire book (over $4$ months) and I found it to be one of the most enjoyable experiences of my life. It really motivated me to delve deeper into analysis. Perhaps the same will be true for you. I certainly recommend this book with my deepest enthusiasm.
I believe the answer should depend on your background, your aspirations, and whether you want a theoretical or applied reference.
In my opinion, a very good book which covers basic measure theory and discusses various types of stochastic processes, such as Markov processes, Lévy processes and Brownian motion, is: E. Çinlar, Probability and Stochastics, Springer, 2011. It also has exercises in almost every (no pun intended) section. I have found this book particularly helpful and comprehensive, and it would be my #1 recommendation.
My second recommendation is a more advanced text which would be suitable for either advanced undergraduate or graduate students. This is: D.A. Levin, Y. Peres and E.L. Wilmer, Markov Chains and Mixing Times, 2009. Although the material this book presents is quite advanced, the presentation is accessible and is accompanied by many examples. At the end of every section you can find exercises.
I also very much like the lecture notes of Prof. Oliver Knill, Probability and stochastic processes with applications, Harvard Math. Dept., 2008. These notes are replete with nice examples and exercises. Chapter 3 is devoted to discrete-time stochastic processes, and only a small part of it focuses on Markovian processes, which are treated in a more general context and not as a standalone topic.
A good resource for exercises is the book: D. Gusak, A. Kukush, A. Kulik, Y. Mishura and A. Pilipenko, Theory of stochastic processes with applications to financial mathematics and risk theory, Springer 2010. In Chapter 10, "Markov chains: discrete and continuous time", they give 90 exercises and for lots of them they offer hints. In the whole book, they offer a very concise overview of the pertinent theory followed by a torrent of exercises. Markov chains aside, this book also presents some nice applications of stochastic processes in financial mathematics and features a nice introduction to risk processes.
In case you are more interested in stochastic control, there is an old book from 1971 by H. Kushner that is considered a standard reference (I've seen it cited in many papers). The citation is: Kushner, Introduction to stochastic control, Holt, Rinehart and Winston, 1971. It has many exercises and examples, and the author focuses mainly on Markov models.
Although you have explicitly asked for a book with lots of exercises, I cannot help but mention: O.L.V. Costa, M.D. Fragoso and R.P. Marques, Discrete-time Markov Jump Linear Systems, Springer 2005. The book offers a rigorous treatment of discrete-time MJLS with lots of interesting and practically relevant results.
Finally, if you are interested in algorithms for simulating or analysing Markov chains, I recommend: O. Häggström, Finite Markov Chains and Algorithmic Applications, London Mathematical Society, 2002. There you can find many applications of Markov chains and lots of exercises.
Best Answer
"One Thousand Exercises in Probability" by Grimmett and Stirzaker is a possible suggestion, though IMHO not as promising as it sounds. Some pros and some cons:
(My) conclusion: It will certainly help you but keep looking around.
For books with clear, well-written solutions, you could also check Hoel, Port and Stone, "Introduction to Probability Theory", and Bertsekas and Tsitsiklis, "Introduction to Probability, 2nd Edition". The solutions may also be found online (certainly for the second one). But these cover more basic subjects.