I believe the opposite of your conclusion is actually true. The Disposable Academic gives several pointers about the low wage premium that PhD holders in applied math, math, and computer science earn over master's degree holders. In part, this is because companies are realizing that master's degree holders usually have just as much theoretical depth and better programming skills, and that they are more pliable and can be trained for the company's specific tasks. It's not easy to get an SVM disciple, for instance, to appreciate a company infrastructure that relies on, say, decision trees. Often, when someone has dedicated tons of time to a particular machine learning paradigm, they have a hard time carrying their productivity over to other domains.
Another problem is that a lot of machine learning jobs these days are all about getting things done, and not so much about writing papers or developing new methods. You can take a high-risk approach to developing new mathematical tools, studying the VC-dimension aspects of your method, its underlying complexity theory, etc. But in the end, you might not get something that practitioners will care about.
Meanwhile, look at something like poselets. Basically no new math arises from poselets at all. The approach is inelegant and clunky, and it lacks any mathematical sophistication. But it scales up to large data sets amazingly well, and it's looking like it will be a staple in pose recognition (especially in computer vision) for some time to come. Those researchers did a great job and their work is to be applauded, but it's not something most people associate with a machine learning PhD.
With a question like this, you'll get tons of different opinions, so by all means consider them all. I am currently a PhD student in computer vision, but I've decided to leave my program early with a master's degree, and I'll be working for an asset management company doing natural language machine learning, computational statistics, etc. I also considered ad-based data mining jobs at several large TV companies, and a few robotics jobs. In all of these domains, there are plenty of jobs for someone with mathematical maturity and a knack for solving problems in multiple programming languages. Having a master's degree is just fine. And, according to that Economist article, you'll be paid basically just as well as someone with a PhD. And if you work outside of academia, bonuses and reaching promotions sooner than someone who spends extra years on a PhD can often mean your overall lifetime earnings are higher.
As Peter Thiel once said, "Graduate school is like hitting the snooze button on the alarm clock of life..."
Best Answer
My recommendation: Start here: Measure Theory Made Ridiculously Simple.
Then buy and read Burrill. I know it's old, but it's super inexpensive on Amazon and a really good read. It covers the basics of real analysis and probability theory.