Solved – Is MATLAB/Octave widely used for prototyping in the ML/data science industry?

machine learning, MATLAB, python

In the second lesson on Machine Learning (https://www.coursera.org/learn/machine-learning/lecture/olRZo/unsupervised-learning), Prof. Andrew Ng from Stanford (https://en.wikipedia.org/wiki/Andrew_Ng) mentions that Matlab/Octave is widely used in the machine learning industry for prototyping.
I did quite a bit of research before settling on learning Python, as it seemed more applicable to real-life problems. I last used Matlab about 2 years ago, and I am wondering whether I should really go back to it because of his statement.

However, I have also heard another argument: Matlab/Octave is still used in this course because the course started in 2011, when Python was not as popular or widely used in ML. As a result, most of the algorithms were hard to find in Python or had to be hand-coded.

I know a lot of you work in the ML/data science industry, so I was wondering:

  1. Is Matlab/Octave really that widely used in the ML/data science industry?

  2. If so, why, especially since numpy/pandas have a lot of matrix algebra capabilities?

Best Answer

While Matlab certainly remains a primary tool in much of academic science and engineering, I do not see it used extensively in data science. The primary reason, as I see it, is the extensive use of data frames and reference-by-name in the R and Python (pandas) ecosystems. Matlab is designed to work with matrices, and while you can get Matlab to work with tables and group by categorical variables (e.g., varfun), it's often terribly cumbersome and less intuitive. R and Python employ a syntax that is conducive to thinking-and-coding-as-you-go, almost like writing a data story. Matlab becomes quite verbose in this context, and often requires multi-line solutions for problems R/Python can attack with a fraction of the text (though perhaps at double or triple the run time).

To Matlab's credit, data exploration isn't its primary use case. If you want to do serious optimization and simulation, you'll age waiting for R to complete, while Matlab barely sweats. But if you want to explore and model your data in a thoughtful, principled way, R and Python are often better suited, in my opinion. To each their own, but Matlab just wasn't designed for the types of tasks data scientists face.
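
To illustrate the point about data frames and grouping by a categorical variable, here is a minimal pandas sketch. The data and the column names (`species`, `length`, `width`) are made up for illustration and do not come from the question or the course; it only shows the reference-by-name style described above.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "length": [1.0, 3.0, 2.0, 4.0, 6.0],
    "width": [0.5, 1.5, 1.0, 2.0, 3.0],
})

# Group rows by the categorical column and average the numeric columns,
# referring to every column by name rather than by matrix index.
summary = df.groupby("species")[["length", "width"]].mean()
print(summary)
```

The whole group-and-summarize step is a single chained expression, which is the kind of brevity the answer attributes to R and Python for exploratory work.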