Machine Learning – Difference Between Supervised Machine Learning and Design of Experiments

experiment-design, machine learning, mathematical-statistics

I'm an experimental physicist by training. I have used standard statistical methods to analyze data, and the design of experiments (DOE) framework to develop models of systems by varying inputs and measuring outputs.

Recently, I've been looking into machine learning, and I'm trying to figure out whether it offers any utility or benefit over DOE.

I'm hoping someone on this forum can either validate the way I'm thinking about supervised machine learning, or point out what I'm missing.

I've basically come to the conclusion that supervised machine learning is a method for computing the transfer function of a system, where the training data is a set of inputs paired with what the true output should be for each.

Setting aside the machinery that figures out the transfer function from the training set, what is the difference between DOE and supervised machine learning in terms of the accuracy, or any other performance measure, of the resulting transfer function?

Best Answer

Your question is difficult to answer because there is no "supervised ML algorithm". There are a large number of different ML algorithms that can be optimized in a supervised fashion, each with their strengths and weaknesses.

On a very abstract level, you can define machine learning (ML) as a search through some space $P$ for a parameterization $\theta$ of a given model $M$ such that, for input $x$, the output $M(x;\theta)$ minimizes a cost function $\mathcal{C}$ (though the minimum found is not always global). More formally:

$$\arg\min_{\theta\in P} \mathcal{C}(M(x;\theta))$$

For supervised learning, one form the cost function can take is (given $y$ as ground truth):

$$\mathcal{C}(M(x; \theta)) = ||M(x;\theta) - y||_2$$
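To make the notation concrete, here is a minimal sketch in Python (NumPy). The straight-line model, the synthetic data, and the names `model`, `cost`, and `fit` are all assumptions chosen for illustration; nothing here is specific to any particular ML algorithm. The search through $P$ is done with plain gradient descent on the mean squared residual, which has the same minimizer as the $L_2$ norm above.

```python
import numpy as np

def model(x, theta):
    """M(x; theta): a straight line, picked only as an illustrative model."""
    return theta[0] + theta[1] * x

def cost(theta, x, y):
    """C(M(x; theta)) = ||M(x; theta) - y||_2, with y as ground truth."""
    return np.linalg.norm(model(x, theta) - y)

def fit(x, y, lr=0.1, steps=2000):
    """Search P = R^2 by gradient descent on the mean squared residual.

    Squaring and averaging the norm does not move the minimizer; it only
    makes the gradient simpler to write down.
    """
    theta = np.zeros(2)
    for _ in range(steps):
        r = model(x, theta) - y
        grad = 2.0 * np.array([r.sum(), (r * x).sum()]) / len(x)
        theta -= lr * grad
    return theta

# Synthetic "training set": inputs x paired with noisy ground-truth outputs y.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.5 + 2.0 * x + rng.normal(scale=0.05, size=x.size)

theta_hat = fit(x, y)
print("fitted theta:", theta_hat, "cost:", cost(theta_hat, x, y))
```

Swapping `model`, `cost`, or the update rule in `fit` gives a different algorithm within the same framework.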

Any search through $P$ that minimizes the cost function fits in this framework, so you can claim DOE as an ML algorithm if you want. Specifically, an ML algorithm is defined by three things: the optimization technique employed, the model used, and the cost function. If you fill those in for DOE, you can start to compare it against other ML algorithms.
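As one possible way of filling in those three pieces for DOE, the sketch below uses a $2^2$ full factorial design in coded units, a linear model with an interaction term, a squared-error cost, and closed-form least squares as the optimizer. The `system` function is a hypothetical, noise-free stand-in for the physical experiment, and its coefficients are made up for illustration.

```python
import itertools
import numpy as np

# DOE decides *which* inputs to measure: a 2^2 full factorial in coded units.
design = np.array(list(itertools.product([-1.0, 1.0], repeat=2)))
x1, x2 = design[:, 0], design[:, 1]

def system(x1, x2):
    """Hypothetical system standing in for the real experiment (assumption)."""
    return 3.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2

# Ground truth y: one "measurement" per design point.
y = system(x1, x2)

# Model M(x; theta): intercept, two main effects, one interaction.
X = np.column_stack([np.ones(len(design)), x1, x2, x1 * x2])

# Optimization technique: closed-form least squares.
# Cost function: squared L2 distance between X @ theta and y.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated effects:", theta)
```

Seen this way, the part of DOE with no direct counterpart in the cost-minimization view is the choice of `design`: the experimenter picks the input points rather than receiving them as a fixed training set.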