Neural Networks – Understanding the Difference Between Error Rate and Empirical Risk

neural-networks, pac-learning, probability, terminology

Motivation

I am currently reading Understanding Machine Learning by Shalev-Shwartz and Ben-David. The book uses statistical terminology in its machine learning theory, and it is not clear to me how to reconcile the statistical terms with the terms used in coding practice. (I do not restrict my question to definitions from the book; they should only serve as a reference point for what I mean by statistical terms.)

I would like to be precise when I use the following terms that I encounter in statistics or in practice:

Question

Define:

error, loss, risk, empirical risk

error rate, accuracy

Using context:

I divide the sample dataset into a train set and a test set, then train on the train set and do the empirical risk (?) assessment on the test set. In practice, we often call that the error rate. However, in statistics the error is simply the distance between a single prediction and the correct value, and risk has something to do with an expected value.

To be more specific about my motivation (this is not part of my question) I would like to paraphrase agnostic PAC learnability such that the terms align both with usual statistic definitions and e.g. common pytorch coding practices. For this reason, I need to be precise with the terms above.

Can you help me on this quest?

Best Answer

I think ChatGPT 3.5 gave a comprehensive, concise, and correct answer. (I prompted it with the question on 26 Feb 2024.) Here it is:

Certainly! Let's clarify the terms:

  1. Error:

    • Statistical Perspective: The deviation of a prediction (or estimate) from the true value for a given data point.
    • Practical Perspective: Often used interchangeably with loss or empirical risk in machine learning. It represents the discrepancy between predicted and true values.
  2. Loss:

    • Statistical Perspective: A function that quantifies the difference between predicted and true values for a single data point.
    • Practical Perspective: Typically refers to the objective function that the model aims to minimize during training.
  3. Risk:

    • Statistical Perspective: The expected value of the loss over all possible data points, indicating the average performance of the model.
    • Practical Perspective: Similar to the statistical perspective but might be approximated using the empirical risk on a finite dataset.
  4. Empirical Risk:

    • Statistical Perspective: The average loss over the observed data points in a dataset.
    • Practical Perspective: Often referred to as training loss, calculated on the training set to update the model during training.
  5. Error Rate:

    • Statistical/Practical Perspective: Represents the proportion of incorrectly classified instances in a dataset. Commonly used in classification problems.
  6. Accuracy:

    • Statistical/Practical Perspective: The proportion of correctly classified instances in a dataset. It is the complement of the error rate.

In your context, when you divide the dataset into a train set and a test set, you compute the empirical risk (training loss) on the train set to update the model during training, and you evaluate the model's performance on the test set, typically reported as an error rate or accuracy. The key is to understand the statistical concepts and how they align with machine learning terminology in practice.
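To make these distinctions concrete, here is a minimal sketch (hypothetical data, plain Python rather than PyTorch) that computes a per-point loss, the empirical risk as the average loss over a sample, and the error rate and accuracy for a binary classifier thresholded at 0.5:

```python
import math

def loss(p, y):
    """Cross-entropy loss for a single data point:
    p = predicted P(y=1), y = true label in {0, 1}."""
    return -math.log(p) if y == 1 else -math.log(1 - p)

# Hypothetical predicted probabilities and true labels for 4 points.
preds = [0.9, 0.2, 0.7, 0.4]
labels = [1, 0, 0, 1]

# Empirical risk: the average loss over the observed sample
# (an estimate of the true risk, i.e. the expected loss).
empirical_risk = sum(loss(p, y) for p, y in zip(preds, labels)) / len(labels)

# Error rate: fraction of misclassified points (threshold at 0.5);
# accuracy is its complement.
errors = sum((p >= 0.5) != (y == 1) for p, y in zip(preds, labels))
error_rate = errors / len(labels)
accuracy = 1 - error_rate
```

Here the third and fourth points are misclassified, so the error rate is 0.5 and the accuracy 0.5, while the empirical risk is a real-valued average of the cross-entropy losses. Note that the (true) risk would replace the sample average with an expectation over the data distribution, which is exactly the quantity agnostic PAC learnability bounds.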