Solved – Contingency Table from ANOVA

anova

I need clarification on something I thought I had read was possible. I can't work out how to do it, so either I'm misremembering or I'm missing a key concept. I'm not a statistics person by nature, so something obvious may not be clicking; any assistance is appreciated.

Is it possible to construct a 2×2 contingency table (TP/FP/TN/FN) from an ANOVA table that contains the regression sum of squares (SSR), the error sum of squares (SSE), and the total sum of squares (SST), given a sample size of X and a regression equation?

If so, please provide a step-by-step example of how to do it.

Best Answer

The short answer is no, at least to my knowledge.

I think you are confused about a few things here, including what a contingency table is. A contingency table is nothing more than a table that displays frequency distributions. When you refer to TP, FP, TN, and FN, what you describe is a special contingency table known as a confusion matrix. A confusion matrix is (in binary situations) a 2×2 matrix that cross-tabulates predicted group memberships against actual group memberships. For example, with predicted classes in rows and actual classes in columns:

   A  B      A   B
A  5  2    A TP  FP
B  3  8    B FN  TN

This is typically used to evaluate the performance of a predictive algorithm. In this case, treating A as the positive class, there are 5 true positives, 2 false positives, 3 false negatives, and 8 true negatives. You can use this table to derive many other statistics (e.g. accuracy, kappa, sensitivity).
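As a rough illustration of deriving such statistics, here is a minimal Python sketch using the example counts. It assumes A is the positive class and that rows are predictions and columns are actual classes, as the TP/FP/FN/TN layout implies; the variable names are my own.

```python
# Counts from the example confusion matrix.
# Assumed layout: rows = predicted class, columns = actual class, A = positive.
tp, fp = 5, 2
fn, tn = 3, 8
n = tp + fp + fn + tn

accuracy = (tp + tn) / n          # share of all cases classified correctly
sensitivity = tp / (tp + fn)      # share of actual positives that were found
specificity = tn / (tn + fp)      # share of actual negatives that were found

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
p_observed = accuracy
p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f} kappa={kappa:.3f}")
```

Note that every one of these needs the four cell counts, which is exactly the information an ANOVA table does not carry.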

The generic contingency table is simply a representation of the distribution of groups in a population. For example (copied from Wikipedia):

         Right-handed  Left-handed  Total
Males    43            9            52
Females  44            4            48
Total    87            13           100

This is more or less raw data that you can use to test a hypothesis, for instance whether handedness is independent of sex. Provided the data meet each test's assumptions, options include the chi-squared test, Fisher's exact test, and the G-test, among others.
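For a sense of what such a test does with the counts, here is a small sketch of Pearson's chi-squared test of independence on the handedness table, written with only the Python standard library. It is an illustration under stated assumptions rather than a recommended workflow: no continuity correction is applied, and the p-value shortcut is valid only because a 2×2 table has one degree of freedom.

```python
import math

# Observed counts from the handedness table
# (rows: males, females; columns: right-handed, left-handed).
observed = [[43, 9], [44, 4]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum of (observed - expected)^2 / expected, where
# expected = row total * column total / n under independence.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# With 1 degree of freedom, the chi-squared upper-tail probability
# reduces to erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Library routines may give a somewhat different statistic for the same table; as far as I know, `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2×2 tables by default.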

This is different from ANOVA (which in this case would be two-way). ANOVA works with a quantitative response rather than just counts. Your ANOVA output is meant for interpreting how that quantitative variable is affected by the factor(s); it does not contain the per-observation predicted and actual classes that a confusion matrix needs.
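By contrast, here is a sketch of what the quantities in the question (SSR, SSE, SST, and the sample size) do let you compute for a regression: the coefficient of determination and the overall F statistic. The numbers are made up purely for illustration, and the single-predictor setup is an assumption on my part.

```python
# Hypothetical values, not taken from any real ANOVA table.
ssr = 120.0        # regression sum of squares
sse = 80.0         # error (residual) sum of squares
n = 30             # sample size
p = 1              # number of predictors (assumed: simple linear regression)

sst = ssr + sse    # total sum of squares

# Proportion of the response's variance explained by the regression.
r_squared = ssr / sst

# Overall F statistic: mean square for regression over mean square error.
ms_regression = ssr / p
ms_error = sse / (n - p - 1)
f_stat = ms_regression / ms_error

print(f"R^2 = {r_squared:.2f}, F({p}, {n - p - 1}) = {f_stat:.2f}")
```

These describe how well a continuous response is explained overall. They say nothing about how many individual cases would be classed as positive or negative, so TP/FP/TN/FN cannot be recovered from them.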