I am using AICc for model selection with a transformed continuous response. In one model I used $\log_{10}(PWV)$ as the response and in the other $\log(PWV)$, but I am not sure which one to choose when I compare the AICc values.
See below:

| Response | AICc | BIC |
|---|---|---|
| Log10(PWV) | -2745,2 | -2699,71 |
| Log(PWV) | 907,8648 | 953,3541 |

Should I choose the model whose AICc has the smallest absolute value, regardless of sign (that is, $\log(PWV)$), or the model with the smallest value (that is, $\log_{10}(PWV)$)?
I also get values of -2 LogLikelihood, and I am not sure whether I can use those for the comparison instead.
Extra information about my very exciting study:
I am using SAS JMP to fit my mixed model. My study aims to validate an apparatus that measures PWV (pulse wave velocity) and to show its reproducibility and intra-/interobserver variation.
We measured 8 minipigs, two observers, several times on each pig, and we repeated this 3 times on different days.
So I have an unbalanced data set, with pig as a random effect, observer (2 levels) and examination day (3 levels) as fixed effects, and then a lot of other variables.
My data are not normally distributed according to the Shapiro-Wilk test, so I tried transforming with log(PWV), log10(PWV) and root(PWV), and then fitting the models to the transformed data to see whether the residual error is normally distributed.
But I don't really understand how to do this in JMP, so I'm looking for an answer on how to do it. How do I check the residual error for normality in JMP?
Further, what do I use if my residuals aren't normally distributed? Should I use a Wilcoxon test, and is that fine with repeated measurements like mine?
This is the test with PWV not transformed.
***Response PWV***

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0,656081 |
| RSquare Adj | 0,651699 |
| Root Mean Square Error | 0,955595 |
| Mean of Response | 6,236935 |
| Observations (or Sum Wgts) | 796 |
| AICc | 2256,695 |
| BIC | 2317,064 |

Parameter Estimates

| Term | Estimate | Std Error | DFDen | t Ratio | Prob>\|t\| |
|---|---|---|---|---|---|
| Intercept | 5,9679039 | 0,665618 | 2,001 | 8,97 | 0,0122 |
| W[26] | -2,485353 | 1,603072 | 2,001 | -1,55 | 0,2611 |
| W[27,5] | 0,7102739 | 1,602099 | 1,996 | 0,44 | 0,7009 |
| W[28] | -0,069661 | 1,602321 | 1,997 | -0,04 | 0,9693 |
| W[28,5] | 1,3252744 | 1,60304 | 2,001 | 0,83 | 0,4953 |
| W[29] | 0,9563412 | 1,226491 | 1,995 | 0,78 | 0,5173 |
| E[1] | 0,5646209 | 0,061558 | 783 | 9,17 | <,0001 |
| E[2] | -0,444435 | 0,055241 | 783,1 | -8,05 | <,0001 |
| O[1] | -0,191662 | 0,039188 | 783 | -4,89 | <,0001 |
| E[1]*O[1] | 0,2489414 | 0,060013 | 783 | 4,15 | <,0001 |
| E[2]*O[1] | -0,25976 | 0,054522 | 783,1 | -4,76 | <,0001 |

REML Variance Component Estimates

| Random Effect | Var Ratio | Var Component | Std Error | 95% Lower | 95% Upper | Pct of Total |
|---|---|---|---|---|---|---|
| P[W] | 3,4790351 | 3,1769234 | 3,1893213 | -3,074031 | 9,4278783 | 77,674 |
| Residual | | 0,9131622 | 0,0461511 | 0,8290524 | 1,0108232 | 22,326 |
| Total | | 4,0900856 | | | | 100,000 |

-2 LogLikelihood = 2230,229647

Fixed Effect Tests

| Source | Nparm | DF | DFDen | F Ratio | Prob > F |
|---|---|---|---|---|---|
| W | 5 | 5 | 1,994 | 0,6658 | 0,6917 |
| E | 2 | 2 | 783,1 | 45,5276 | <,0001 |
| O | 1 | 1 | 783 | 23,9207 | <,0001 |
| E*O | 2 | 2 | 783,1 | 12,2427 | <,0001 |

Here is another test with transformed PWV.
***Response Log(PWV)***

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0,693231 |
| RSquare Adj | 0,689323 |
| Root Mean Square Error | 0,151507 |
| Mean of Response | 1,795197 |
| Observations (or Sum Wgts) | 796 |
| AICc | -634,624 |
| BIC | -574,255 |

Parameter Estimates

| Term | Estimate | Std Error | DFDen | t Ratio | Prob>\|t\| |
|---|---|---|---|---|---|
| Intercept | 1,7446075 | 0,109662 | 1,999 | 15,91 | 0,0039 |
| W[26] | -0,484959 | 0,26411 | 2 | -1,84 | 0,2078 |
| W[27,5] | 0,1478977 | 0,263961 | 1,995 | 0,56 | 0,6318 |
| W[28] | 0,01762 | 0,263995 | 1,996 | 0,07 | 0,9529 |
| W[28,5] | 0,2290124 | 0,264105 | 2 | 0,87 | 0,4773 |
| W[29] | 0,1623861 | 0,202078 | 1,994 | 0,80 | 0,5062 |
| E[1] | 0,082577 | 0,00976 | 783 | 8,46 | <,0001 |
| E[2] | -0,060043 | 0,008758 | 783,1 | -6,86 | <,0001 |
| O[1] | -0,03304 | 0,006213 | 783 | -5,32 | <,0001 |
| E[1]*O[1] | 0,0288572 | 0,009515 | 783 | 3,03 | 0,0025 |
| E[2]*O[1] | -0,03895 | 0,008644 | 783,1 | -4,51 | <,0001 |

REML Variance Component Estimates

| Random Effect | Var Ratio | Var Component | Std Error | 95% Lower | 95% Upper | Pct of Total |
|---|---|---|---|---|---|---|
| P[W] | 3,7578306 | 0,0862586 | 0,086597 | -0,083468 | 0,2559857 | 78,982 |
| Residual | | 0,0229544 | 0,0011601 | 0,0208401 | 0,0254093 | 21,018 |
| Total | | 0,109213 | | | | 100,000 |

-2 LogLikelihood = -661,0898412

Fixed Effect Tests

| Source | Nparm | DF | DFDen | F Ratio | Prob > F |
|---|---|---|---|---|---|
| W | 5 | 5 | 1,996 | 0,8590 | 0,6156 |
| E | 2 | 2 | 783,1 | 37,1920 | <,0001 |
| O | 1 | 1 | 783 | 28,2795 | <,0001 |
| E*O | 2 | 2 | 783,1 | 10,1548 | <,0001 |

In general the Residual by Predicted plot doesn't look very nice, but it isn't really bad either. What I want to know is how to see whether the residuals are normally distributed.
I hope to get some help.
Best Answer
(1) The base of the logarithm won't necessarily affect the structure of your model - changing the base is equivalent to using different units of measurement for the response. If the model is invariant to using different units of measurement then only the interpretation of coefficients changes when you transform.
(2) AICs are not comparable for models with different response variables: unless you can work out what the effect of a transformation will be on the likelihood, it's best to follow the usual rule of thumb and not compare AICs of models fitted to the same data set under different transformations of the response.
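
To make both points concrete, here is a minimal sketch in Python with NumPy on simulated data (not the PWV measurements). It assumes an ordinary least-squares fit with Gaussian errors and a maximum-likelihood variance estimate, which is simpler than the REML mixed model that JMP fits; the function `gaussian_fit` and all variable names are made up for illustration. Since $\log_{10}(y) = \ln(y)/\ln(10)$, the coefficients simply rescale by $1/\ln(10) \approx 0.434$, and the two AICs differ by the constant $2n\ln(\ln 10)$, which says nothing about fit. Adding the Jacobian term of each transformation puts every AIC on the scale of the untransformed response, where a comparison is meaningful.

```python
import numpy as np

def gaussian_fit(y, X):
    """Least-squares fit with Gaussian errors; returns (coefficients, AIC).

    Assumption for this sketch: ML variance estimate (RSS / n), no random effects.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n                      # ML estimate of the error variance
    neg2ll = n * (np.log(2 * np.pi * sigma2) + 1)   # -2 log-likelihood at the optimum
    return beta, neg2ll + 2 * (p + 1)               # +1 parameter for the variance

# Simulated positive response (hypothetical stand-in for PWV)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
pwv = np.exp(1.5 + 0.5 * x + rng.normal(0, 0.15, n))
X = np.column_stack([np.ones(n), x])

beta_raw, aic_raw = gaussian_fit(pwv, X)
beta_ln, aic_ln = gaussian_fit(np.log(pwv), X)
beta_log10, aic_log10 = gaussian_fit(np.log10(pwv), X)

# Jacobian terms: -2 * sum(log |dz/dy|) for z = ln(y) and z = log10(y)
jac_ln = 2 * np.sum(np.log(pwv))
jac_log10 = jac_ln + 2 * n * np.log(np.log(10))

# AICs expressed on the scale of the untransformed response
aic_ln_on_raw = aic_ln + jac_ln
aic_log10_on_raw = aic_log10 + jac_log10

print("raw response:      ", aic_raw)
print("ln, unadjusted:    ", aic_ln)
print("log10, unadjusted: ", aic_log10)
print("ln, adjusted:      ", aic_ln_on_raw)
print("log10, adjusted:   ", aic_log10_on_raw)
```

In this sketch the two adjusted log-model AICs coincide, so the choice of base only affects interpretation, while the unadjusted values differ by a constant that depends only on the number of observations. I am not certain the same adjustment carries over exactly to a REML fit, so treat this as an illustration rather than a recipe for the JMP output.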