Solved – Negative and positive AICc/BIC for two models with transformed data – how to compare

aic, bic, data transformation, mixed model, repeated measures

I am using AICc for model selection with transformed data (a continuous variable). One model uses $\log_{10}(PWV)$ as the response and the other uses $\log(PWV)$, but I am not sure which one to choose when I compare the AICc values.

See below:

Response Log10(PWV)
AICc    BIC
-2745,2 -2699,71

Response Log(PWV)
AICc    BIC
907,8648    953,3541

Should I choose the model whose value is smallest in absolute value, regardless of sign (that is, the $\log(PWV)$ model), or the model with the smallest value (that is, the $\log_{10}(PWV)$ model)?
I also get values of -2 LogLikelihood, and I am not sure whether I can use those for the comparison instead.

Extra information about my very exciting study:

I am using SAS JMP to fit my mixed model. My study aims to validate an apparatus that measures PWV (pulse wave velocity) and to show reproducibility and intra/interobserver variation.
We measured 8 minipigs, with two observers, several times on each pig, and we repeated this 3 times on different days.
So I have an unbalanced data set, with Pigs as a random effect, Observer (2 levels) and examination day (3 levels) as fixed effects, and then a lot of variables.
My data are not normally distributed according to the Shapiro-Wilk test, so I tried transforming with log(PWV), log10(PWV) and root(PWV), and then fitting the models to the transformed data to see whether the residual error is normally distributed.

But I don't really understand how to do this in JMP, so I'm looking for an answer on how to do it. How do I check the residual error for normality in JMP?
Further, what do I use if my residuals aren't normally distributed? Should I use Wilcoxon, and is that fine with repeated measurements such as mine?

This is the test with PWV not transformed.

***Response PWV***

Summary of Fit

RSquare 0,656081
RSquare Adj 0,651699
Root Mean Square Error  0,955595
Mean of Response    6,236935
Observations (or Sum Wgts)  796

AICc    BIC
2256,695    2317,064

Parameter Estimates

Term        Estimate    Std Error   DFDen   t Ratio  Prob>|t|
Intercept   5,9679039   0,665618    2,001   8,97     0,0122
W[26]       -2,485353   1,603072    2,001   -1,55    0,2611
W[27,5]     0,7102739   1,602099    1,996   0,44     0,7009
W[28]       -0,069661   1,602321    1,997   -0,04    0,9693
W[28,5]     1,3252744   1,60304     2,001   0,83     0,4953
W[29]       0,9563412   1,226491    1,995   0,78     0,5173
E[1]        0,5646209   0,061558    783     9,17     <,0001
E[2]        -0,444435   0,055241    783,1   -8,05    <,0001
O[1]        -0,191662   0,039188    783     -4,89    <,0001
E[1]*O[1]   0,2489414   0,060013    783     4,15     <,0001
E[2]*O[1]   -0,25976    0,054522    783,1   -4,76    <,0001


REML Variance Component Estimates

Random Effect   Var Ratio   Var Component   Std Error   95% Lower   95% Upper   Pct of Total
P[W]            3,4790351   3,1769234       3,1893213   -3,074031   9,4278783   77,674
Residual                    0,9131622       0,0461511   0,8290524   1,0108232   22,326
Total                       4,0900856                                           100,000


-2 LogLikelihood = 2230,229647

Fixed Effect Tests

Source  Nparm   DF  DFDen   F Ratio   Prob > F
W       5       5   1,994   0,6658    0,6917
E       2       2   783,1   45,5276   <,0001
O       1       1   783     23,9207   <,0001
E*O     2       2   783,1   12,2427   <,0001

Here is another test with transformed PWV.

***Response Log(PWV)***

Summary of Fit

RSquare 0,693231
RSquare Adj 0,689323
Root Mean Square Error  0,151507
Mean of Response    1,795197
Observations (or Sum Wgts)  796

AICc    BIC
-634,624    -574,255

Parameter Estimates

Term        Estimate    Std Error   DFDen   t Ratio  Prob>|t|
Intercept   1,7446075   0,109662    1,999   15,91    0,0039
W[26]       -0,484959   0,26411     2       -1,84    0,2078
W[27,5]     0,1478977   0,263961    1,995   0,56     0,6318
W[28]       0,01762     0,263995    1,996   0,07     0,9529
W[28,5]     0,2290124   0,264105    2       0,87     0,4773
W[29]       0,1623861   0,202078    1,994   0,80     0,5062
E[1]        0,082577    0,00976     783     8,46     <,0001
E[2]        -0,060043   0,008758    783,1   -6,86    <,0001
O[1]        -0,03304    0,006213    783     -5,32    <,0001
E[1]*O[1]   0,0288572   0,009515    783     3,03     0,0025
E[2]*O[1]   -0,03895    0,008644    783,1   -4,51    <,0001


REML Variance Component Estimates

Random Effect   Var Ratio   Var Component   Std Error   95% Lower   95% Upper   Pct of Total
P[W]            3,7578306   0,0862586       0,086597    -0,083468   0,2559857   78,982
Residual                    0,0229544       0,0011601   0,0208401   0,0254093   21,018
Total                       0,109213                                            100,000

-2 LogLikelihood = -661,0898412


Fixed Effect Tests

Source  Nparm   DF  DFDen   F Ratio   Prob > F
W       5       5   1,996   0,8590    0,6156
E       2       2   783,1   37,1920   <,0001
O       1       1   783     28,2795   <,0001
E*O     2       2   783,1   10,1548   <,0001

Generally the Residual by Predicted plots don't look really nice, but they aren't really bad either. What I want to know is how to see whether the residuals are normally distributed.

Hope to get some help.

Best Answer

(1) The base of the logarithm won't necessarily affect the structure of your model - changing the base is equivalent to using different units of measurement for the response. If the model is invariant to using different units of measurement then only the interpretation of coefficients changes when you transform.
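
To illustrate point (1), here is a minimal Python sketch on simulated data. The covariate `x`, the simulated `pwv` values and the helper `ols_fit` are all hypothetical, and a simple least-squares line stands in for the mixed model. It fits the same line to the natural-log and base-10-log responses: each coefficient is rescaled by $1/\ln(10)$, while $R^2$ is unchanged.

```python
import math
import random


def ols_fit(x, y):
    """Least-squares line y = a + b*x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


random.seed(1)
x = [random.uniform(0, 3) for _ in range(200)]
# Simulated positive response (not the data from the question).
pwv = [math.exp(1.5 + 0.2 * xi + random.gauss(0, 0.15)) for xi in x]

a_ln, b_ln, r2_ln = ols_fit(x, [math.log(v) for v in pwv])
a_10, b_10, r2_10 = ols_fit(x, [math.log10(v) for v in pwv])

scale = 1.0 / math.log(10)  # log10(y) = ln(y) / ln(10)
print("coefficient ratios:", a_10 / a_ln, b_10 / b_ln, "expected:", scale)
print("R^2 (ln, log10):", r2_ln, r2_10)
```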

(2) AICs are not comparable for models with different response variables: unless you can work out what the effect of a transformation will be on the likelihood, it's best to follow the usual rule of thumb and not compare AICs of models fitted to the same data set under different transformations.
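
As a sketch of what working out the effect of the transformation on the likelihood can look like: for a Gaussian model fitted by maximum likelihood to $g(y)$, the log-likelihood on the original scale of $y$ is the log-likelihood of the transformed response plus $\sum_i \ln|g'(y_i)|$. The Python below uses simulated data and a simple regression. Both are assumptions, and the helper `gaussian_minus2ll` is hypothetical. The models in the question were fitted by REML in JMP, where the adjustment is less straightforward. Under these assumptions the raw AICs of the $\ln$ and $\log_{10}$ fits differ by exactly $2n\ln(\ln 10)$, and they agree once both are expressed on the scale of $y$.

```python
import math
import random


def gaussian_minus2ll(x, y):
    """-2 * maximised Gaussian log-likelihood of the line y = a + b*x (ML, not REML)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    # ML estimate of the residual variance (divides by n, not n - p).
    sigma2 = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / n
    return n * (math.log(2 * math.pi * sigma2) + 1.0)


random.seed(1)
n = 200
x = [random.uniform(0, 3) for _ in range(n)]
# Simulated positive response (not the data from the question).
pwv = [math.exp(1.5 + 0.2 * xi + random.gauss(0, 0.15)) for xi in x]

k = 3  # intercept, slope, residual variance
aic_ln = gaussian_minus2ll(x, [math.log(v) for v in pwv]) + 2 * k
aic_10 = gaussian_minus2ll(x, [math.log10(v) for v in pwv]) + 2 * k

# Jacobian terms: g = ln has g'(y) = 1/y; g = log10 has g'(y) = 1/(y ln 10).
sum_ln_y = sum(math.log(v) for v in pwv)
shift = 2 * n * math.log(math.log(10))
aic_ln_on_y = aic_ln + 2 * sum_ln_y
aic_10_on_y = aic_10 + 2 * sum_ln_y + shift

print("raw AIC difference:", aic_ln - aic_10, "expected:", shift)
print("AIC on the scale of y:", aic_ln_on_y, aic_10_on_y)
```

With $g(y) = y$ the correction is zero, so an untransformed fit could be compared in the same way, again assuming ML fits of otherwise identical models.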
