I'm using a very simple data set from an article to further my understanding of GLMs. I entered the data in SAS and ran both PROC REG and PROC GENMOD on it. In PROC GENMOD, I used a log link with a normal distribution; in PROC REG, I used the log of the response variable as the dependent variable.
My question is: why don't the parameter estimates from the two procedures match? My understanding is that PROC REG estimates the parameters by OLS, whereas PROC GENMOD uses maximum likelihood with a Newton-Raphson iteration. But I had thought that when the assumed distribution is normal and the relationship is linear (which, after the log transformation, it is in the GLM, right?), the MLE equals the OLS estimate.
Here are the resulting parameters from the run:
Parameter   REG       GENMOD
A1          4.623     4.579
A2          4.688     4.730
A3          4.654     4.654
B1          (0.735)   (0.741)
B2          (0.487)   (0.436)
And here is my code:
data GLM;
   input Y A1 A2 A3 B1 B2;
   lnY = log(Y);   /* log-transformed response, used by PROC REG */
   datalines;
95 1 0 0 0 0
115 0 1 0 0 0
105 0 0 1 0 0
55 1 0 0 1 0
45 0 1 0 1 0
30 1 0 0 1 1
;

/* GLM on Y: normal distribution, log link, weighted by Y */
proc genmod data=GLM;
   model Y = A1 A2 A3 B1 B2 / dist=normal link=log scale=deviance noint;
   weight Y;
run;

/* Linear regression on log(Y), weighted by Y */
proc reg data=GLM;
   model lnY = A1 A2 A3 B1 B2 / noint;
   weight Y;
run;
Any insight that anyone can contribute is greatly appreciated!
Bonus question – in my data I have 6 equations and 5 variables. Why is an iterative process needed to solve that?
Best Answer
"Weight" functions differently in the two PROCS:
In PROC REG "weight" fits
In GENMOD
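
As a cross-check outside SAS, here is a minimal Python sketch (standard library only) that fits the same six rows under two criteria. It is not a reimplementation of either procedure. It assumes that "weight Y" multiplies each squared residual by Y in both fits, that the REG fit minimizes sum of w*(log(y) - x'b)^2, and that the GENMOD fit minimizes sum of w*(y - exp(x'b))^2. The helper names (solve, wls, log_link_fit) are made up for this illustration, and scale estimates and standard errors are not computed.

```python
import math

# Rows copied from the datalines in the question: (Y, [A1, A2, A3, B1, B2]).
DATA = [
    (95,  [1, 0, 0, 0, 0]),
    (115, [0, 1, 0, 0, 0]),
    (105, [0, 0, 1, 0, 0]),
    (55,  [1, 0, 0, 1, 0]),
    (45,  [0, 1, 0, 1, 0]),
    (30,  [1, 0, 0, 1, 1]),
]
NAMES = ["A1", "A2", "A3", "B1", "B2"]
ys = [float(y) for y, _ in DATA]
xs = [x for _, x in DATA]
ws = ys[:]  # ASSUMPTION: "weight Y" multiplies each squared residual by Y


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def wls(xs, zs, ws):
    """Minimize sum_i w_i * (z_i - x_i'b)^2 through the weighted normal equations."""
    p = len(xs[0])
    xtwx = [[sum(w * x[j] * x[k] for x, w in zip(xs, ws)) for k in range(p)]
            for j in range(p)]
    xtwz = [sum(w * x[j] * z for x, z, w in zip(xs, zs, ws)) for j in range(p)]
    return solve(xtwx, xtwz)


def log_link_fit(xs, ys, ws, start, iterations=25):
    """Minimize sum_i w_i * (y_i - exp(x_i'b))^2 by iteratively reweighted least squares.

    exp(x'b) is nonlinear in b, so each pass linearizes around the current
    estimate and solves a weighted linear problem; this is why iteration is needed.
    """
    beta = start[:]
    for _ in range(iterations):
        eta = [sum(xj * bj for xj, bj in zip(x, beta)) for x in xs]
        mu = [math.exp(e) for e in eta]
        working_y = [e + (y - m) / m for e, y, m in zip(eta, ys, mu)]
        working_w = [w * m * m for w, m in zip(ws, mu)]
        beta = wls(xs, working_y, working_w)
    return beta


# Fit 1: weighted least squares on log(Y); one linear solve, no iteration.
reg_fit = wls(xs, [math.log(y) for y in ys], ws)
# Fit 2: weighted least squares on Y with mean exp(x'b), started from fit 1.
genmod_like_fit = log_link_fit(xs, ys, ws, start=reg_fit)

for name, r, g in zip(NAMES, reg_fit, genmod_like_fit):
    print(f"{name}  {r:8.3f}  {g:8.3f}")
```

If the stated assumptions hold, the first printed column should be close to the REG column in the table above and the second close to the GENMOD column. The first fit models the mean of log(Y) as linear in the predictors, while the second models the log of the mean of Y, so the two criteria generally have different minimizers even with identical weights. A3 agrees in both because only the third row involves A3, so either criterion fits that row exactly at log(105).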