Solved – Calculating weight as 1/(standard error) for weighted regression if standard error = 0

weighted-regression

I'm assisting a colleague with a weighted regression of average length of stay (LOS), measured in days, vs. inpatient admit rate, using a dataset of inpatient records from 30 hospitals. We've calculated the weight for each data point as the inverse of the standard error of patients' LOS at that hospital.

Is there a standard procedure for estimating the inverse standard error weight when the standard error = 0? In several hospitals with low patient volume, the LOS is identical for all patients during our analysis period, so the standard error = 0 and the resulting weight is infinite.

We could drop these data points from the regression (or avoid a weighted regression entirely), but in principle it seems there ought to be an accepted technique for calculating weights in special cases where the variance = 0. I haven't had any luck checking my stats textbooks.


Thank you for the advice, Michael and whuber. Most of the total error sum of squares can be attributed to measurement error in my case when I run a simple unweighted least squares regression (RSS=44.5, ESS=168.9, TSS=213.4).

So if I were to construct my own weighting scheme, it might work like this: at one extreme, if variance = 0, the weight = number of observations in that hospital; at the other extreme, if variance = infinity, the weight = 0.

Perhaps a handy formula could be weight_i = N_i/(N_i^CV_i), where weight_i = weight for hospital i, N_i = # obs for hospital i, and CV_i = Coefficient of Variation of observed LOS for hospital i?
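A minimal sketch of the formula proposed above, in Python with NumPy, to check how it behaves at the extremes. The function name `cv_weight` and the sample LOS values are hypothetical, made up purely for illustration; the sketch also assumes the sample standard deviation (ddof=1) is used for the CV and that mean LOS is positive.

```python
import numpy as np

def cv_weight(los):
    """Weight N / N**CV for one hospital's LOS values (hypothetical helper)."""
    los = np.asarray(los, dtype=float)
    n = los.size
    mean = los.mean()
    # Sample standard deviation; undefined for a single observation, so treat it as 0.
    sd = los.std(ddof=1) if n > 1 else 0.0
    cv = sd / mean  # coefficient of variation; assumes mean LOS > 0
    return n / n ** cv

# Made-up LOS values (days) for two hypothetical hospitals.
print(cv_weight([3, 3, 3, 3]))  # identical stays: CV = 0, so the weight is N
print(cv_weight([1, 2, 9, 4]))  # widely spread stays: weight well below N
```

Note that N_i/(N_i^CV_i) is the same as N_i^(1 - CV_i), so the weight drops below 1 whenever CV_i > 1, and a hospital with a single observation always gets weight 1.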

[Figure: ALOS vs. Admit Rate]

Best Answer

The problem is that you are using the estimated standard error in the denominator of the weight. The population standard deviation is unlikely to be 0 in real situations; the estimate is 0 here only because a few low-volume hospitals happened to have identical LOS for every patient. I do not think there is a standard method that applies this particular weighting scheme in general and switches to some other weight when the estimated standard error is 0. The solution is to use a different weighting scheme. There is no law that says you must take the reciprocal of the standard errors as the weights. Under certain assumptions those would be the optimal weights, but that is not the case here.
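One possible alternative, offered only as an illustration and not as the scheme this answer prescribes: weight each hospital by its number of patients N_i. If the patient-level variance were the same in every hospital (an assumption), N_i would be proportional to the inverse variance of the hospital's mean LOS, and the weights stay finite even where the sample standard error is 0. The sketch below uses NumPy; the function name `wls_fit` and all data values are made up.

```python
import numpy as np

def wls_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b).

    Minimizes sum(w * (y - a - b*x)**2) by scaling each row by sqrt(w).
    """
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

# Made-up illustrative data: admit rate, average LOS (days), patients per hospital.
admit_rate = [0.10, 0.15, 0.20, 0.25, 0.30]
alos = [4.2, 4.0, 3.1, 3.0, 2.4]
n_patients = [12, 250, 4, 180, 90]  # finite weights even if a hospital's SE is 0

intercept, slope = wls_fit(admit_rate, alos, n_patients)
print(intercept, slope)
```

Whether N_i is an appropriate weight depends on how plausible the equal-variance assumption is for the hospitals in question.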
