Solved – What does LS (least square) means refer to

least squaresmean

I have been reading clinical papers and recently come across the term "LS-means", referring to what seems to me as an estimation of some population's mean measure. Obviously, I know what "mean" refers to and I know when one estimates a mean for a population from a sample, one has to put some measure of confidence to it, or a measure of standard error, otherwise it's just a number – this does not seem to be the case with LS-means measure (at least not in the papers I encountered, maybe they just did a sloppy job, I don't have enough knowledge to tell). I also know what "least square" refers to when it comes to regression models or optimization problems.

I have never encountered the combination "LS-mean". I admit that my background in statistics may be lacking since it is not my primary field of occupation. A quick search of online sources doesn't seem to yield a satisfactory explanation of what does this combination of words actually refers to (regression? estimation?) and how does least-squares method (of what I can only assume is optimization) fit with calculating an average measure of a population.

Example of something one may find in clinical literature (paraphrasing):
A clinical trial of a drug was conducted against a matched placebo control group. This trial lasted several weeks. Each week the subjects were tested for symptom severity. The results of the study were presented in a chart "improvement from baseline vs week of treatment". X axis – weeks, Y axis "LS-means change". Change in placebo group was significantly smaller than drug group – hurrah our drug great.

I know that this question is very broad, so to limit the discussion, these are the things I am looking to find out:

(1) Can anyone tell me what "LS-mean" may be referring to in the context of clinical trials (or any experimental work for that matter). It doesn't have to be long and/or exhaustive – a few bullet points with keywords would be great to start my self-education journey.

(2) A good online source or a book for getting up to speed on the topic of "LS-means", whatever it may be referring to.

Thanks.

Best Answer

Consider the model:

$$ y = \beta_0 + \beta_1 \text{treatment} + \beta_2 \text{block} + \beta_3 \text{year} $$

Where $y$ is some outcome of interest, treatment is a treatment factor, block is a blocking factor and year is the year (a factor) where the experiment is repeated over several years.

We would like to recover $E(Y|\text{treatment})$, but it cannot done from this model. Instead we find $E(Y|\text{treatment}, \text{block}, \text{year})$.

However we could average the fitted value from $E(Y|\text{treatment}, \text{block}, \text{year})$, over block and year, and then think of it as $E(Y|\text{treatment})$. If the model is estimated by least squares (OLS in the linear case), this is the LS-mean (of treatment, in this case).

For a reference on implementation (in R) see this pdf it also covers LS-means from the common models.

Related Question