Easiest to explain by way of an example:

Imagine a study finds that people who watched the World Cup final were more likely to suffer a heart attack during the match or in the subsequent 24 hours than those who didn't watch it. Should the government ban football from TV? But men are more likely to watch football than women, and men are also more likely to have a heart attack than women. So the *association* between football-watching and heart attacks might be explained by a *third factor*, such as sex, that affects both. (Sociologists would distinguish here between *gender*, a cultural construct that is associated with football-watching, and *sex*, a biological category that is associated with heart-attack incidence, but the two are clearly very strongly correlated so I'm going to ignore that distinction for simplicity.)

Statisticians, and especially epidemiologists, call such a third factor a *confounder*, and the phenomenon *confounding*. The most obvious way to remove the problem is to look at the association between football-watching and heart-attack incidence in men and women separately, or in the jargon, to *stratify* by sex. If we find that the association (if there still is one) is similar in both sexes, we may then choose to combine the two estimates of the association across the two sexes. The resulting estimate of the association between football-watching and heart-attack incidence is then said to be *adjusted* or *controlled* for sex.
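The arithmetic of stratification can be sketched in a few lines. The counts below are entirely made up for illustration: within each sex, watchers and non-watchers have exactly the same risk, yet pooling the sexes manufactures an apparent association, because (in this hypothetical data) men both watch more and have more heart attacks.

```python
# Hypothetical counts, chosen so there is NO real effect within either sex.
# Each stratum: (attacks among watchers, number of watchers,
#                attacks among non-watchers, number of non-watchers)
strata = {
    "men":   (32, 1600, 8, 400),   # 2.0% risk in both groups
    "women": (2, 400, 8, 1600),    # 0.5% risk in both groups
}

def risk_ratio(a, n1, c, n0):
    """Risk among watchers divided by risk among non-watchers."""
    return (a / n1) / (c / n0)

# Crude (unstratified) association: pool everyone together.
totals = [sum(s[i] for s in strata.values()) for i in range(4)]
crude_rr = risk_ratio(*totals)

# Stratum-specific associations: compare like with like.
stratum_rr = {sex: risk_ratio(*counts) for sex, counts in strata.items()}

print(f"crude risk ratio: {crude_rr:.3f}")   # > 2: watching looks harmful
print(f"per-stratum:      {stratum_rr}")     # 1.0 in each sex: no effect
```

The crude comparison suggests watchers have roughly twice the risk, while each stratum shows none; the whole crude association is carried by the confounder.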

We would probably also wish to control for other factors in the same way. Age is another obvious one (in fact epidemiologists stratify or adjust/control almost every association by age and sex). Socio-economic class is probably another. Others can get trickier, e.g. should we adjust for beer consumption while watching the match? Maybe yes, if we're interested in the effect of the stress of watching the match alone; but maybe no, if we're considering banning broadcasting of World Cup football and that would also reduce beer consumption. Whether a given variable is a confounder or not depends on precisely what question we wish to address, and this can require very careful thought and can get quite tricky and even contentious.

Clearly then, we may wish to adjust/control for several factors, some of which may be measured in several categories (e.g. social class) while others may be continuous (e.g. age). We could deal with the continuous ones by splitting them into (age-)groups, thereby turning them into categorical ones. So say we have 2 sexes, 5 social class groups and 7 age groups. We can now look at the association between football-watching and heart-attack incidence in 2×5×7 = 70 strata. But if our study is fairly small, some of those strata will contain very few people, and we're going to run into problems with this approach. And in practice we may wish to adjust for a dozen or more variables. An alternative way of adjusting/controlling for variables that is particularly useful when there are many of them is provided by *regression analysis* with multiple explanatory (independent) variables, sometimes known as *multivariable* regression analysis. (There are different types of regression models depending on the type of outcome variable: least squares regression, logistic regression, proportional hazards (Cox) regression...). In observational studies, as opposed to experiments, we nearly always want to adjust for many potential confounders, so in practice adjustment/control for confounders is often done by regression analysis, though there are other alternatives too, such as standardization, weighting, propensity score matching...
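A toy simulation (not from the study discussed above; every number is invented) shows what "adjusting by regression" buys you. Here sex drives both watching and risk, and watching has no true effect. For simplicity this uses ordinary least squares on a continuous risk score; a real analysis of a binary outcome would typically use logistic regression instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: sex confounds watching -> risk.
male = rng.integers(0, 2, n).astype(float)               # 1 = male
watch = (rng.random(n) < 0.2 + 0.5 * male).astype(float) # men watch more
# True model: risk depends on sex only, NOT on watching.
risk = 1.0 + 2.0 * male + rng.normal(0.0, 1.0, n)

def ols_coefs(y, *cols):
    """Least-squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

crude = ols_coefs(risk, watch)[1]           # unadjusted watching coefficient
adjusted = ols_coefs(risk, watch, male)[1]  # watching coefficient, sex included

print(f"crude: {crude:.2f}, adjusted for sex: {adjusted:.2f}")
```

The crude coefficient on watching is large (it is really picking up sex), while the coefficient from the multivariable model that includes sex is close to zero, the true effect. Adding further confounders to the model is just adding further columns, which scales far better than 70-way stratification.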

## Best Answer

It helps to view regression as a linear approximation of the true form. Suppose the true relationship is

$$y=f(x_1,...,x_k)$$

with $x_1,...,x_k$ the factors explaining $y$. Then the first-order Taylor approximation of $f$ around zero is:

$$f(x_1,...,x_k)=f(0,...,0)+\sum_{i=1}^{k}\frac{\partial f(0)}{\partial x_i}x_i+\varepsilon,$$

where $\varepsilon$ is the approximation error. Now denote $\alpha_0=f(0,...,0)$ and $\alpha_i=\frac{\partial{f}(0)}{\partial x_i}$ for $i=1,...,k$, and you have a regression:

$$y=\alpha_0+\alpha_1 x_1+...+\alpha_k x_k + \varepsilon$$

So although you do not know the true relationship, if $\varepsilon$ is small you get a good approximation, from which you can still draw useful conclusions.
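This can be checked numerically. Below, a nonlinear $f$ is chosen arbitrarily for illustration (it is not from the answer): near zero, the fitted least-squares coefficients should recover $f(0,0)$ and the partial derivatives $\partial f/\partial x_1 = 1$ and $\partial f/\partial x_2 = -2$ at the origin.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# An arbitrary smooth "true" relationship for demonstration purposes.
def f(x1, x2):
    return np.exp(x1 - 2.0 * x2)   # f(0,0)=1, gradient at 0 is (1, -2)

# Sample the x's close to zero, where the Taylor expansion is taken.
x1 = rng.uniform(-0.1, 0.1, n)
x2 = rng.uniform(-0.1, 0.1, n)
y = f(x1, x2)

# Least-squares fit of y on an intercept, x1 and x2.
X = np.column_stack([np.ones(n), x1, x2])
alpha, *_ = np.linalg.lstsq(X, y, rcond=None)

print(alpha)  # approximately [1, 1, -2], i.e. [f(0,0), df/dx1(0), df/dx2(0)]
```

Further from the origin the higher-order terms absorbed into $\varepsilon$ grow, and the linear fit drifts away from the local derivatives, which is the sense in which the regression is only an approximation.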