Minimum value of RSS: Why does the coefficient of $d^2$ being positive tell us that this value is a minimum?

maxima-minima, optimization, quadratics, statistics, sums-of-squares

I am currently studying the textbook Statistical Inference by Casella and Berger. Chapter 11.3.1 Least Squares: A Mathematical Solution says the following:

For any line $y = c + dx$, the residual sum of squares (RSS) is defined to be
$$\text{RSS} = \sum_{i = 1}^n (y_i - (c + dx_i))^2 .$$
The RSS measures the vertical distance from each data point to the line $c + dx$ and then sums the squares of these distances. (Two such distances are shown in Figure 11.3.1). The least squares estimates of $\alpha$ and $\beta$ are defined to be those values $a$ and $b$ such that the line $a + bx$ minimizes RSS. That is, the least squares estimates, $a$ and $b$, satisfy
$$\min_{c, d} \sum_{i = 1}^n (y_i - (c + dx_i))^2 = \sum_{i = 1}^n (y_i - (a + bx_i))^2.$$
This function of two variables, $c$ and $d$, can be minimized in the following way. For any fixed value of $d$, the value of $c$ that gives the minimum value can be found by writing
$$\sum_{i = 1}^n (y_i - (c + dx_i))^2 = \sum_{i = 1}^n ((y_i - dx_i) - c)^2 .$$
From Theorem 5.2.4, the minimizing value of $c$ is
$$c = \dfrac{1}{n} \sum_{i = 1}^n (y_i - dx_i) = \overline{y} - d \overline{x}.$$
Thus, for a given value of $d$, the minimum value of RSS is
$$\sum_{i = 1}^n ((y_i - dx_i) - (\overline{y} - d \overline{x}))^2 = \sum_{i = 1}^n ((y_i - \overline{y}) - d(x_i - \overline{x}))^2 = S_{yy} - 2dS_{xy} + d^2 S_{xx}.$$
The value of $d$ that gives the overall minimum value of RSS is obtained by setting the derivative of this quadratic function of $d$ equal to $0$. The minimizing value is
$$d = \dfrac{S_{xy}}{S_{xx}}.$$
This value is, indeed, a minimum since the coefficient of $d^2$ is positive.

I am confused by this part:

This value is, indeed, a minimum since the coefficient of $d^2$ is positive.

Why does the coefficient of $d^2$ being positive tell us that this value is a minimum?

Best Answer

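The function of interest is $$f(d) = S_{yy} - 2dS_{xy} + d^2 S_{xx}.$$ This is a quadratic function of $d$ provided $S_{xx} \neq 0$. The coefficient of $d^2$ (here $S_{xx}$) determines whether the function is convex or concave, since the second derivative is $f''(d) = 2S_{xx}$. When $S_{xx} > 0$ the function is convex (think of $g(x) = x^2$), while a negative leading coefficient would make it concave (think of $h(x) = -x^2$). Here $S_{xx} = \sum_{i = 1}^n (x_i - \overline{x})^2$ is a sum of squares, so it is positive unless all the $x_i$ are equal. A stationary point of a convex function is a global minimum, so $d = S_{xy}/S_{xx}$ minimizes $f$.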
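To see this numerically, here is a minimal Python sketch. The data values and the helper names `centered_sums` and `profiled_rss` are made up for illustration and are not from the textbook. Because the stationary point satisfies $d^* S_{xx} = S_{xy}$, expanding the quadratic gives $f(d^* + h) - f(d^*) = h^2 S_{xx}$, which is nonnegative for every step $h$ whenever $S_{xx} > 0$. The script checks that identity for a few steps.

```python
def centered_sums(xs, ys):
    """Return (S_xx, S_xy, S_yy) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xx, s_xy, s_yy


def profiled_rss(d, s_xx, s_xy, s_yy):
    """f(d) = S_yy - 2 d S_xy + d^2 S_xx, the RSS after minimizing over c."""
    return s_yy - 2 * d * s_xy + d ** 2 * s_xx


# Made-up illustrative data (an assumption, not from the source).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

s_xx, s_xy, s_yy = centered_sums(xs, ys)
d_star = s_xy / s_xx  # stationary point of f

# Moving away from d_star by h raises f by h^2 * S_xx.
for h in (-1.0, -0.1, 0.1, 1.0):
    gap = profiled_rss(d_star + h, s_xx, s_xy, s_yy) - profiled_rss(d_star, s_xx, s_xy, s_yy)
    print(f"h = {h:+.1f}: f(d* + h) - f(d*) = {gap:.6f}, h^2 * S_xx = {h ** 2 * s_xx:.6f}")
```

The two printed columns should agree up to floating-point rounding. If $S_{xx}$ were negative, the same identity would make every step lower $f$, and the stationary point would be a maximum instead.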
