Minimum value of RSS: Why does the coefficient of $d^2$ being positive tell us that this value is a minimum?

I am currently studying the textbook Statistical Inference by Casella and Berger. Section 11.3.1, "Least Squares: A Mathematical Solution," says the following:

For any line $y = c + dx$, the residual sum of squares (RSS) is defined to be
$$\text{RSS} = \sum_{i = 1}^n (y_i - (c + dx_i))^2 .$$
The RSS measures the vertical distance from each data point to the line $c + dx$ and then sums the squares of these distances. (Two such distances are shown in Figure 11.3.1). The least squares estimates of $\alpha$ and $\beta$ are defined to be those values $a$ and $b$ such that the line $a + bx$ minimizes RSS. That is, the least squares estimates, $a$ and $b$, satisfy
$$\min_{c, d} \sum_{i = 1}^n (y_i - (c + dx_i))^2 = \sum_{i = 1}^n (y_i - (a + bx_i))^2.$$
This function of two variables, $c$ and $d$, can be minimized in the following way. For any fixed value of $d$, the value of $c$ that gives the minimum value can be found by writing
$$\sum_{i = 1}^n (y_i - (c + dx_i))^2 = \sum_{i = 1}^n ((y_i - dx_i) - c)^2 .$$
From Theorem 5.2.4, the minimizing value of $c$ is
$$c = \dfrac{1}{n} \sum_{i = 1}^n (y_i - dx_i) = \overline{y} - d \overline{x}.$$
Thus, for a given value of $d$, the minimum value of RSS is
$$\sum_{i = 1}^n ((y_i - dx_i) - (\overline{y} - d \overline{x}))^2 = \sum_{i = 1}^n ((y_i - \overline{y}) - d(x_i - \overline{x}))^2 = S_{yy} - 2dS_{xy} + d^2 S_{xx}.$$
The value of $d$ that gives the overall minimum value of RSS is obtained by setting the derivative of this quadratic function of $d$ equal to $0$. The minimizing value is
$$d = \dfrac{S_{xy}}{S_{xx}}.$$
This value is, indeed, a minimum since the coefficient of $d^2$ is positive.

I am confused by this part:

This value is, indeed, a minimum since the coefficient of $d^2$ is positive.

Why does the coefficient of $d^2$ being positive tell us that this value is a minimum?

Best Answer

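The function of interest is $$f(d) = S_{yy} - 2dS_{xy} + d^2 S_{xx}.$$ This is a quadratic function of $d$ whenever $S_{xx} \neq 0$. The sign of the coefficient of $d^2$ (here, $S_{xx}$) determines whether the function is convex or concave: a positive coefficient gives a convex function (think of $g(x) = x^2$), while a negative coefficient gives a concave one (think of $h(x) = -x^2$). Equivalently, the second derivative is $f''(d) = 2S_{xx}$. Since $S_{xx} = \sum_{i = 1}^n (x_i - \overline{x})^2$ is a sum of squares, it cannot be negative, and it is strictly positive unless all the $x_i$ are equal. So $f$ is convex, and a stationary point of a convex function is a global minimum.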
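Another way to see this is to complete the square. Assuming $S_{xx} > 0$, $$f(d) = S_{xx}\left(d - \dfrac{S_{xy}}{S_{xx}}\right)^2 + S_{yy} - \dfrac{S_{xy}^2}{S_{xx}},$$ so moving $d$ away from $S_{xy}/S_{xx}$ by $h$ increases $f$ by exactly $S_{xx} h^2$. The following is a minimal numerical sketch of that identity, not part of the textbook's argument. The data values and the names `f`, `rss`, and `d_star` are made up for illustration.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

x_bar, y_bar = x.mean(), y.mean()
S_xx = np.sum((x - x_bar) ** 2)
S_yy = np.sum((y - y_bar) ** 2)
S_xy = np.sum((x - x_bar) * (y - y_bar))


def f(d):
    """RSS as a function of d, with c already set to y_bar - d * x_bar."""
    return S_yy - 2 * d * S_xy + d ** 2 * S_xx


def rss(c, d):
    """RSS computed directly from its definition."""
    return np.sum((y - (c + d * x)) ** 2)


# Stationary point of the quadratic f.
d_star = S_xy / S_xx

# The reduced quadratic agrees with the direct definition of RSS.
print(np.isclose(f(d_star), rss(y_bar - d_star * x_bar, d_star)))

# Moving away from d_star by h raises f by S_xx * h**2, which is positive.
for h in (-1.0, -0.1, 0.1, 1.0):
    increase = f(d_star + h) - f(d_star)
    print(h, increase, np.isclose(increase, S_xx * h ** 2))
```

Because the increase depends on $h$ only through $S_{xx} h^2$, a positive $S_{xx}$ means every other value of $d$ gives a larger RSS. If $S_{xx}$ were negative, the same calculation would show a maximum instead.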
