[Math] find the curve $y = ax + bx^2$ that best fits the data using method of least squares

curves, least squares, linear algebra

There's a problem in the curve fitting section:

Q) By the method of least squares, find the curve $y = ax + bx^2$ that best fits the following data.

x   1    2    3    4     5
y  1.8  5.1  8.9  14.1  19.8

I tried to solve it using the formulas for fitting a straight line,

$\sum_{i=1}^n y_i = na + b\sum_{i=1}^nx_i$
$\sum_{i=1}^nx_iy_i = a\sum_{i=1}^nx_i+b\sum_{i=1}^nx_i^2$

I got $a = 3.56, b = 4.5$, which seems to be wrong. Which formula should I use to solve this?

Best Answer

You should use

$$\sum_{i=1}^n x_i y_i = a\sum_{i=1}^n x_i^2 +b\sum_{i=1}^nx_i^3$$ $$\sum_{i=1}^n x_i^2 y_i = a\sum_{i=1}^n x_i^3 +b\sum_{i=1}^nx_i^4$$

This is derived by starting with an expression for the Sum of the Squared Errors

$$SSE = \sum_{i=1}^n (y_i - \hat{y}_i)^2$$ $$SSE = \sum_{i=1}^n (y_i - (ax_i+bx_i^2))^2$$

Take partial derivatives with respect to $a$ and $b$ and set each to $0$:

$$\frac{\partial SSE}{\partial a} = \sum_{i=1}^n \left[2(y_i - (ax_i+bx_i^2))(-x_i)\right] \equiv 0$$ $$\frac{\partial SSE}{\partial b} = \sum_{i=1}^n \left[2(y_i - (ax_i+bx_i^2))(-x_i^2)\right] \equiv 0$$

Then with some rearranging and simplification you get $$-\sum_{i=1}^n x_i y_i + a\sum_{i=1}^n x_i^2 + b\sum_{i=1}^n x_i^3 = 0$$ $$-\sum_{i=1}^n x_i^2 y_i + a\sum_{i=1}^n x_i^3 + b\sum_{i=1}^n x_i^4 = 0$$
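As a concrete check, worked by hand from the data in the question, the sums are $\sum x_i^2 = 55$, $\sum x_i^3 = 225$, $\sum x_i^4 = 979$, $\sum x_i y_i = 194.1$ and $\sum x_i^2 y_i = 822.9$, so for this data set the two conditions become

$$55a + 225b = 194.1$$ $$225a + 979b = 822.9$$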

Solving the equations gives you: $$a = \frac{\sum x_i^4 \sum x_i y_i - \sum x_i^3 \sum x_i^2 y_i}{\sum x_i^4 \sum x_i^2 - \sum x_i^3 \sum x_i^3}$$ $$b = \frac{\sum x_i^2 \sum x_i^2 y_i - \sum x_i^3 \sum x_i y_i}{\sum x_i^4 \sum x_i^2 - \sum x_i^3 \sum x_i^3}$$
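If you want to check the arithmetic numerically, here is a minimal Python sketch that evaluates the closed-form expressions for $a$ and $b$ on the data from the question. The function name `fit_ax_bx2` and the variable names are my own choices for illustration, and it uses only the standard library.

```python
# Data from the question
xs = [1, 2, 3, 4, 5]
ys = [1.8, 5.1, 8.9, 14.1, 19.8]


def fit_ax_bx2(xs, ys):
    """Least-squares fit of y = a*x + b*x**2 (no intercept term).

    Evaluates the closed-form solution of the two normal equations.
    """
    s2 = sum(x**2 for x in xs)                      # sum of x_i^2
    s3 = sum(x**3 for x in xs)                      # sum of x_i^3
    s4 = sum(x**4 for x in xs)                      # sum of x_i^4
    sxy = sum(x * y for x, y in zip(xs, ys))        # sum of x_i * y_i
    sx2y = sum(x**2 * y for x, y in zip(xs, ys))    # sum of x_i^2 * y_i

    det = s4 * s2 - s3 * s3                         # common denominator
    if det == 0:
        raise ValueError("normal equations are singular for this data")

    a = (s4 * sxy - s3 * sx2y) / det
    b = (s2 * sx2y - s3 * sxy) / det
    return a, b


a, b = fit_ax_bx2(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

For this data it gives roughly $a \approx 1.51$ and $b \approx 0.49$, i.e. $y \approx 1.51x + 0.49x^2$.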

Hope that helps
