There's a problem in the curve fitting section:
Q) By the method of least squares, find the curve $y = ax + bx^2$ that best fits the following data.
$$\begin{array}{c|ccccc} x & 1 & 2 & 3 & 4 & 5 \\ \hline y & 1.8 & 5.1 & 8.9 & 14.1 & 19.8 \end{array}$$
I tried to solve it using the formulas for fitting a straight line,
$\sum_{i=1}^n y_i = na + b\sum_{i=1}^nx_i$
$\sum_{i=1}^nx_iy_i = a\sum_{i=1}^nx_i+b\sum_{i=1}^nx_i^2$
I got $a = 3.56, b = 4.5$, which seems to be wrong. Which formula should I use to solve this?
Best Answer
You should use
$$\sum_{i=1}^n x_i y_i = a\sum_{i=1}^n x_i^2 +b\sum_{i=1}^nx_i^3$$ $$\sum_{i=1}^n x_i^2 y_i = a\sum_{i=1}^n x_i^3 +b\sum_{i=1}^nx_i^4$$
These equations are derived by starting from an expression for the sum of squared errors (SSE):
$$SSE = \sum_{i=1}^n (y_i - \hat{y}_i)^2$$ $$SSE = \sum_{i=1}^n (y_i - (ax_i+bx_i^2))^2$$
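To make the SSE concrete, here is a minimal Python sketch that evaluates it for the data in the question. The function name `sse` and the rough candidate pair $(1.5, 0.5)$ are my own illustrative choices, not something from the question; the pair $(3.56, 4.5)$ is the one the asker obtained.

```python
# Data from the question
xs = [1, 2, 3, 4, 5]
ys = [1.8, 5.1, 8.9, 14.1, 19.8]

def sse(a, b, xs, ys):
    """Sum of squared errors for the model y = a*x + b*x**2."""
    return sum((y - (a * x + b * x**2)) ** 2 for x, y in zip(xs, ys))

# Values obtained in the question with the straight-line formulas
sse_question = sse(3.56, 4.5, xs, ys)

# A hand-picked candidate (assumption, for illustration only)
sse_candidate = sse(1.5, 0.5, xs, ys)

print("SSE for (3.56, 4.5):", sse_question)
print("SSE for (1.5, 0.5): ", sse_candidate)
```

The much larger SSE for $(3.56, 4.5)$ is a quick sign that those values do not fit this model well.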
Take partial derivatives with respect to $a$ and $b$ and set each to $0$:
$$\frac{\partial SSE}{\partial a} = \sum_{i=1}^n \left[2(y_i - (ax_i+bx_i^2))(-x_i)\right] \equiv 0$$ $$\frac{\partial SSE}{\partial b} = \sum_{i=1}^n \left[2(y_i - (ax_i+bx_i^2))(-x_i^2)\right] \equiv 0$$
Then with some rearranging and simplification you get $$-\sum_{i=1}^n x_i y_i + a\sum_{i=1}^n x_i^2 + b\sum_{i=1}^n x_i^3 = 0$$ $$-\sum_{i=1}^n x_i^2 y_i + a\sum_{i=1}^n x_i^3 + b\sum_{i=1}^n x_i^4 = 0$$
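As a worked instance, assuming I have added up the data in the question correctly, the sums are

$$\sum x_i^2 = 55, \quad \sum x_i^3 = 225, \quad \sum x_i^4 = 979, \quad \sum x_i y_i = 194.1, \quad \sum x_i^2 y_i = 822.9$$

so the two equations become

$$55a + 225b = 194.1$$ $$225a + 979b = 822.9$$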
Solving the equations gives you: $$a = \frac{\sum x_i^4 \sum x_i y_i - \sum x_i^3 \sum x_i^2 y_i}{\sum x_i^4 \sum x_i^2 - \sum x_i^3 \sum x_i^3}$$ $$b = \frac{\sum x_i^2 \sum x_i^2 y_i - \sum x_i^3 \sum x_i y_i}{\sum x_i^4 \sum x_i^2 - \sum x_i^3 \sum x_i^3}$$
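If it helps, here is a short Python sketch that applies these two closed-form expressions to the data in the question. The function name `fit_ax_bx2` and the variable names are my own (hypothetical), and the code assumes the denominator is nonzero.

```python
# Data from the question
xs = [1, 2, 3, 4, 5]
ys = [1.8, 5.1, 8.9, 14.1, 19.8]

def fit_ax_bx2(xs, ys):
    """Least-squares fit of y = a*x + b*x**2 using the closed-form solution."""
    sx2 = sum(x**2 for x in xs)                    # sum of x_i^2
    sx3 = sum(x**3 for x in xs)                    # sum of x_i^3
    sx4 = sum(x**4 for x in xs)                    # sum of x_i^4
    sxy = sum(x * y for x, y in zip(xs, ys))       # sum of x_i * y_i
    sx2y = sum(x**2 * y for x, y in zip(xs, ys))   # sum of x_i^2 * y_i

    # Common denominator of both expressions
    det = sx4 * sx2 - sx3 * sx3
    if det == 0:
        raise ValueError("normal equations are singular for this data")

    a = (sx4 * sxy - sx3 * sx2y) / det
    b = (sx2 * sx2y - sx3 * sxy) / det
    return a, b

a, b = fit_ax_bx2(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")  # roughly a = 1.513, b = 0.493
```

So the fitted curve comes out as approximately $y = 1.513x + 0.493x^2$.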
Hope that helps