The main things going on in this question are multivariate calculus and linear algebra. With a solid understanding of those, most of the math behind machine learning techniques falls into place, give or take the statistical view that some people have on the subject.
I agree the indices aren't great, but the problem is solvable with careful bookkeeping. Take every step slowly, and take small steps. What the problem is asking you to do is differentiate the error function with respect to the weights; the set of equations it asks for is precisely what you find when you set that gradient equal to $0$.
To make that explicit, consider just the partial with respect to $w_1$: $$\begin{aligned}\frac{\partial E}{\partial w_1}&=\frac{\partial}{\partial w_1}\left[\frac12\sum_{n=1}^N(y(x_n,w)-t_n)^2\right]\\&=\frac12\sum_{n=1}^N\frac{\partial}{\partial w_1}(y(x_n,w)-t_n)^2\\&=\frac12\sum_{n=1}^N2(y(x_n,w)-t_n)\frac{\partial y}{\partial w_1}(x_n, w)\\&=\frac12\sum_{n=1}^N2(y(x_n,w)-t_n)x_n^1\\&=\sum_{n=1}^N(y(x_n,w)-t_n)x_n^1\\&=\sum_{n=1}^N\left(\sum_{j=0}^Mw_jx_n^j-t_n\right)x_n^1\end{aligned}$$
With a little cajoling, setting that partial equal to $0$ gives the second coordinate of the system of linear equations the textbook asks for. What happens when you repeat the exercise for an arbitrary $w_a$ instead of just $w_1$?
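(If you want to check your answer: repeating the computation for a general index $a$ and setting each partial to zero yields $$\sum_{j=0}^M\left(\sum_{n=1}^N x_n^{a+j}\right)w_j=\sum_{n=1}^N t_n\,x_n^a,\qquad a=0,1,\dots,M,$$ which matches the system the textbook asks for, in the notation $A_{aj}=\sum_n x_n^{a+j}$ and $T_a=\sum_n t_n x_n^a$.)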
Just because this point is rarely brought up, I want to point out that indices are indeed a large part of the confusion in a problem like this. Especially once you want to simplify things or speed up calculations, all that bookkeeping becomes a nightmare. The same set of equations pops out, though, if we take derivatives with respect to vectors or matrices instead. Define $$X=\begin{pmatrix}x_1^0&\cdots&x_1^M\\\vdots&\ddots&\vdots\\x_N^0&\cdots&x_N^M\end{pmatrix},\qquad t=\begin{pmatrix}t_1\\\vdots\\t_N\end{pmatrix},\qquad w=\begin{pmatrix}w_0\\\vdots\\w_M\end{pmatrix},$$ so that the $n$th row of $X$ holds the powers of $x_n$. Note that your error function can be succinctly expressed as $$E(w)=\frac12(Xw-t)^T(Xw-t).$$
To take the derivative, we can proceed almost as if we were doing standard one-variable calculus, and skipping to the solution, we find $$\frac{\partial E}{\partial w}=X^TXw-X^Tt.$$
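(For completeness, the skipped computation: expand $$E(w)=\frac12\left(w^TX^TXw-2t^TXw+t^Tt\right),$$ then apply the standard identities $\frac{\partial}{\partial w}\left(w^TAw\right)=2Aw$ for symmetric $A$ and $\frac{\partial}{\partial w}\left(b^Tw\right)=b$ with $b=X^Tt$.)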
Setting this equal to $0$, we recover the equations the author asks for:
$$X^TXw=X^Tt.$$ Better yet, in the event that $X$ is invertible (which can unfortunately only happen when $X$ is square, i.e. $N=M+1$; in that case $X$ is a Vandermonde matrix, invertible precisely when the $x_n$ are distinct), this problem simplifies even further to $$Xw=t,$$ a simplification we were almost certain to miss when wading through indices.
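If you want to sanity-check the matrix version numerically, here is a minimal NumPy sketch; the data, degree, and variable names are all made up for illustration:

```python
import numpy as np

# Hypothetical data: N = 10 noisy samples of a sine curve (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=10)
t = np.sin(np.pi * x) + 0.1 * rng.standard_normal(10)

# Design matrix for a degree-M polynomial: row n is (x_n^0, x_n^1, ..., x_n^M).
M = 3
X = np.vander(x, M + 1, increasing=True)

# Solve the normal equations X^T X w = X^T t.
w = np.linalg.solve(X.T @ X, X.T @ t)

# The gradient X^T X w - X^T t should vanish at the solution.
print(np.allclose(X.T @ X @ w, X.T @ t))  # True
```

Solving the normal equations directly is fine for a small example like this; for larger or ill-conditioned problems, `np.linalg.lstsq(X, t)` is the more numerically stable route.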
After some hours of research I've found a few sites which, taken together, answer these questions.
Regarding items 1 and 2, it looks like there is indeed a severe abuse of notation every time the author refers to the function $h$. This function seems to be the so-called self-information, which is usually defined on probability events or random variables. I find this article very clarifying in this respect.
Regarding item 4, from what I have seen, it seems that under certain conditions that the self-information function must satisfy, the logarithm is the only possible choice. The selected answer in this post was particularly useful, as were the comments on the question. This topic is also discussed here, but I prefer the previous link.
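To sketch the standard argument (under the usual assumptions that $h$ depends only on the probability $p$ of the event, is continuous and decreasing, and is additive over independent events): additivity says $h(pq)=h(p)+h(q)$, so writing $g(u)=h(e^{-u})$ turns this into Cauchy's functional equation $g(u+v)=g(u)+g(v)$, whose continuous solutions are $g(u)=ku$. Hence $$h(p)=-k\log p$$ for some constant $k>0$; the base of the logarithm just fixes the units.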
Finally, I have not found an answer to item 3. In fact, I think that step is wrongly formulated, owing to the imprecision in the definition of the function $h$. Nevertheless, the links I provided for item 4 lead to the desired result.
Best Answer
I think you are there. Maybe slightly rewriting your last equation helps:
$$ A_{i1} w_1 + A_{i2} w_2 + \cdots + (A_{ii} + \lambda) w_i + \cdots + A_{iM} w_M = T_i $$
and you have $M$ of these equations.
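Equivalently, collecting these equations in matrix form gives $$(A+\lambda I)\,w=T,$$ which makes clear that the regularizer just shifts the diagonal of $A$ by $\lambda$.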