Solved – Polynomial Regression using Gradient Descent to Approximate a Sine in Python

gradient-descent, polynomial, python, regression

Following the answers to this question: Polynomial regression using gradient descent, I've written code for polynomial regression on this data: http://illumina.ir/ML/data1.xlsx (the target output is simply a sine function).

import pandas 
import numpy as np
import matplotlib.pyplot as plt

def gradientDescent(x, y, theta, alpha, n, numIterations):
    xTrans = x.transpose()
    for i in range(numIterations):
        hypothesis = np.dot(x, theta)       # predictions for the current theta
        loss = hypothesis - y               # residuals
        cost = np.sum(loss ** 2) / (2 * n)  # half mean squared error
        #print("Iteration %d | Cost: %f" % (i, cost))
        gradient = np.dot(xTrans, loss) / n
        theta = theta - alpha * gradient    # step against the gradient
    return theta

ExcelData = pandas.read_excel('data1.xlsx', 'Sheet1')

xs = ExcelData[0]
#xs = xs / np.linalg.norm(xs)  # alternative normalization
xs = xs / xs[99]  # scale inputs by the last sample so they lie in [0, 1]
ys = ExcelData['0.1']

m = 7  # number of polynomial terms (powers 0 through 6, i.e. degree 6)
n = len(ys)  # number of examples
numIterations = 100000  # number of gradient descent iterations
alpha = 0.01  # learning rate

y = np.zeros(shape=n)
x = np.zeros(shape=(n, m))
theta = np.ones(m)  # initial guess: all coefficients set to one

for i in range(n):
    y[i] = ys[i]

# design matrix: column j holds the inputs raised to the power j
for i in range(n):
    for j in range(m):
        x[i, j] = np.power(xs[i], j)

theta = gradientDescent(x, y, theta, alpha, n, numIterations)
print(theta)

plt.clf()
plt.plot(y)
plt.plot(np.dot(x, theta))
plt.savefig('testplot.png')

Here is the output:

[plot omitted: testplot.png]

The blue curve is the expected output and the orange curve is the approximation.

What's the problem with my code?

Best Answer

The gradient descent code looks fine to me (although rather than fixing the number of steps, you should exit when the gradient becomes small).
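A minimal sketch of that stopping rule, assuming a tolerance tol that you would have to tune for your data (the function is otherwise the same as in the question):

def gradientDescent(x, y, theta, alpha, n, maxIterations, tol=1e-8):
    xTrans = x.transpose()
    for i in range(maxIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        gradient = np.dot(xTrans, loss) / n
        if np.linalg.norm(gradient) < tol:  # gradient has (almost) vanished: stop
            break
        theta = theta - alpha * gradient
    return theta

With this change, maxIterations becomes a safety cap rather than the actual stopping criterion.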

The issue is that, within a fixed budget of iterations, gradient descent stalls far from the global minimum (the least-squares cost is convex, so there is no spurious local minimum, but the problem is badly conditioned and convergence is extremely slow). The key points are that

(i) polynomial approximations converge VERY slowly to the sine function. The 7th-order Taylor approximation, for example, is not accurate beyond the first period (see http://www.wolframalpha.com/input/?i=Taylor+sin+x), and the gradient descent method will not be able to find a much better solution with so few terms (it requires a higher-order polynomial);

(ii) the coefficients of the Taylor series centered at 0.5 have very different magnitudes. You should find theta = [-197, +2766, -15877, +47085, -75701, +62590, -20863, +0]. Presumably, the same applies to the coefficients of the global minimum of the least-squares problem*. Starting from a guess of all ones is a very poor initial condition: you are too far away from the actual solution.
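To see the spread in magnitudes without running gradient descent at all, you can solve the same least-squares problem directly. A sketch, assuming the target is one full period of a sine on [0, 1] (a stand-in for the actual data file; the exact values will differ, but the spread is the point):

import numpy as np

xs = np.linspace(0.0, 1.0, 100)
ys = np.sin(2 * np.pi * xs)  # assumed stand-in for the data in data1.xlsx
# polyfit from numpy.polynomial returns coefficients from power 0 upward,
# matching the design matrix built in the question
coeffs = np.polynomial.polynomial.polyfit(xs, ys, deg=6)
print(coeffs)

The printed coefficients differ by orders of magnitude, so a vector of all ones is nowhere near them.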

Hope this helps

*NB: the Taylor series coefficients are NOT the same coefficients that the gradient descent method should find. The Taylor series best approximates the function close to the expansion point, with the error growing away from it; the least-squares fit that underlies the gradient descent method instead tries to minimize the overall error, with no "favorite point" unless you use weighted sums.
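A short numerical check of this footnote, comparing the degree-7 Taylor polynomial of sin(x) at 0 against a degree-7 least-squares fit over one full period (both evaluated on [0, 2*pi]):

import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)
taylor = x - x**3 / 6 + x**5 / 120 - x**7 / 5040  # Taylor polynomial of sin at 0
coeffs = np.polynomial.polynomial.polyfit(x, np.sin(x), deg=7)
fit = np.polynomial.polynomial.polyval(x, coeffs)  # least-squares polynomial
print("max |Taylor - sin|:        ", np.abs(taylor - np.sin(x)).max())
print("max |least squares - sin|: ", np.abs(fit - np.sin(x)).max())

The Taylor polynomial is essentially exact near 0 but off by roughly 30 at the end of the period, while the least-squares fit keeps a small error across the whole interval.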