[Math] Subtraction of slope in gradient descent

calculus, machine learning, optimization

In the gradient descent algorithm, say $f(x)$ (a quadratic function) is the objective function. The algorithm is then defined as

$$x_i = x_i - a\frac{\partial f(x)}{\partial x_i}$$

I just don't quite understand the meaning of the subtraction. I can follow intuitively that we are going in the direction of steepest descent, but I have some questions. The derivative of $f(x)$ is going to give us the equation of a line. So when we substitute the value of $x_i$ into $f'(x)$, what we get is a $y$ coordinate: $y_i$. So I don't understand how we can subtract a $y$ coordinate from an $x$ coordinate.

Best Answer

The direction of $\nabla f$ is the direction of greatest increase of $f$. (This can be shown by writing out the directional derivative of $f$ using the chain rule, and comparing the result with the dot product of the direction vector and the gradient vector.) You want to move in the direction of greatest decrease, so you move along $-\nabla f$, which is why the update subtracts the gradient.
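To see the subtraction at work, here is a minimal one-dimensional sketch. The quadratic $f(x) = (x - 3)^2$, the step size `a = 0.1`, the starting points, and the function names are all assumptions chosen for illustration; none of them come from the question. The value $f'(x)$ is used as a slope rather than as a point on a graph: its sign says which way is uphill, and multiplying it by $a$ turns it into a step in $x$.

```python
def f(x):
    # Hypothetical quadratic objective with its minimum at x = 3.
    return (x - 3.0) ** 2


def grad_f(x):
    # Derivative of f: positive to the right of the minimum, negative to the left.
    return 2.0 * (x - 3.0)


def gradient_descent(x0, a=0.1, steps=100):
    # Repeatedly apply the update x = x - a * f'(x).
    x = x0
    for _ in range(steps):
        x = x - a * grad_f(x)
    return x


# Starting to the right of the minimum: the slope is positive, so x decreases.
x_from_right = gradient_descent(10.0)

# Starting to the left of the minimum: the slope is negative, so x increases.
x_from_left = gradient_descent(-4.0)

print(x_from_right, x_from_left)
```

Both runs end up close to $x = 3$. Subtracting a positive slope moves $x$ left and subtracting a negative slope moves $x$ right, so in either case the step goes downhill.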
