[Math] Derivative of a summation in order to minimize

derivatives, linear algebra, summation

I am asked to minimize $\sum^n_{i=0}(x_i - C)^2$ with respect to $C$ only, so I know I have to take the derivative with respect to $C$, set it equal to 0, and then solve. I have never worked with summations before, so this is very new to me. I have been searching the web for information about how to proceed in these cases, but all I have found is long theorems about summation or explanations of very simple operations with summation notation. I have, nevertheless, found the answer to my question, which is as follows:
$$S = \sum^n_{i=0}(x_i - C)^2$$

$$\frac{\partial S}{\partial C} = \sum^n_{i=0} 2 (x_i - C)(-1) = -2 \sum^n_{i=0} (x_i - C)$$

$$\frac{\partial S}{\partial C} = 0 \implies \sum^n_{i=0} (x_i - C) = 0$$

$$ \sum^n_{i=0} x_i - \sum^n_{i=0} C = 0$$

$$ \sum^n_{i=0} x_i = \sum^n_{i=0} C = nC$$

$$C = \frac{\sum^n_{i=0} x_i}{n}$$

First step, I don't get what this guy is doing. When he takes the derivative, why does he put that (-1) at the end? Is it because this is originally the square of a difference? That's the only thing I can think of.

Second step, where has the (-2) gone? It just vanished.

Third step, I get this one.

Fourth, where is this $nC$ coming from? What does it mean?
I understand that at the end what he is doing is $ \sum^n_{i=0} x_i= nC$, so he isolates $C$ and finally gets the result $C = \frac{\sum^n_{i=0} x_i}{n}$. Is there any reason, though, why he chooses to swap $\sum^n_{i=0} C$ instead of $\sum^n_{i=0} x_i$ for $nC$?

Thanks a lot. I'd really appreciate it if you could refer me to any page you know about summation that doesn't go through 1000 theorems and properties. Unfortunately I can't spend much time on summation, as I have many other topics in my exam that I need to cover and this is just a small part of it. Cheers!

Best Answer

First step

Think of the sum as a function of $C$. To find a minimum or maximum of a function we need to find its derivative and set it to 0. Because there are two terms between the parentheses, we can't just apply the rule $\frac{\partial}{\partial x} x^n = nx^{n-1}$; instead we apply the chain rule. The power rule gives the factor $2$, and the derivative of the inner expression $x_i - C$ with respect to $C$ gives the $(-1)$. Together they make the $-2$.
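As a sketch of that chain rule on a single term of the sum, write $u = x_i - C$ (the name $u$ is introduced here only for illustration):

$$\frac{\partial}{\partial C}(x_i - C)^2 = \frac{\partial}{\partial u}u^2 \cdot \frac{\partial u}{\partial C} = 2u \cdot (-1) = -2(x_i - C)$$

The derivative of a sum is the sum of the derivatives of its terms, so doing this for every $i$ gives the $-2 \sum (x_i - C)$ in the question.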

Second step

Let's write the derivative as $-2S_1$, where $S_1 = \sum^n_{i=0} (x_i - C)$. When is $-2S_1 = 0$? Only when $S_1 = 0$, so we can divide both sides by $-2$ and leave the constant factor out.

Fourth step

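$C$ is a constant that does not depend on the index $i$, so every term of $\sum C$ is just $C$: each step of the sum adds another $C$. If the sum has $n$ terms, we have added $C$ a total of $n$ times, which is $nC$; with three terms, for example, $C + C + C = 3C$. (Strictly speaking, a sum running from $i=0$ to $n$ has $n+1$ terms, so the index should start at $i=1$ for the count to be exactly $n$.) The same shortcut does not work for $\sum x_i$, because the $x_i$ are generally different numbers, so that sum has to stay as it is.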
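If it helps to see the result with numbers, here is a minimal Python sketch that checks the derivation on a small made-up data set. The list `xs` and the helper names `sum_sq` and `dS_dC` are assumptions chosen for illustration, not part of the original problem.

```python
def sum_sq(xs, c):
    """S(C): sum of the squared differences (x_i - C)^2."""
    return sum((x - c) ** 2 for x in xs)


def dS_dC(xs, c):
    """Derivative of S with respect to C: -2 * sum(x_i - C)."""
    return -2 * sum(x - c for x in xs)


# Hypothetical sample data, n = 3 values.
xs = [2.0, 4.0, 9.0]

# Candidate minimizer from the derivation: C = (sum of x_i) / n.
mean = sum(xs) / len(xs)

# The derivative should vanish at the mean, and S should be
# larger for values of C slightly to either side of it.
print("C =", mean)
print("dS/dC at C:", dS_dC(xs, mean))
print("S at C - 0.1, C, C + 0.1:",
      sum_sq(xs, mean - 0.1), sum_sq(xs, mean), sum_sq(xs, mean + 0.1))
```

The value of $C$ that makes the derivative zero is the ordinary average of the $x_i$, and moving $C$ away from it in either direction increases the sum of squares.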