Summation – Is Sigma $\Sigma$ a Mathematical Way of Doing a For Loop?

notation, summation

I've been a programmer for ten years, and once upon a time I was pretty good at math. Those days are long gone. I'm taking some online classes and now I find myself needing to remember the math I learned in college and it has all gone bye-bye.

Specifically, I have a question about $\Sigma$ (Sigma):

In a $\Sigma$ expression, there is an $N$ above the $\Sigma$, and an $i = 1$ (or some other number) below it.
If someone could help my programmer mind – this looks like a for loop in programming. Is that essentially what is going on with everything inside the $()$ of the $\Sigma$? It simply computes the value for each iteration from $i$ up to $N$ and then sums them all up?

Best Answer

Well yeah, sort of. But not really, IMO.

What most programming languages do is awkward, mathematically speaking. An imperative for-loop increments a variable. That shouldn't really be possible: maths doesn't know time. If you define something once then it conceptually holds forever, i.e. if you start out with $i = 0$ it's not possible to later have $i=3$.

What's really going on, mathematically, is that you write an abstraction, an expression for all $i$. Then you apply to that expression the higher-order operation “sum over all such $i$ in some range”. The variable $i$ never actually has any particular value, it's just a placeholder for “consider value here”.

There is a proper mathematical framework that can nicely express computations like a sum. It's called lambda calculus. In this, such an abstraction over some variable is called a lambda expression, written like* $$ \lambda i \mapsto \frac1{2^i} $$ More commonly, this kind of abstraction would be called a function and written $$ f : \mathbb{N}\to\mathbb{Q}, \quad f(i) = \frac{1}{2^i} $$ or, in programming languages, something like

#include <cmath>

double f(int i) {
  return 1.0 / std::pow(2.0, i);
}

Now, what a summation operator does is, it takes such a lambda function and yields a number: the function evaluated at every argument in a range, all summed up. For instance, $$ S = \sum_{i=0}^{5}\frac1{2^i} $$ is in a sense shorthand for something like $$ S = \Sigma(0,5,f). $$ Note that I didn't write $f(i)$ but just $f$: this is not the result of applying $f$ to any particular $i$, but the function object $f$ itself. It is not a number but, well, a function.
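Unfolding this particular sum term by term gives a quick sanity check of what $\Sigma(0,5,f)$ means: $$ S = f(0)+f(1)+\dots+f(5) = 1+\frac12+\frac14+\frac18+\frac1{16}+\frac1{32} = \frac{63}{32} = 1.96875. $$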

It's not necessary to first give $f$ a name and then use it in the sum, I can also just put in the lambda expression: $$ S = \Sigma(0,5,\lambda i\mapsto 1/2^{i}). $$ I understand sums as a special way of writing this expression.

Now, as for how the summation operator can actually be implemented on a computer: in an imperative language, a for-loop would certainly be a reasonable way to do it. Like,

double sumFromTo(int i0, int iend, function<double(int)> f) {
  double result = 0;
  for (int i = i0; i <= iend; ++i) {
    result += f(i);
  }
  return result;
}

But as I said, mathematically it doesn't really make sense to change the value of a variable. However, it is possible to define the equivalent of such a loop directly in lambda calculus: through recursion. This requires some slightly odd definitions to use in the original maths/logic setting, but it can also be written in programming languages:

double sumFromTo_rec(int i, int iend, function<double(int)> f) {
  if (i <= iend)
    return f(i) + sumFromTo_rec(i + 1, iend, f);
  else
    return 0;
}

Now, in the recursive call, i will have the value of i+1 from the outer call. So am I doing mathematically unsound mutation here again? Actually, no! The function argument i is only an abstraction. In the inner call it's simply a different variable, which just happens to also be called i (but that's an irrelevant implementation detail).

You may well think this silly. If the computer can just re-use the same variable i by incrementing it, why should we bother with defining a new one? You'd be right in a way. sumFromTo_rec would in fact be inefficient in a language like C, because the computer would have to allocate an actual new integer on the stack in each function call.

However, quite practically speaking, mutating variables also opens up numerous possibilities for programming bugs. Functional programming languages therefore discourage (or entirely forbid) it. These languages have come up with optimisations like tail calls to avoid the overhead of allocating a new variable in every recursive call (so basically, you write clean mathematical semantics, but under the hood the same memory is re-used, just as it would be in an imperative language).

If you're a mathematically-interested programmer (or a computation-interested mathematician) I recommend you check out Haskell. It's a modern functional programming language with very clean semantics, yet great performance and good compatibility with “real world applications”. The above recursive summation would in Haskell look as simple as

sumFromTo i0 iEnd f
    | i0 <= iEnd  = f i0 + sumFromTo (i0+1) iEnd f
    | otherwise   = 0

That's similar enough to how the rigorous recursive definition of $\sum$ in maths notation would look: $$ \sum_{i=i_0}^{i_{\text{end}}} f(i) = \begin{cases} f(i_0) + \sum_{i=i_0+1}^{i_{\text{end}}}f(i) & \text{if $i_0 \le i_{\text{end}}$} \\ 0 & \text{otherwise} \end{cases} $$ Observe that $f(i)$ as such is never actually evaluated: it's always $f(i_0)$ (however, that variable has a different value at each recursion level). This is better reflected in a programming language than in the maths notation: the former really makes a point of not applying $f$ to anything when you're talking about the function itself, rather than the result of applying that function to any particular argument.


*The classical standard notation in lambda calculus is actually $\lambda i.\ 1/{2^i}$, but I reckon the arrow makes it clearer what's going on than a dot does. Most programming languages that offer lambda functions use an arrow symbol, e.g. in Haskell you'd write

sumFromTo 0 6 (\i -> 1/2^i)

Unlike in maths, it's generally preferred in programming (except in Fortran and Matlab, which I don't consider very well designed) to express the range with the upper bound excluded. This turns out to be the most useful convention in practice; see what Dijkstra famously wrote on the matter (EWD831).