[Math] Clarification about a double summation found in the book “Concrete Mathematics”

Tags: discrete-mathematics, summation

I suspect this question may look too simplistic, or even dumb, compared with the other questions asked on this site.

In Chapter 2, Section 4 (multiple sums) of Concrete Mathematics (Graham, Knuth, Patashnik), the authors seem to have two different ways of solving a double sum, but they switch between the methods from problem to problem, which makes it hard to see what they're doing.

In particular, they start off with a question like this:

\begin{equation}
S = \sum_{1 \le j < k \le n} (a_k - a_j)(b_k - b_j)
\end{equation}

And simply solve it by changing the index variables to get:
\begin{equation}
S = \sum_{1 \le k < j \le n} (a_j - a_k)(b_j - b_k)
\end{equation}

Then they factor a $-1$ out of each of the two factors (the signs cancel), recovering the original summation written differently.

Then they go on to add S to itself using the Iverson identity:
$[1 \le j < k \le n] + [1 \le k < j \le n] = [1 \le j, k \le n] - [1 \le j = k \le n]$

And solve each sum accordingly.
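The Iverson identity above can be checked numerically. Here is a small sketch (my own, not from the book) that verifies the identity pointwise for every pair $(j,k)$ on a small grid:

```python
# Pointwise check of the Iverson-bracket identity
# [1 <= j < k <= n] + [1 <= k < j <= n] == [1 <= j, k <= n] - [1 <= j = k <= n]
# for every pair (j, k) with 1 <= j, k <= n.

def iverson(p):
    """Iverson bracket: 1 if the predicate p is true, else 0."""
    return 1 if p else 0

n = 6
for j in range(1, n + 1):
    for k in range(1, n + 1):
        lhs = iverson(1 <= j < k <= n) + iverson(1 <= k < j <= n)
        rhs = iverson(1 <= j <= n and 1 <= k <= n) - iverson(j == k)
        assert lhs == rhs, (j, k)
print("identity holds for all pairs")
```

The intuition: together the two strict-inequality brackets cover every off-diagonal pair exactly once, which is the full square minus the diagonal.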

However, when faced with a different summation, they rewrite it and sum first over one index variable. My confusion comes from the fact that they switch between these methods of solving double sums without announcing the change.

How would one know whether to solve a double sum by changing the order of summation or by finding a new Iverson identity?

Thanks,

Best Answer

They didn’t just interchange the index variables in that first example: they exploited the fact that doing so does not change the sum. Thus, they were able to write

$$2S=\sum_{1\le j<k\le n}(a_k-a_j)(b_k-b_j)+\sum_{1\le k<j\le n}(a_j-a_k)(b_j-b_k)\;,$$

getting a sum in which every possible term of the form $(a_i-a_\ell)(b_i-b_\ell)$ with $1\le i,\ell\le n$ appears except those in which $i=\ell$.

It may help to think of this in matrix terms. For $1\le j,k\le n$ let $c_{j,k}=(a_k-a_j)(b_k-b_j)$, and let

$$C=\begin{bmatrix}c_{1,1}&\color{red}{c_{1,2}}&\color{red}{\ldots}&\color{red}{c_{1,n-1}}&\color{red}{c_{1,n}}\\ \color{blue}{c_{2,1}}&c_{2,2}&\color{red}{\ldots}&\color{red}{c_{2,n-1}}&\color{red}{c_{2,n}}\\ \color{blue}{\vdots}&\color{blue}{\vdots}&\ddots&\color{red}{\vdots}&\color{red}{\vdots}\\ \color{blue}{c_{n-1,1}}&\color{blue}{c_{n-1,2}}&\color{blue}{\ldots}&c_{n-1,n-1}&\color{red}{c_{n-1,n}}\\ \color{blue}{c_{n,1}}&\color{blue}{c_{n,2}}&\color{blue}{\ldots}&\color{blue}{c_{n,n-1}}&c_{n,n} \end{bmatrix}\;;$$

then $S$ is the sum of the (red) entries above the main diagonal of $C$, and the sum with the indices interchanged is the sum of the (blue) entries below the main diagonal of $C$. In this very special case the matrix $C$ happens to be symmetric, so the two sums are equal. Nicer yet, the entries on the main diagonal are all $0$, so the sum of all the entries in $C$ is $2S$. Thus, $2S$ is just the sum of all possible products of the form $(a_k-a_j)(b_k-b_j)$ with $1\le j,k\le n$, which is very easy to compute after we rewrite $(a_k-a_j)(b_k-b_j)$ as $a_kb_k-a_kb_j-a_jb_k+a_jb_j$.
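The matrix picture is easy to test on random data. The following sketch (variable names mine; indices are 0-based in the code, unlike the book's 1-based notation) builds $C$, checks that it is symmetric with a zero diagonal, and confirms both that its total is $2S$ and that the termwise expansion yields the closed form $2S = 2n\sum_k a_kb_k - 2\bigl(\sum_k a_k\bigr)\bigl(\sum_k b_k\bigr)$:

```python
# Numeric check of the matrix argument: C[j][k] = (a[k]-a[j])*(b[k]-b[j])
# is symmetric with zero diagonal, so summing ALL entries of C gives 2S,
# where S is the sum over the strictly upper-triangular index set.
import random

n = 5
a = [random.randint(-9, 9) for _ in range(n)]
b = [random.randint(-9, 9) for _ in range(n)]

# Direct evaluation of S = sum over j < k of (a_k - a_j)(b_k - b_j).
S = sum((a[k] - a[j]) * (b[k] - b[j])
        for j in range(n) for k in range(j + 1, n))

C = [[(a[k] - a[j]) * (b[k] - b[j]) for k in range(n)] for j in range(n)]

assert all(C[j][k] == C[k][j] for j in range(n) for k in range(n))  # symmetric
assert all(C[j][j] == 0 for j in range(n))                          # zero diagonal
assert sum(sum(row) for row in C) == 2 * S                          # full matrix = 2S

# Expanding (a_k - a_j)(b_k - b_j) = a_k b_k - a_k b_j - a_j b_k + a_j b_j
# over all n^2 pairs gives 2S = 2n*sum(a_k b_k) - 2*(sum a)(sum b).
assert 2 * S == 2 * n * sum(x * y for x, y in zip(a, b)) - 2 * sum(a) * sum(b)
```

Dividing the last identity by $2$ gives $S = n\sum_k a_kb_k - \bigl(\sum_k a_k\bigr)\bigl(\sum_k b_k\bigr)$, which is exactly the "very easy to compute" form the answer alludes to.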

What makes this work is the symmetry of the matrix $C$: this is one of the ‘important special cases’ mentioned in the sentence in the middle of page $36$. Learning to recognize them is to a great extent a matter of experience. In general, though, the first thing to do is see whether you can simply evaluate the summations in order as they’re written; if that doesn’t look promising, the next step is to see whether you can reverse the order of summation and get something nicer. Neither of these approaches looks very attractive in the problem above, so at that point you look for some other idea. Recognizing that a summation over $1\le j<k\le n$ or $1\le j\le k\le n$ is a summation over the upper half (give or take the diagonal) of an $n\times n$ matrix, you can reasonably look to see whether the whole matrix would be easier to work with and whether there’s an exploitable relationship between the upper and lower halves. Here both of those turn out to be the case.
