[Math] Schönhage-Strassen multiplication

algorithmsarithmeticmodular arithmetic

I am trying to implement the Schönhage-Strassen algorithm (SSA) for multiplying large integers, but it only gives the right result if all $\delta_j$ are zero. I'll explain what I mean by this:

SSA (as described in the original 1971 paper Schnelle Multiplikation großer Zahlen) basically works like this:

  1. Let $a$ and $b$ be two integers to multiply.
  2. Determine $n$. Roughly speaking, $n$ will be about $\frac12 \log_2 \log_2 \max(a,b)$
  3. Compute integers $z_j$ such that $a\cdot b=\sum\limits_j z_j\cdot2^{j\cdot2^{n-1}}$, as follows:
    1. Compute $\xi_j\equiv z_j \pmod {2^{2^n}+1}$; this step performs an FFT+iFFT
    2. Compute $\eta_j\equiv z_j \pmod {2^{n+2}}$
    3. Compute $z_j$ from $\xi_j$ and $\eta_j$:
      1. $\delta_j\equiv\eta_j-\xi_j \pmod {2^{n+2}}$ <— THIS IS THE $\delta_j$ MENTIONED ABOVE
      2. $z_j=\xi_j+\delta_j(2^{2^n}+1)$
  4. Output $\sum\limits_j z_j\cdot2^{j\cdot2^{n-1}}$

I tested the other parts of the code, and I even did a a complete example with pencil and paper. Plus, everything works when $\delta=0$, so I'm pretty certain the problem is in step 3.3 or 4.

Wikipedia doesn't go into enough detail to even mention $\xi_j$ and $\eta_j$.

Does anyone know of a working example, sample code, or other information that might help me track down the problem?

Edit: I will put the code online if I get it to work.

Edit 2: I uploaded some Java code here. It runs as a regular Java program; the output shows one working example and one that doesn't work. Computation of $\delta,\xi,\eta,z_r$ happens in lines 148-160.

Also, there is a modified version here which uses much smaller input values, but I head to tweak the code so it handles pieces less than one byte.

Best Answer

I finally found the problem. It was a silly bug at the end of the modFn method where $1$ is added if there is a carry left over, meaning the result is negative and $F_n$ needs to be added. When adding the carry produced another carry, it was being dropped instead of being propagated to the next digit.

The reason why it looked like something to do with $\delta$, $\xi$, etc. was because I had verified FFT, iFFT, and computation of $z_j \mod 2^{n+2}$ worked correctly. The only step left after that is to calculate $z_j$ from $z_j \mod 2^{n+2}$ and $z_j \mod F_n$, but I was forgetting about the modFn between FFT and iFFT which is what caused the problem.