[Math] Understanding Intel’s white paper algorithm for multiplication in $\text{GF}(2^n)$

algorithmscryptographyfinite-fieldspolynomials

I'm reading this Intel white paper on carry-less multiplication. For now, suppose I want to do multiplication in $\text{GF}(2^4)$. We are using the "usual" bitstring representation of polynomials here.

The algorithm for multiplying two numbers is given in Algorithm 3 on page 16. I tried executing the algorithm on paper for small inputs, but I'm getting the wrong answer. Below, I believe $t = 4$ and $s = 8$.

Let
$$A = x^3 + x^2 + x = [1110]$$ and
$$B = x^3 + x + 1 = [1011].$$ In $\text{GF}(2^4)$, the product of $A$ and $B$ is $x^3 = [1000]$, so this is the expected result of the algorithm. We can also compute the carry-less product of $A$ and $B$, which is

$$C = [0110 0010].$$

Our irreducible polynomial is $g = x^4 + x + 1 = [10011]$.

Now let us execute the preprocessing step, and then the three short steps of the algorithm. Note that below, $*$ denotes the carry-less multiplication (which should correspond to multiplication over $\text{GF}(2)$). I use a delimiter $|$ to improve readability of the bitstrings.

Preprocessing: The polynomial $g^*$ is of degree $t-1$ and consists of the $t$ least significant terms of $g$. So we have that $g^* = [0011]$. Then, $q^+$ will be the quotient of $x^{t+s} / g$. We have that the quotient of $x^{12} / g$ is $1 + 2x + x^2 – x^4 – x^5 + x^8$. This modulo $2$ is $1 + x^2 + x^4 + x^5 + x^8$, so we have that $q^+ = [1|0011|0101]$. See computation of $q^+$ on Wolfram Alpha.

Step 1: Compute $c * q^+$, which is $[0110] * q^+ = [0000|0110|1011|1110]$. See step 1 on Wolfram Alpha.

Step 2: The $s$ most significant terms of the polynomial resulting from step 1 are multiplied with $g^*$. So we compute $[0000|0110] * [0011] = [1010]$. See step 2 on Wolfram Alpha.

Step 3: The $t$ least significant bits of the previous polynomial is the result. So the answer is $[1010] = x^3 + x$.

This is incorrect, as the expected answer should be $x^3 = [1000]$. Where is my mistake?

Best Answer

What you have described so far is correct. But notice the discussion that is presented before the algorithm is given, which says "to reduce a 256-bit carry-less product modulo a polynomial g of degree 128, we first split it into two 128-bit halves. The least significant half is simply XOR-ed with the final remainder". In other words, you still need to XOR the lower half of your input with the final remainder.

In your current case, $C = [0110|0010]$, and you essentially only process the upper half of it, that is, your $c = [0110]$. After the third step, you obtain $[1010]$, which still needs to XOR'd with the lower half of $C$. So $[1010] \oplus [0010] = [1000]$, which represents $x^3$, and you are done.

Related Question