Using the extended euclidean algorithm to crack LFSR

cryptographyeuclidean-algorithmpolynomials

Question

I am having trouble cracking an LFSR using the Extended Euclidean Algorithm (EEA).
The problem comes as follow, let say we have the following LFSR :

ENCODED = 01100 00000 10000 00000 01110 00000 11001 00010

We know that the plain text starts with : PLAIN_START = 01111 00000 10101.

How can I use EEA to find the polynomial that generate the LFSR ? The purpose being retrieving all the plain bits from the encoded ones.

NB Some context: Initially ENCODED represents the encoded string "L P N Y B". PLAIN_START represents "O U" the first two letter of the plain text before encoding with LFSR. Also for your information 00000 represents the spaces.

Attempt

What I know is that I should start by finding the length of the key (aka number of register) $n$. Then I should look for the bits until $2n$, then construct a polynomial $M(X)$ using those bits and then apply EEA between $M(X)$ and $X^{2n}$ until the degree of the remainder is less than $n$.

However I can't seem to be able to find $n$. I know it has to do with the plain text otherwise the plain text would be useless. I don't know if that is the general method and the best method to crack an LFSR, but I would appreciate any help.

Best Answer

This is based on my interpretation of your question.

You say, let say we have the following encoded ciphertext:

ENCODED = 01100 00000 10000 00000 01110 00000 11001 00010

We know that the plain text starts with :

PLAIN_START = 01111 00000 10101.

The standard way to do LFSR encryption is by modulo two addition of the keystream (LFSR output) to the plaintext. So adding the known plaintext to the encoded stream gives a candidate keystream (LFSR Output):

CIPHERTEXT = 01100 00000 10000

PLAINTEXT = 01111 00000 10101

KEYSTREAM = 00011 00000 00101

This means that for this problem to be at all solvable the length $n$ of the LFSR must be 15 or less.

Then you can use Berlekamp Massey to solve for the full LFSR output and then add that output to the keystream to get the rest of the plaintext.

I see another twist in the question, however. The keystream we obtained under the standard LFSR cipher mode has 7 zeroes in a row. This means that the LFSR must have length at least 8, since the all zero loading results in an all zero (trivial) output.

So apply Berlekamp Massey and then use the deduced keystream, subtract from the ciphertext and see if you get a meaningful encoding of plaintext under the alphabet you were given.

Related Question