Using Kolmogorov Complexity to Measure Problem Difficulty

Tags: algorithms, kolmogorov-complexity, np, nt.number-theory, reference-request

We call the natural number $n$ a partition number $\iff$
$$
\exists d | n: \gcd\left(d,\frac{n}{d}\right)=1 \text{ and } \Omega(d) = \Omega\left(\frac{n}{d}\right)\;,
$$

where $\Omega$ counts the prime divisors with multiplicities.

Then we have:
$n = p_1^{a_1} \cdots p_r^{a_r}$ is a partition number $\iff$ the exponent multiset $S = (a_1,\ldots,a_r)$ is a yes-instance of the partition problem, which is known to be NP-complete.
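
For example, $n = 840 = 2^3 \cdot 3 \cdot 5 \cdot 7$ has exponent multiset $S = (3,1,1,1)$, which splits into the parts $\{3\}$ and $\{1,1,1\}$ with equal sums; the corresponding witness is $d = 2^3 = 8$ with $n/d = 105 = 3 \cdot 5 \cdot 7$, so $\gcd(8,105) = 1$ and $\Omega(8) = \Omega(105) = 3$.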

I wanted to test the following heuristic idea:

1.) Space $\equiv$ Time (for computations).

2.) Most strings are incompressible, because there are only finitely many shorter strings that could serve as descriptions.

$\rightarrow$ Most problems should not be solvable faster (in time) than some bound determined by their Kolmogorov complexity (= space).

To make this idea more precise I decided to test it for the following three problems:

1.) Determine if a number $n$ is even.
2.) Determine if a number $n$ is prime.
3.) Determine if a number $n$ is a partition number.

For each of these problems I wrote a computer program that outputs a bit: $1$ if the number $n$ is even/prime/a partition number, and $0$ otherwise.

Intuitively, if the problem were easy, or if there were a pattern in these bits, then an algorithm could exploit it to run fast on the problem. To measure how easy these problems are, I suggest using the Kolmogorov complexity, approximated by a zip program. In order of decreasing compressibility we get:

Result:

The bit string for even numbers compresses by nearly 100%.

The bit string for primes compresses by about 59%.

The bit string for partition numbers compresses by about 4%.

(The random bit string compresses by about 0%.)

Here is some SageMath code which does the computation. I zipped the resulting files by hand afterwards and compared the file sizes.

# Count the prime divisors of n with multiplicity (big Omega).
def Omega(n):
    return sum(valuation(n, p) for p in prime_divisors(n))

# Count the distinct prime divisors of n (little omega); not used below.
def omega(n):
    return len(prime_divisors(n))

# n is a partition number iff it has a unitary divisor d with Omega(d) = Omega(n/d).
def checkIfPartitionNumber(x):
    return any(Omega(d) == Omega(x // d) for d in divisors(x) if gcd(d, x // d) == 1)

def isPrime(x):
    return is_prime(x)

# Build the 0/1 string whose x-th bit records whether func(x) holds, for x = 1..n.
def getWordUpTo(n, func=checkIfPartitionNumber):
    return ''.join('1' if func(x) else '0' for x in range(1, n + 1))

# Write the bit string for 1..n to a binary file.
def toFile(n, func=isPrime, fname="is_prime.bin"):
    from bitarray import bitarray
    ba = bitarray(getWordUpTo(n, func))
    with open(fname, "wb") as f:
        ba.tofile(f)

toFile(n=100000, func=lambda x: x % 2 == 0, fname="is_even.bin")
toFile(n=100000, func=is_prime, fname="is_prime.bin")
toFile(n=100000, func=checkIfPartitionNumber, fname="is_partition_number.bin")
toFile(n=100000, func=lambda x: randint(0, 1), fname="random_bits.bin")
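
To avoid the manual zipping step, the size comparison can also be scripted. Here is a minimal sketch, assuming Python's standard zlib module as a stand-in for a zip program and the file names produced above:

import zlib

# Compare the raw size with the DEFLATE-compressed size of each bit file.
for fname in ["is_even.bin", "is_prime.bin",
              "is_partition_number.bin", "random_bits.bin"]:
    with open(fname, "rb") as f:
        raw = f.read()
    compressed = zlib.compress(raw, 9)      # maximum compression level
    saved = 1 - len(compressed) / len(raw)  # fraction of space saved
    print(fname, len(raw), len(compressed), "%.1f%%" % (100 * saved))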

I am interested in whether this idea has been pursued in the literature; if so, a reference would be nice to share. Or maybe the idea has not been pursued because you can think of some problem that violates the hypothesis of this heuristic? (That would also be nice to share.) Or maybe you would like me to test a problem you can think of, to see if it fits the hypothesis of this question?

Update:
What does "Space $\equiv$ Time" mean?

What I try to do is to measure the "search space" of the algorithm under consideration and to check whether the resulting string can be compressed by gzip or zip. For instance, at each time step $t$ a Turing machine is in a configuration $c_t$, which we could encode as a string $s_t$, and we write all those strings $s_1,\cdots,s_T$ to a separate tape. Afterwards this "string of configurations" $s := s_1 \cdot s_2 \cdots s_T$ (concatenated) is compressed with gzip or zip to give a compressed string $s_c$. Intuitively, if $s$ is strongly compressible, $|s_c| \ll |s|$, then (?) the algorithm / the Turing machine could be improved, because it passes through repetitive configurations. If the string $s$ appears random, i.e. $|s_c| \approx |s|$, then intuitively not much can be done to improve the runtime of the algorithm / Turing machine on this input.
Since this is just a heuristic, it is difficult to make it more formal or concrete. I hope this clarifies the idea. (It is slightly different from the implementation described above, but here you can find a SageMath implementation of this idea.)
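
For concreteness, here is a minimal sketch of that trace-compression idea (not the SageMath code referred to above): a trial-division primality test plays the role of the Turing machine, each loop iteration is encoded as a "configuration" string $s_t$ (the encoding is a made-up choice for illustration), and the concatenated trace $s$ is compressed with zlib to obtain $s_c$.

import zlib

# Toy stand-in for a Turing machine: trial division, recording one
# "configuration" (input, current divisor, remainder) per step.
def trace_trial_division(n):
    configs = []
    d = 2
    while d * d <= n:
        configs.append("%d,%d,%d;" % (n, d, n % d))
        if n % d == 0:
            break
        d += 1
    return "".join(configs)

s = trace_trial_division(104729)   # 104729 is prime, so the loop runs up to sqrt(n)
s_c = zlib.compress(s.encode(), 9)
print(len(s), len(s_c))            # |s| vs |s_c|: the trace is highly repetitive,
                                   # so it compresses well

Under the heuristic, the compressibility of the trace would then be read as room for improving the algorithm on this input.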

Best Answer

It is worth mentioning that variants of Kolmogorov complexity have been investigated in the context of the study of pseudorandomness. See On One-Way Functions from NP-Complete Problems by Yanyi Liu and Rafael Pass (arXiv link). It is probably worth mentioning as well that Quanta magazine has written a popular science article on the paper.
