[Math] Subset sum problem is NP-complete

algorithmscomputational complexitycomputer sciencenp-complete

If I know correctly, subset sum problem is NP-complete. Here you have an array of n integers and you are given a target sum t, you have to return the numbers from the array which can sum up to the target (if possible).

But can't this problem be solved in polynomial time by dynamic programming method where we construct a table n X t and take cases like say last number is surely included in output and then the target becomes t- a[n]. In other case, the last number is not included, then the target remains same t but array becomes of size n-1. Hence this way we keep reducing size of problem.

If this approach is correct, isn't the complexity of this n * t, which is polynomial? and so if this belongs to P and also NP-complete (from what I hear) then P=NP.

Surely, I am missing something here. Where is the loophole in this reasoning?

Thanks,

Best Answer

If you express the inputs in unary you get a different running time than if you express them in a higher base (binary, most commonly).

So the question is, for subset sum, what base is appropriate? In computer science we normally default to the following:

  • If the input is a list or collection, we express its size as the number of items
  • If the input is an integer, we express its size as the number of bits (binary digits)

The intuition here is that we want to take the more "compact" representation.

So for subset sum, we have a list of size $n$ and a target integer of value $t$. Therefore it's common to express the input size as $n$ and $t=2^k$ where $k$ is the number of bits needed to express $t$. So the running time is $O(n 2^k)$ which is exponential in $k$.

But one could also say that $t$ is given in unary. Now the size of $t$ is $t$, and the running time is $O(n t)$, which is polynomial in $n$ and $t$.

In reductions involving subset sum (and other related problems like partition, 3-partition, etc) we must use a non-unary representation if we want to use it as an NP-Hard problem to reduce from.