[Math] Are there any interesting examples of random NP-complete problems

computational complexity

Here's an example of the kind of thing I mean. Let's consider a random instance of 3-SAT, where you choose enough clauses for the formula to be almost certainly unsatisfiable, but not too many more than that. So now you have a smallish random formula that is unsatisfiable.

Given that formula, you can then ask, for any subset of its clauses, whether that subset gives you a satisfiable formula. That is a random (because it depends on the original random collection of clauses) problem in NP. It also looks as though it ought to be pretty hard. But proving that it is usually NP-complete also seems to be hard, because you don't have the usual freedom to simulate.
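To make the setup concrete, here is a minimal sketch (all function names are my own) of the two ingredients: sampling a random 3-SAT formula with enough clauses to be almost certainly unsatisfiable (the satisfiability threshold is at roughly 4.27 clauses per variable), and then brute-force checking whether a chosen subset of its clauses is satisfiable. The brute-force check is exponential and only meant for tiny instances.

```python
import itertools
import random

def random_3sat(n_vars, ratio=4.3, seed=0):
    """Sample a random 3-SAT formula: each clause picks 3 distinct
    variables and negates each independently with probability 1/2.
    A clause is a tuple of nonzero ints; -v means "not v"."""
    rng = random.Random(seed)
    n_clauses = int(ratio * n_vars)
    clauses = []
    for _ in range(n_clauses):
        vs = rng.sample(range(1, n_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
    return clauses

def satisfiable(clauses, n_vars):
    """Brute-force satisfiability check (exponential in n_vars)."""
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

n = 10
formula = random_3sat(n)              # at ratio 4.3, very likely unsatisfiable
subset = formula[:len(formula) // 2]  # any subset of clauses is a sub-instance
print(satisfiable(subset, n))
```

The question is then: over the random choice of `formula`, how hard is it to decide `satisfiable(subset, n)` as `subset` ranges over all subsets of its clauses?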

So my question is whether there are any results known that say that some randomized problem is NP-complete. (One can invent silly artificial examples like having a randomized part that has no effect on the problem — hence the word "interesting" in the question.)

Best Answer

Important update (Oct. 6, 2010): I'm pleased to say that I gave the "random 3SAT" problem in the OP to Allan Sly, and he came up with a simple NP-hardness proof. I've posted the proof to my blog with Allan's kind permission.


Sorry that I'm extraordinarily late to this discussion!

Regarding Tim's specific question: if we stick enough clauses in our instance that every 3SAT instance (satisfiable or not) occurs as a subformula with high probability, then certainly the resulting problem will be NP-hard. Indeed, it would suffice to have a large enough set of clauses that we can always find a subformula that can serve as the output of one of the standard reductions. On the other hand, I don't know of any techniques for proving NP-hardness that are tailored to the setting Tim has in mind -- it's a terrific question!

Since I can't answer the question he asked, let me talk for a while about a question he didn't ask (but that I, and apparently some of the commenters, initially thought he did). Namely, what's known about the general issue of whether there are NP-complete problems that one can show are NP-hard on average (under some natural distribution over instances)?

(1) The short answer is that this has been one of the great unsolved problems of theoretical computer science for the last 35 years! If there were an NP-complete problem that you could prove was as hard on average as it is in the worst case, that would be a huge step toward constructing a cryptosystem that was NP-hard to break, which is one of the holy grails of cryptography.

(2) On the other hand, if you're willing to go above NP-complete, we know that certain #P-complete problems (like the Permanent over large finite fields) have worst-case/average-case equivalence. That is to say, it's exactly as hard to compute the permanent of a uniformly random matrix as it is to compute the permanent of an arbitrary matrix, and this can be proven via an explicit (randomized) reduction.

(3) Likewise, if you're willing to go below NP-complete, then there are cryptographic problems, such as shortest lattice vector (mentioned by Rune) and discrete logarithm, that are known to have worst-case/average-case equivalence. Indeed, this property is directly related to why such problems are useful for cryptography in the first place! But alas, it's also related to why they're not believed to be NP-complete. Which brings me to...

(4) We do have some negative results, which suggest that worst-case/average-case equivalence for an NP-complete problem will require very new ideas if it's possible at all. (Harrison was alluding to these results, but he overstated the case a little.) In particular, Feigenbaum and Fortnow showed that, if there's an NP-complete problem that's worst-case/average-case equivalent under randomized, nonadaptive reductions, then the polynomial hierarchy collapses. (Their result was later strengthened by Bogdanov and Trevisan.) There are analogous negative results about basing cryptographic one-way functions on an NP-complete problem: for example, Akavia, Goldreich, Goldwasser, and Moshkovitz (erratum). At present, though, none of these results rule out the possibility of a problem being NP-complete on average under the most general kind of reductions: namely, randomized adaptive reductions (where you can decide what to feed the oracle based on its answers to the previous queries).

(5) Everything I said above implicitly assumed that, when we say we want an average-case NP-complete problem, we mean with a distribution over instances that's efficiently samplable. (For example, 3SAT with randomly generated clauses would satisfy that condition, as would almost anything else that arises naturally.) If you allow any distribution at all, then there's a "cheating" way to get average-case NP-completeness. This is Levin's universal distribution U, where each instance x occurs with probability proportional to $2^{-K(x)}$, where $K(x)$ is the Kolmogorov complexity of x. In particular, for any fixed polynomial-time Turing machine M, the lexicographically first instance on which M fails will have a short description (I just gave it), and will therefore occur in U with large probability!

(6) If you're willing to fix a polynomial-time algorithm M, then there's a beautiful result of Gutfreund, Shaltiel, and Ta-Shma that gives an efficiently-samplable distribution over NP-complete problem instances that's hard for M, assuming $NP \nsubseteq BPP$. The basic idea here is simple and surprising: you feed M its own code as input, and ask it to find you an instance on which it itself fails! If it succeeds, then M itself acts as your sampler of hard instances, while if it fails, then the instance you just gave it was the hard instance you wanted!

(7) Finally, what about "natural" distributions, like the uniform distribution over all 3SAT instances with n variables and m ~ 4.3n clauses? For those, alas, we generally don't have any formal hardness results.
