[Math] Satisfiability of general Boolean formulas with at most two occurrences per variable

algorithmscomputational complexitycomputer sciencelo.logicnp

(If you know basics in theoretical computer science, you may skip immediately to the dark box below. I thought I would try to explain my question very carefully, to maximize the number of people that understand it.)

We say that a Boolean formula is a propositional formula over some 0-1 variables $x_1,\ldots,x_n$ involving AND and OR connectives, where the atoms are literals which are instances of either $x_i$ or $\neg x_i$ for some $i$. That is, a Boolean formula is a "monotone" formula over the $2n$ atoms $x_1,\ldots, x_n, \neg x_1,\ldots, \neg x_n$. For example, $$((\neg x_1 ~OR~ x_2) ~AND~ (\neg x_2 ~OR~ x_1)) ~OR~ x_3$$ is a Boolean formula expressing that either $x_1$ and $x_2$ take on the same value, or $x_3$ must be $1$. We say that a 0-1 assignment to the variables of a formula is satisfying if the formula evaluates to $1$ on the assignment.

The Cook-Levin theorem says that the general satisfiability of Boolean formulas problem is $NP$-complete: given an arbitrary formula, it is $NP$-hard to find a satisfying assignment for it. In fact, even satisfying Boolean formulas where each variable $x_i$ appears at most three times in the formula is $NP$-complete. (Here is a reduction from the general case to this case: Suppose a variable $x$ appears $k > 3$ times. Replace each of its occurrences with fresh new variables $x^1, x^2, \ldots, x^k$, and constrain these $k$ variables to all take on the same value, by ANDing the formula with $$(\neg x^1 ~OR~ x^2) ~AND~ (\neg x^2 ~OR~ x^3) ~AND~ \cdots ~AND~ (\neg x^{k-1} ~OR~ x^k) ~AND~ (\neg x^k ~OR~ x^1).$$ The total number of occurrences of each variable $x^j$ is now three.) On the other hand, if each variable appears only once in the formula, then the satisfiability algorithm is very easy: since we have a monotone formula in $x_1,\ldots, x_n, \neg x_1,\ldots, \neg x_n$, we set $x_i$ to $0$ if $\neg x_i$ appears, otherwise we set $x_i$ to $1$. If this assignment does not get the formula to output $1$, then no assignment will.

My question is, suppose every variable appears at most twice in a general Boolean formula. Is the satisfiability problem for this class of formulas $NP$-complete, or solvable in polynomial time?

EDIT: To clarify further, here is an example instance of the problem:
$$((x_1 ~AND~ x_3) ~OR~ (x_2 ~AND~ x_4 ~AND~ x_5)) ~AND~ (\neg x_1 ~OR~ \neg x_4) ~AND~ (\neg x_2 ~OR~ (\neg x_3 ~AND~ \neg x_5))$$

Note that when we restrict the class of formulas further to conjunctive normal form (i.e. a depth-2 circuit, an AND of ORs of literals) then this "at most twice" problem is known to be solvable in polynomial time. In fact, applying the "resolution rule" repeatedly will work. But it is not clear (at least, not to me) how to extend resolution for the class of general formulas to get a polytime algorithm. Note when we reduce a formula to conjunctive normal form in the usual way, this reduction introduces variables with three occurrences. So it seems plausible that perhaps one might be able to encode an $NP$-complete problem in the additional structure provided by a formula, even one with only two occurrences per variable.

My guess is that the problem is polynomial time solvable. I'm very surprised that I could not find a reference to this problem in the literature. Perhaps I'm just not looking in the right places.

UPDATE: Please think about the problem before looking below. The answer is surprisingly simple.

Best Answer

A theorem in a paper of Peter Heusch, "The Complexity of the Falsifiability Problem for Pure Implicational Formulas" (MFCS'95), seems to suggest the problem is NP-hard. I repeat the first part of its proof here:

By reduction from the restricted version of 3SAT where every variable occurs at most 3 times. Given such a CNF, WOLOG every variable with 3 occurrences occurs once positively and twice negatively. If $x$ is such a variable, let $C_1$, $C_2$ be the clauses such that $C_1 = \neg x \vee C_1'$ and $C_2 = \neg x \vee C_2'$. Introduce new variables $x'$ and $x''$ and replace $C_1$ and $C_2$ by $\neg x \vee (x' \wedge x'')$, $\neg x' \vee C_1'$, $\neg x'' \vee C_2'$. Repeat.

Related Question