Clarification on the SAT problem

boolean-algebracomputational complexity

SAT is supposed to be in NP, which means that it can be verified in polynomial time in the number of inputs. I had some confusions regarding this.

Suppose we have $n$ Boolean variables $x_1, x_2 \cdots ,x_n$. The Boolean expression we will consider only uses one of these variables, say $x_1$, and it will be $\bigwedge_{j=1}^{2^n} x_1$. It is obvious that this is satisfiable, with $x_1 = \text{true}$ and the other variables arbitrary.

However, in order to verify that this assignment works, the algorithm would have to parse and simplify $\overbrace{\text{true} \land \text{true} \land \cdots \land \text{true}}^{2^n \ \text{times}}$. How can this be done, even in principle, in $o(2^n)$ time? I'm assuming the verifier would simplify $\text{true} \land \text{true}$ to $\text{true}$ at each step, which would still take $O(2^n)$ steps (this would be the naive approach, in any case). You might argue that the algorithm would "know" that $\bigwedge_{j=1}^{2^n} x_1$ can just be simplified to $x_1$. How would it know this, though? Perhaps for this specific example, there some tricks we can use (e.g., if there is only one variable and no negations in the entire expression, the algorithm simplifies the Boolean expression to just the single variable). However, this specific example is, well, really simple, and it was just to illustrate the more general point. How can we design a general verifier which can simplify Boolean expressions of potentially $\Omega(2^n)$ length to $O(n^k)$ (for some $k \in \mathbb{N}$) length, with this simplification done in polynomial time? Note that $n$ here is, again, the number of inputs $x_i$.

The specific issue here is that Boolean expressions in $n$ variables might be arbitrarily long, if we are allowed to repeat variables, which we certainly are in SAT. In general, it's unclear to me how a SAT verifier can verify that a particular truth assignment works in time polynomial in $n$ when the length of the Boolean expression might be exponential (or even greater) in $n$.

Further, another general confusion I seem to be having with this is that it's rarely specified what exact data structures are used in the inputs. SAT is usually discussed in a more theoretical contexts, so "concrete" details like data structures aren't mentioned. Would the Boolean expression simply be represented as a string? I suppose one issue is that switching mentally between "real world" algorithm analysis (binary trees, sorting algorithms, etc.) and more theoretical stuff (NP-completeness, SAT etc.) is a bit difficult, at least for me.

Best Answer

The relevant measure on the input is not the number of variables, but its size when represented as an input string of the Turing Machine (or whatever model of computation you are using). (The abstract definition of computational complexity involves no notion of what the input "means".)

Related Question