Definition of $coNP$

computational complexity

I have this question about the definition of $coNP$ in our lecture that I simply cannot wrap my head around. We define
$$ coNP := \{L \subset \{0,1\}^* \mid \bar{L} \in NP\},$$
and then later on we give as an example $\bar{SAT} = \{\varphi \mid \varphi \text{ is unsatisfiable}\}$. Now this makes no sense to me:

Firstly if we look at $\{0,1\}^*$ then couldn't the same binary string mean two completely different things depending on the semantics we choose to encode the instances? For examples $01$ could encode an empty graph or a graph with a single vertex or whatsoever). Also couldn't many binary strings lead to the same instance depending on semantics? for example by saying $0 = 00$.

Secondly, specifically for $\bar{SAT}$, why is it exactly the unsatisfiable formulas given the definition? Shouldn't the complement of $SAT$ when seen as a subset of $\{0,1\}^*$ contain a lot of garbage that could not even be interpreted as formulas?

Somehow it seems that the intuition for $coNP$ as used for $\bar{SAT}$ just does not match its definition, which should be crucial for a better understanding of the class?

Maybe I'm getting some things wrong here so I'm glad if you could lead me out of this confusion.

Best Answer

For you first point, in the definition of $\mathrm{co}\mathrm{NP}$, $L$ is just a set of strings of $0$s and $1$s - it doesn't matter what a string "encodes".

For your second point, you are right that $\overline{\mathrm SAT}$ viewed as a set of sequences of symbols contains a lot of syntactic junk. However, it takes only polynomial time to parse the sequences and identify the junk, so it doesn't make any difference to questions about whether the original problem is in $P$ or $NP$.

I am afraid I don't understand your last point about intuitions about $\mathrm{co}\mathrm{NP}$ and $\mathrm{SAT}$.