Is this proof of Barendregt’s Substitution Lemma (lambda calculus) correct

computer scienceinductionlambda-calculusproof-verification

So I have attempted to prove the lemma mentioned in the title, but I am not sure if it is correct. In the book I am reading, the Lemma was given without proof. We were given the following definitions:

Definition The set $\Lambda$ of all $\lambda$-terms

  1. If $u \in V$, then $u \in \Lambda$

  2. If $M$ and $N \in \Lambda$, then $(MN) \in \Lambda$

  3. If $u \in V$ and $M \in \Lambda$, then $(\lambda u . M) \in \Lambda$

Alternatively, via abstract syntax

$$\Lambda = V \mid (\Lambda \Lambda) \mid (\lambda V . \Lambda)$$

and

Definition Substitution into $\lambda$-terms

  1. $x[x := N] \equiv N$

  2. $y[x := N] \equiv y$ if $x \not\equiv y$

  3. $(PQ)[x := N] \equiv (P[x := N])(Q[x := N])$

  4. $(\lambda y . P)[x := N] \equiv \lambda z . (P^{y \to z}[x := N])$
    if $\lambda z . P^{y \to z}$ is an $\alpha$-variant of
    $\lambda y . P$ such that $z \not\in FV(N)$

Here is my attempted proof:

Lemma Let $x \not\equiv y$ and assume $x \not\in FV(L).$ Then: $$M[x:=N][y:=L] \equiv M[y := L][x := N[y := L]]$$

Proof. $M$ is a $\lambda$-term by definition. There are only three possible cases:

  1. $M$ is a variable $z$.

    1. If $z \equiv x$ then \begin{aligned}
      M[x := N][y := L] &=\\\\
      (x[x := N])[y := L] &=\\\\
      N[y := L]
      \end{aligned}
      and \begin{aligned}
      M[y := L][x := N[y := L]] &=\\\\
      (x[y := L])[x := N[y := L]] &=\\\\
      x[x := N[y := L]] &=\\\\
      N[y := L]
      \end{aligned}

    2. if $z \equiv y$ then \begin{aligned}
      M[x := N][y := L] &=\\\\
      (y[x := N])[y := L] &=\\\\
      y[y := L] &=\\\\
      L
      \end{aligned}
      and \begin{aligned}
      M[y := L][x := N[y := L]] &=&\\\\
      (y[y := L])[x := N[y := L]] &=& \\\\
      L[x := N[y := L]] &=& \\\\
      L &&\text{(since, per the Lemma, $x \not\in FV(L)$}
      \end{aligned}

    3. if $z \not\equiv x$ and $z \not\equiv y$ then trivially, both
      sides of the equation substitute to $z$.

  2. $M \equiv (\lambda y . P)$ for some $P \in \Lambda$ In this case
    \begin{aligned}
    M[x := N][y := L] &=&\\\\
    (\lambda y . P)[x := N][y := L] &=&\\\\
    (\lambda u . P^{y \to u} [x := N])[y := L] &=&\\\\
    (\lambda u . P^{y \to u} [x := N][y := L]) &,& u \not\in FV(N) \cup FV(L)
    \end{aligned}
    and similarly \begin{aligned}
    M[y := L][x := N[y := L]] &=&\\\\
    (\lambda y . P)[y := L][x := N[y := L]] &=&\\\\
    (\lambda u . P^{y \to u} [y := L][x := N[y := L]]) &,& u \not\in FV(N) \cup FV(L)
    \end{aligned}

    Therefore, if
    $$P^{y \to u} [x := N][y := L] \equiv P^{y \to u} [y := L][x := N[y := L]]$$
    then
    $$(\lambda y . P)[x := N][y := L] \equiv (\lambda y . P) [y := L][x := N[y := L]]$$

  3. $M \equiv P Q$ for some $P, Q \in \Lambda$ In this case
    $$M[x := N][y := L] = (P[x := N][y := L])(Q[x := N][y := L])$$ and
    $$\begin{gathered}
    M[y := L][x := N[y := L]] =\\\\ (P[y := L][x := N[y := L]])(Q[y := L][x := N[y := L]])
    \end{gathered}$$

    Therefore if $$P[x := N][y := L] = P[y := L][x := N[y := L]]$$ and
    $$Q[x := N][y := L] = Q[y := L][x := N[y := L]]$$ then
    $$PQ[x := N][y := L] = PQ[y := L][x := N[y := L]]$$

Using case $1$ as the base case, we've proved the lemma by induction
using $2$ and $3$

Something doesn't feel right about this. I feel as though I didn't link the first case with the second two cases strongly enough to be able to invoke induction, or did I? I would also appreciate any advice on how to make the proof better.

Best Answer

Your proof is fine. Maybe it's a little too verbose, but it's just a matter of style, not of correctness.

The point is that the proof is by structural induction on the term $M$ (in general, this is written at the beginning of the proof, not at the end).

The base case is when $M$ is a variable, and for the base case you don't need the induction hypothesis, as usual.

In the application ($M \equiv PQ$) and abstraction ($M \equiv \lambda x.P$) cases, you need the induction hypothesis (that is, $P [x := N] [y := L] = P[y := L][x := N[y := L]]$ and, for the application case, $Q [x := N] [y := L] = Q[y := L][x := N[y := L]]$ too) to conclude.

The link between the variable case and the abstraction or application case is implicit (as usual in a proof by structural induction), you don't have to write anything else. Where is this implicit link? Think of the "simplest" application. It is an application of the form $M \equiv z_1z_2$ (possibly $z_1 \equiv z_2$). So, according to your (correct) proof, in the application case the induction hypothesis says that $z_1 [x := N] [y := L] = z_1[y := L][x := N[y := L]]$ and $z_2 [x := N] [y := L] = z_2 [y := L][x := N[y := L]]$. We know from the base case that actually the induction hypothesis is true, hence you know that $(z_1z_2) [x := N] [y := L] = (z_1z_2)[y := L][x := N[y := L]]$ is true, and now you can go on with more complex terms (an analogous argument holds for the "simplest" abstraction, $M = \lambda z_1.z_2$).

Related Question