Is this proof of Barendregt’s Substitution Lemma (lambda calculus) correct

So I have attempted to prove the lemma mentioned in the title, but I am not sure if it is correct. In the book I am reading, the Lemma was given without proof. We were given the following definitions:

Definition The set $\Lambda$ of all $\lambda$-terms

If $u \in V$, then $u \in \Lambda$

If $M$ and $N \in \Lambda$, then $(MN) \in \Lambda$

If $u \in V$ and $M \in \Lambda$, then $(\lambda u . M) \in \Lambda$

Alternatively, via abstract syntax

$$\Lambda = V \mid (\Lambda \Lambda) \mid (\lambda V . \Lambda)$$

and

Definition Substitution into $\lambda$-terms

$x[x := N] \equiv N$

$y[x := N] \equiv y$ if $x \not\equiv y$

$(PQ)[x := N] \equiv (P[x := N])(Q[x := N])$

$(\lambda y . P)[x := N] \equiv \lambda z . (P^{y \to z}[x := N])$
if $\lambda z . P^{y \to z}$ is an $\alpha$-variant of
$\lambda y . P$ such that $z \not\in FV(N)$

Here is my attempted proof:

Lemma Let $x \not\equiv y$ and assume $x \not\in FV(L).$ Then: $$M[x:=N][y:=L] \equiv M[y := L][x := N[y := L]]$$

Proof. $M$ is a $\lambda$-term by definition. There are only three possible cases:

$M$ is a variable $z$.

If $z \equiv x$ then \begin{aligned}
M[x := N][y := L] &=\\\\
(x[x := N])[y := L] &=\\\\
N[y := L]
\end{aligned} and \begin{aligned}
M[y := L][x := N[y := L]] &=\\\\
(x[y := L])[x := N[y := L]] &=\\\\
x[x := N[y := L]] &=\\\\
N[y := L]
\end{aligned}

if $z \equiv y$ then \begin{aligned}
M[x := N][y := L] &=\\\\
(y[x := N])[y := L] &=\\\\
y[y := L] &=\\\\
L
\end{aligned} and \begin{aligned}
M[y := L][x := N[y := L]] &=&\\\\
(y[y := L])[x := N[y := L]] &=& \\\\
L[x := N[y := L]] &=& \\\\
L &&\text{(since, per the Lemma, $x \not\in FV(L)$}
\end{aligned}

if $z \not\equiv x$ and $z \not\equiv y$ then trivially, both
sides of the equation substitute to $z$.

$M \equiv (\lambda y . P)$ for some $P \in \Lambda$ In this case
\begin{aligned}
M[x := N][y := L] &=&\\\\
(\lambda y . P)[x := N][y := L] &=&\\\\
(\lambda u . P^{y \to u} [x := N])[y := L] &=&\\\\
(\lambda u . P^{y \to u} [x := N][y := L]) &,& u \not\in FV(N) \cup FV(L)
\end{aligned} and similarly \begin{aligned}
M[y := L][x := N[y := L]] &=&\\\\
(\lambda y . P)[y := L][x := N[y := L]] &=&\\\\
(\lambda u . P^{y \to u} [y := L][x := N[y := L]]) &,& u \not\in FV(N) \cup FV(L)
\end{aligned}

Therefore, if
$$P^{y \to u} [x := N][y := L] \equiv P^{y \to u} [y := L][x := N[y := L]]$$
then
$$(\lambda y . P)[x := N][y := L] \equiv (\lambda y . P) [y := L][x := N[y := L]]$$

$M \equiv P Q$ for some $P, Q \in \Lambda$ In this case
$$M[x := N][y := L] = (P[x := N][y := L])(Q[x := N][y := L])$$ and
$$\begin{gathered}
M[y := L][x := N[y := L]] =\\\\ (P[y := L][x := N[y := L]])(Q[y := L][x := N[y := L]])
\end{gathered}$$

Therefore if $$P[x := N][y := L] = P[y := L][x := N[y := L]]$$ and
$$Q[x := N][y := L] = Q[y := L][x := N[y := L]]$$ then
$$PQ[x := N][y := L] = PQ[y := L][x := N[y := L]]$$

Using case $1$ as the base case, we've proved the lemma by induction
using $2$ and $3$

Something doesn't feel right about this. I feel as though I didn't link the first case with the second two cases strongly enough to be able to invoke induction, or did I? I would also appreciate any advice on how to make the proof better.

Best Answer

Your proof is fine. Maybe it's a little too verbose, but it's just a matter of style, not of correctness.

The point is that the proof is by structural induction on the term $M$ (in general, this is written at the beginning of the proof, not at the end).

The base case is when $M$ is a variable, and for the base case you don't need the induction hypothesis, as usual.

In the application ($M \equiv PQ$) and abstraction ($M \equiv \lambda x.P$) cases, you need the induction hypothesis (that is, $P [x := N] [y := L] = P[y := L][x := N[y := L]]$ and, for the application case, $Q [x := N] [y := L] = Q[y := L][x := N[y := L]]$ too) to conclude.

The link between the variable case and the abstraction or application case is implicit (as usual in a proof by structural induction), you don't have to write anything else. Where is this implicit link? Think of the "simplest" application. It is an application of the form $M \equiv z_1z_2$ (possibly $z_1 \equiv z_2$). So, according to your (correct) proof, in the application case the induction hypothesis says that $z_1 [x := N] [y := L] = z_1[y := L][x := N[y := L]]$ and $z_2 [x := N] [y := L] = z_2 [y := L][x := N[y := L]]$. We know from the base case that actually the induction hypothesis is true, hence you know that $(z_1z_2) [x := N] [y := L] = (z_1z_2)[y := L][x := N[y := L]]$ is true, and now you can go on with more complex terms (an analogous argument holds for the "simplest" abstraction, $M = \lambda z_1.z_2$).

Best Answer

Related Solutions

[Math] Barendregt’s Substitution Lemma (lambda calculus)

Structure of proof by induction in lambda calculus

Related Question