Since the augmented filtration is right-continuous, we may assume that $(M_t)_{t \geq 0}$ has càdlàg sample paths. Since $(M_t)_{t \geq 0}$ is a local martingale, there is a sequence of stopping times $(\tau_k)$ such that $\tau_k \uparrow \infty$ and $(M_{t \wedge \tau_k})_{t \geq 0}$ is a martingale. Set
$$Y := M_{T \wedge \tau_k}$$
for fixed $k \in \mathbb{N}$ and $T>0$. Since $Y \in L^1$ and $L^2$ is dense in $L^1$, we can find a sequence $(Y_n)_{n \in \mathbb{N}} \subseteq L^2(\mathcal{F}_T)$ such that $Y_n \to Y$ in $L^1$. Obviously,
$$M_t^n := \mathbb{E}(Y_n \mid \mathcal{F}_t), \qquad t \leq T,$$
is an $L^2$-bounded $\mathcal{F}_t$-martingale. By the martingale representation theorem for $L^2$-martingales, there exists $\phi_n \in L^2(\lambda_T \otimes \mathbb{P})$ such that
$$M_t^n = \mathbb{E}(Y_n) + \int_0^t \phi_n(s) \, dW_s, \qquad t \leq T.$$
In particular, $(M_t^n)_{t \leq T}$ has continuous sample paths. By the maximal inequality,
$$\mathbb{P} \left( \sup_{t \leq T} |M_{t \wedge \tau_k}-M_t^n| > \epsilon \right) \leq \epsilon^{-1} \mathbb{E}|Y-Y_n| \to 0,$$
i.e. $\sup_{t \leq T} |M_{t \wedge \tau_k}-M_t^n|$ converges in probability to $0$ as $n \to \infty$. Extracting a subsequence along which this supremum converges to $0$ almost surely, we see that $(M_{t \wedge \tau_k})_{t \leq T}$ is almost surely a uniform limit of continuous functions and therefore has a.s. continuous sample paths. Since both $k$ and $T$ are arbitrary and $\tau_k \uparrow \infty$, we find that $(M_t)_{t \geq 0}$ has a.s. continuous sample paths. Now the claim follows from the argument described in the question.
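The maximal inequality and the subsequence step above can be spelled out as follows. This is only a sketch; the auxiliary process $N^n$ and the indices $n_j$ are notation introduced here for illustration. The difference $N_t^n := M_{t \wedge \tau_k} - M_t^n$, $t \leq T$, is a càdlàg martingale with $N_T^n = Y - Y_n$, so Doob's maximal inequality gives
$$\mathbb{P} \left( \sup_{t \leq T} |N_t^n| > \epsilon \right) \leq \frac{1}{\epsilon} \mathbb{E}|N_T^n| = \frac{1}{\epsilon} \mathbb{E}|Y-Y_n|.$$
Choosing $n_1 < n_2 < \ldots$ such that $\mathbb{E}|Y-Y_{n_j}| \leq 4^{-j}$ and taking $\epsilon = 2^{-j}$, we get
$$\sum_{j \geq 1} \mathbb{P} \left( \sup_{t \leq T} |N_t^{n_j}| > 2^{-j} \right) \leq \sum_{j \geq 1} 2^{j} \cdot 4^{-j} = \sum_{j \geq 1} 2^{-j} < \infty,$$
and so, by the Borel-Cantelli lemma, $\sup_{t \leq T} |N_t^{n_j}| \to 0$ almost surely as $j \to \infty$.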
For martingales with not necessarily continuous sample paths (which are not adapted to a filtration generated by a Brownian motion), we need more general representation results; the following result is due to Ikeda-Watanabe.
Let $(M_t)_{t \geq 0} \in \mathcal{M}_2$ be a martingale with respect to a filtration $(\mathcal{F}_t)_{t \geq 0}$ generated by a Lévy process. Then there exist predictable processes $f,g$ as well as a Brownian motion $(W_t)_{t \geq 0}$ and a Poisson random measure $N$ such that $$M_t - M_0 = \int_0^t f(s) \, dW_s + \int_0^t g(s) \, d\tilde{N}_s$$ where $\tilde{N}$ denotes the compensated Poisson random measure.
See also this question.
For the diagonalization: Let's consider the diffusion coefficient $\sigma$ (the reasoning for the drift is analogous). Since $\sigma 1_{[0,\tau_n)} \in \mathcal{L}^2_T$, there exists for each $n$ a simple process $g_n$ such that
$$\|g_n- \sigma 1_{[0,\tau_n)} \|_{L^2} \leq \frac{1}{n}.\tag{1}$$
Claim: $g_n 1_{[0,\tau_k)} \to \sigma 1_{[0,\tau_k)}$ in $L^2$ for each $k \geq 1$.
Proof: For each $n \geq k$ we have $\tau_k \leq \tau_n$, and therefore
\begin{align*}\|g_n 1_{[0,\tau_k)} - \sigma 1_{[0,\tau_k)}\|_{L^2}^2 &= \mathbb{E} \int_0^{\tau_k} |g_n(s)-\sigma(s)|^2 \, ds \\ &\leq \mathbb{E} \int_0^{\tau_n} |g_n(s)-\sigma(s)|^2 \, ds \\ &\leq \mathbb{E} \int_0^{\tau_n} |g_n(s)-\sigma(s)|^2 \, ds +\mathbb{E} \int_{\tau_n}^{\infty} |g_n(s)-0|^2 \, ds \\ &= \|g_n- \sigma 1_{[0,\tau_n)}\|_{L^2}^2 \end{align*}
and so, by $(1)$,
$$\|g_n 1_{[0,\tau_k)} - \sigma 1_{[0,\tau_k)}\|_{L^2} \leq \frac{1}{n} \qquad \text{for all } n \geq k,$$ which proves the assertion. Consequently, $(g_n)_{n \in \mathbb{N}}$ is the sequence of simple processes which we are looking for.
Regarding your question about the estimate for the drift: Yes, you need to apply Jensen's inequality. Note that, by Jensen's inequality,
$$\left( \int_0^t f(s) \, ds \right)^2 \leq t \int_0^t f(s)^2 \, ds \tag{2}$$
for each $t \geq 0$ and any (suitably integrable) function $f$. This gives
\begin{align*} \left| \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)| \, ds \right|^2 &\leq (T \wedge \tau_n) \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)|^2 \, ds \\ &\leq T \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)|^2 \, ds. \tag{3}\end{align*}
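For completeness, here is one way to see (2), written as a short sketch. For $t>0$, apply Jensen's inequality to the convex function $x \mapsto x^2$ and the probability measure $\frac{1}{t} \, ds$ on $[0,t]$:
$$\left( \int_0^t f(s) \, ds \right)^2 = t^2 \left( \int_0^t f(s) \, \frac{ds}{t} \right)^2 \leq t^2 \int_0^t f(s)^2 \, \frac{ds}{t} = t \int_0^t f(s)^2 \, ds.$$
The same bound also follows from the Cauchy-Schwarz inequality applied to the product $f \cdot 1$ on $[0,t]$. In (3) it is used pathwise, with $t = T \wedge \tau_n(\omega)$ and $f = |b^{\Pi}-b|$.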
Taking expectation we get
\begin{align*} \mathbb{E}\left(\left| \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)| \, ds\right|^2 \right)\leq T \mathbb{E}\int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)|^2 \, ds,\end{align*}
and by construction the right-hand side converges to $0$ as $|\Pi| \to 0$. Hence, by Markov's inequality,
\begin{align*} \mathbb{P} \left( \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)| \, ds > \epsilon \right) &\leq \frac{1}{\epsilon^2}\mathbb{E}\left(\left| \int_0^{T \wedge \tau_n} |b^{\Pi}(s)-b(s)| \, ds \right|^2 \right) \\ &\xrightarrow[]{|\Pi| \to 0} 0. \end{align*}
According to this Q&A, the Itô isometry also holds for $\mathcal{L}_2^{\text{loc}}$. However, this other Q&A gives a counterexample. I will therefore avoid relying on that claim and prove our statement directly.
Since we already know that $F = F'$ a.e., it suffices to prove the following lemma.
Proof. For $n \in \mathbb{N}$, define
\[ \tau_n = T \land \inf \left\{ t \in [0, T] \mid \int_0^t \lvert \Phi(s) \rvert^2 ds > n \right\}. \]
Since $\tau_n$ is a stopping time, we can define
\[ \Phi^{(n)}(t) = 1_{\{t \leq \tau_n\}} \Phi(t). \]
By definition of $\tau_n$, it follows that
\[ \int_0^T \lvert \Phi^{(n)}(t) \rvert^2 dt \leq n < \infty. \]
Hence, $\Phi^{(n)} \in \mathcal{L}_2$. Therefore, we can apply the Itô isometry to $\Phi^{(n)}$ to obtain
\[ E \left[ \left\lvert \int_0^t \Phi^{(n)} (u) dW(u) \right\rvert^2 \right] = E \left[ \int_0^t \left\lvert \Phi^{(n)} (u) \right\rvert^2 du \right] \tag{3} \]
for $t \in [0, T]$. Since $0 \leq \tau_n \nearrow T$ a.s., $\lvert \Phi^{(n)}(u) \rvert^2$ increases to $\lvert \Phi(u) \rvert^2$ for almost every $u$, and applying the monotone convergence theorem twice (first to the integral over $[0,t]$, then to the expectation) gives
\[ \lim_{n \to \infty} E \left[ \int_0^t \left\lvert \Phi^{(n)} (u) \right\rvert^2 du \right] = E \left[ \int_0^t \left\lvert \Phi (u) \right\rvert^2 du \right]. \tag{4} \]
By (1), we have
\[ \int_0^t \Phi^{(n)}(u) dW(u) = \int_0^{t \land \tau_n} \Phi(u) dW(u) = 0 \tag{5} \]
a.s. Note that this equation holds even for $\Phi \in \mathcal{L}_2^{\text{loc}}$. Combining (3)-(5), we see that
\[ E \left[ \int_0^t \lvert \Phi(u) \rvert^2 du \right] = 0. \]
Therefore, we obtain (2).
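The final step can be summarized in one chain of equalities. This is a sketch that only combines (3), (4) and (5) as stated in the proof: for each $t \in [0,T]$,
\[ E \left[ \int_0^t \lvert \Phi(u) \rvert^2 du \right] \overset{(4)}{=} \lim_{n \to \infty} E \left[ \int_0^t \left\lvert \Phi^{(n)}(u) \right\rvert^2 du \right] \overset{(3)}{=} \lim_{n \to \infty} E \left[ \left\lvert \int_0^t \Phi^{(n)}(u) dW(u) \right\rvert^2 \right] \overset{(5)}{=} 0. \]
In particular, the isometry is only ever applied to the truncated integrands $\Phi^{(n)} \in \mathcal{L}_2$, never to $\Phi$ itself.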