You can. By partitioned-regression results (the Frisch–Waugh–Lovell theorem), if the model includes a constant term,
$$y = \alpha + \mathbf x' \beta + u$$
then the "slope coefficients" are in practice calculated by OLS as
$$\hat{\beta}_{OLS} = (\hat {\tilde X}'\hat {\tilde X})^{-1}\hat {\tilde X}'\hat {\tilde y}= \beta + (\hat {\tilde X}'\hat {\tilde X})^{-1}\hat {\tilde X}'u$$
where $\hat {\tilde X} = X - \bar X$ (deviations from the sample mean) and, for later use, ${\tilde X} = X-E(X)$. Again, here $X$ does not include a column of ones (but the model does include a constant). In other words, whether you demean a priori or not, OLS demeans the variables automatically when estimating the slope coefficients, provided you include a constant term in the model.
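A quick numerical sketch of this equivalence (my own illustrative setup, not from the question): the OLS slopes from a regression with an intercept coincide with the slopes from a no-intercept regression on demeaned variables.

```python
import numpy as np

# Verify numerically: slopes from OLS with a constant equal the slopes
# from OLS on demeaned X and y without a constant (FWL theorem).
rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
y = 4.0 + X @ beta + rng.normal(size=n)   # model with constant alpha = 4

# (a) OLS with an explicit column of ones; drop the intercept estimate
Z = np.column_stack([np.ones(n), X])
coef_with_const = np.linalg.lstsq(Z, y, rcond=None)[0][1:]

# (b) OLS on demeaned variables, no constant
Xt = X - X.mean(axis=0)
yt = y - y.mean()
coef_demeaned = np.linalg.lstsq(Xt, yt, rcond=None)[0]

print(np.allclose(coef_with_const, coef_demeaned))  # True
```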
If you multiply and divide the above by the sample size $n$, you obtain the multivariate analogue of $(2)$: the variables are now in sample mean-deviation form, so these are estimated variance and covariance matrices (and in $(2)$ you should use hats, by the way).
Then
$$\text{plim}\big( \hat{\beta}_{OLS} -\beta\big) = \text{plim}\left (\frac 1n\hat {\tilde X}'\hat {\tilde X}\right)^{-1}\text{plim}\left(\frac 1n \hat {\tilde X}'u\right)$$
The Law of Large Numbers does not "jump from probability limits to expectations": it is the very essence of the Law that the probability limit is the expected value. Then (under the necessary conditions)
$$\text{plim}\big( \hat{\beta}_{OLS} -\beta\big) = E\left (\frac 1n\tilde X'\tilde X\right)^{-1}E\left(\frac 1n \tilde X'u\right)$$
Now the variables are in deviations from their true expected values, and these expected values are the true covariance matrices, and you can write for example,
$$\text{plim}\big( \hat{\beta}_{OLS} -\beta\big) = [ \text {Var}(\mathbf x)]^{-1}\cdot \text{Cov}(\mathbf x, u)$$
which is the probability limit of the multivariate analogue of $(2)$.
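To make the variance/covariance form concrete, here is a small sketch (again my own setup): the sample analogue $[\widehat{\text{Var}}(\mathbf x)]^{-1}\widehat{\text{Cov}}(\mathbf x, y)$ reproduces the OLS slopes exactly, since the $1/n$ factors cancel.

```python
import numpy as np

# OLS slopes computed two ways: (a) via the usual regression with a
# constant, (b) via the sample variance matrix of x and the sample
# covariance of x with y.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

Xt = X - X.mean(axis=0)
yt = y - y.mean()
var_x = Xt.T @ Xt / n      # estimated variance matrix of x
cov_xy = Xt.T @ yt / n     # estimated covariance of x with y
beta_cov = np.linalg.solve(var_x, cov_xy)

Z = np.column_stack([np.ones(n), X])
beta_ols = np.linalg.lstsq(Z, y, rcond=None)[0][1:]
print(np.allclose(beta_cov, beta_ols))  # True
```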
The mistake is this: if $\text{plim}\, N^{-1}X'u = 0$, then we only need to assume that $\text{plim}\,(N^{-1}X'X)^{-1}$ exists to prove consistency. But when $\text{plim}\, N^{-1}X'u \ne 0$, as is the case when $x_k$ is endogenous (suppose $x_k$ is the last variable), we need to examine $\text{plim}\,(N^{-1}X'X)^{-1}$ to see whether the other $\beta$s are consistent.
If all other variables are orthogonal to $x_k$, then the other $\beta$s will still be consistent. This is because the limit matrix of $(N^{-1}X'X)^{-1}$ has zeros in the last row and last column except for the diagonal term. Right-multiplying it by $(0,\cdots,0,\epsilon)'$ then yields zero for every element except the $k$th.
Even if only one variable $x_q$ is correlated with $x_k$ and all the others are not, should some other variable $x_p$ be correlated with $x_q$, we should not expect $\beta_p$ to be consistent either. Intuitively, but loosely: inconsistency spreads through correlation.
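A simulation sketch of this spreading effect (the construction is my own, chosen so that $u$ is correlated only with $x_k$): with orthogonal regressors only $\hat\beta_k$ is inconsistent, but with a correlation chain $x_p\,\text{--}\,x_q\,\text{--}\,x_k$ the bias reaches $\hat\beta_p$ even though $\text{Cov}(x_p, x_k)=0$.

```python
import numpy as np

# x_k = a + b is endogenous: u is correlated with the 'a' component only,
# so Cov(x_q, u) = Cov(x_p, u) = 0 by construction.
rng = np.random.default_rng(3)
n = 500_000
beta = np.ones(3)

def ols_slopes(X, y):
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0][1:]

a, b, d, f = rng.normal(size=(4, n))
u = 0.8 * a + rng.normal(size=n)

# Orthogonal case: x_p, x_q independent of x_k -> only beta_k biased
X0 = np.column_stack([rng.normal(size=n), rng.normal(size=n), a + b])
b0 = ols_slopes(X0, X0 @ beta + u)   # approx (1, 1, 1.4)

# Chain case: x_p = d + f, x_q = b + d, x_k = a + b
# (x_p correlated with x_q, x_q with x_k, but Cov(x_p, x_k) = 0)
X1 = np.column_stack([d + f, b + d, a + b])
b1 = ols_slopes(X1, X1 @ beta + u)   # approx (1.2, 0.6, 1.6)
print(b0.round(2), b1.round(2))
```

The probability limits in the chain case follow from solving $V\,\delta = (0,0,0.8)'$ with $V$ the variance matrix of $(x_p, x_q, x_k)$, giving biases $(0.2, -0.4, 0.6)$.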
Best Answer
Note that $$ \text{plim} \Big[(X'X + \lambda I_k)^{-1} X'X\Big] =\text{plim}(n^{-1}X'X + n^{-1}\lambda I_k)^{-1}\text{plim}(n^{-1}X'X)$$
The second plim converges by assumption. For the first, we have $$\text{plim}(n^{-1}X'X + n^{-1}\lambda I_k)^{-1}=\left(\text{plim}\,n^{-1}X'X + \text{plim}\,n^{-1}\lambda I_k\right)^{-1} $$
and that
$$\text{plim}\,n^{-1}\lambda I_k = \lim_{n\to\infty} n^{-1}\lambda I_k = 0$$
leading to the desired consistency result. Intuitively, the purpose of adding a term like $\lambda I_k$ is to handle a "bad sample"; it is a finite-sample "tactic" to get results, but one whose effect is eliminated asymptotically.