Assume we specify a simple AR(1) model with the usual properties ($|\beta| < 1$, $u_t$ a zero-mean, constant-variance white-noise error),
$$y_t = \beta y_{t-1} + u_t$$
Denote the theoretical autocovariance of the error term at lag $j$ as
$$\gamma_j \equiv E(u_tu_{t-j})$$
If we could observe the error term, the lag-$j$ sample autocorrelation of the error term would be defined as
$$\tilde \rho_j \equiv \frac {\tilde \gamma_j}{\tilde \gamma_0}$$
where
$$\tilde\gamma_j \equiv \frac 1n \sum_{t=j+1}^nu_tu_{t-j},\;\;\; j=0,1,2...$$
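These formulas map directly onto code. Here is a minimal R sketch (the helper names sample_gamma and sample_rho are my own, not from any package):

# (1/n) * sum_{t=j+1}^{n} u_t * u_{t-j}, as in the display above
sample_gamma <- function(u, j) {
  n <- length(u)
  sum(u[(j + 1):n] * u[1:(n - j)]) / n
}
# rho_j = gamma_j / gamma_0
sample_rho <- function(u, j) sample_gamma(u, j) / sample_gamma(u, 0)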
But in practice we do not observe the error term, so the sample autocovariances must be computed from the residuals of the estimation, as
$$\hat\gamma_j \equiv \frac 1n \sum_{t=j+1}^n\hat u_t\hat u_{t-j},\;\;\; j=0,1,2...$$
With $\hat\rho_j \equiv \hat\gamma_j/\hat\gamma_0$, the Box-Pierce Q-statistic (the Ljung-Box Q is an asymptotically equivalent, finite-sample-corrected scaling of it) is
$$Q_{BP} = n \sum_{j=1}^p\hat\rho^2_j = \sum_{j=1}^p[\sqrt n\hat\rho_j]^2\xrightarrow{d} \;???\;\chi^2(p) $$
The issue is exactly whether $Q_{BP}$ can be said to have, asymptotically, a chi-square distribution in this model (under the null of no autocorrelation in the error term).
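For concreteness, the statistic is straightforward to compute directly; here is a sketch reusing sample_rho from above (uhat is an assumed residual vector):

# Box-Pierce Q over lags 1..p, with the chi-square(p) p-value it presumes
box_pierce <- function(uhat, p) {
  n <- length(uhat)
  rho <- sapply(1:p, function(j) sample_rho(uhat, j))
  q <- n * sum(rho^2)
  c(Q = q, p.value = pchisq(q, df = p, lower.tail = FALSE))
}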
For this to happen, each and every one of the $\sqrt n\, \hat\rho_j$ must be asymptotically standard normal (and asymptotically independent across $j$). A way to check this is to examine whether $\sqrt n\, \hat\rho_j$ has the same asymptotic distribution as $\sqrt n\, \tilde\rho_j$ (which is constructed using the true errors, and so has the desired asymptotic behavior under the null).
We have that
$$\hat u_t = y_t - \hat \beta y_{t-1} = u_t - (\hat \beta - \beta)y_{t-1}$$
where $\hat \beta$ is a consistent estimator. So
$$\hat\gamma_j \equiv \frac 1n \sum_{t=j+1}^n[u_t - (\hat \beta - \beta)y_{t-1}][u_{t-j} - (\hat \beta - \beta)y_{t-j-1}]$$
$$=\tilde \gamma _j -\frac 1n \sum_{t=j+1}^n (\hat \beta - \beta)\big[u_ty_{t-j-1} +u_{t-j}y_{t-1}\big] + \frac 1n \sum_{t=j+1}^n(\hat \beta - \beta)^2y_{t-1}y_{t-j-1}$$
The process is assumed stationary and ergodic, and moments are assumed to exist up to the desired order, so each sample average converges in probability to a finite moment. Since the estimator $\hat \beta$ is consistent, $\hat\beta - \beta \xrightarrow{p} 0$, which is enough for the last two terms to vanish in probability. So we conclude
$$\hat \gamma_j \xrightarrow{p} \tilde \gamma_j$$
This implies that
$$\hat \rho_j \xrightarrow{p} \tilde \rho_j \xrightarrow{p} \rho_j$$
But this does not automatically guarantee that $\sqrt n\, \hat \rho_j$ has the same asymptotic distribution as $\sqrt n\,\tilde \rho_j$ (note that the continuous mapping theorem does not apply here, because the transformation applied to the random variables depends on $n$). For that to happen, we need
$$\sqrt n\, (\hat \gamma_j - \tilde \gamma_j) \xrightarrow{p} 0$$
(the denominator, whether $\tilde\gamma_0$ or $\hat\gamma_0$, converges in probability to the variance of the error term in either case, so it is neutral to our issue).
We have
$$\sqrt n \hat \gamma_j =\sqrt n\tilde \gamma _j -\frac 1n \sum_{t=j+1}^n \sqrt n(\hat \beta - \beta)\big[u_ty_{t-j-1} +u_{t-j}y_{t-1}\big] \\+ \frac 1n \sum_{t=j+1}^n\sqrt n(\hat \beta - \beta)^2y_{t-1}y_{t-j-1}$$
So the question is: do these two sums, now scaled by $\sqrt n$, go to zero in probability, so that $\sqrt n\, \hat \gamma_j$ and $\sqrt n\, \tilde \gamma _j$ are asymptotically equivalent?
For the second sum we have
$$\frac 1n \sum_{t=j+1}^n\sqrt n(\hat \beta - \beta)^2y_{t-1}y_{t-j-1} = \big[\sqrt n(\hat \beta - \beta)\big]\,(\hat \beta - \beta)\cdot\frac 1n \sum_{t=j+1}^ny_{t-1}y_{t-j-1}$$
Since $\sqrt n(\hat \beta - \beta)$ converges in distribution to a random variable (and so is bounded in probability), while $(\hat \beta - \beta) \xrightarrow{p} 0$ by consistency and the sample average converges to a finite moment, the whole product goes to zero in probability.
For the first sum, $\sqrt n(\hat \beta - \beta)$ again converges in distribution to a random variable, while by the ergodic theorem
$$\frac 1n \sum_{t=j+1}^n \big[u_ty_{t-j-1} +u_{t-j}y_{t-1}\big] \xrightarrow{p} E[u_ty_{t-j-1}] + E[u_{t-j}y_{t-1}]$$
The first expected value, $E[u_ty_{t-j-1}]$, is zero by the assumptions of the standard AR(1) model ($u_t$ is uncorrelated with anything dated earlier). But the second is not zero, since the dependent variable depends on past errors, including $u_{t-j}$.
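To see this concretely, write $y_{t-1}$ in its MA($\infty$) form (valid under $|\beta| < 1$) and pick out the term involving $u_{t-j}$; assuming $\beta \neq 0$,
$$y_{t-1} = \sum_{i=0}^{\infty}\beta^i u_{t-1-i} \quad\Rightarrow\quad E[u_{t-j}y_{t-1}] = \beta^{\,j-1}E[u_{t-j}^2] = \beta^{\,j-1}\sigma^2_u \neq 0, \qquad j \geq 1$$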
So the first sum does not vanish: by Slutsky's theorem it converges in distribution to the limit of $\sqrt n(\hat \beta - \beta)$ multiplied by $-E[u_{t-j}y_{t-1}]$. Hence $\sqrt n\,\hat \rho_j$ won't have the same asymptotic distribution as $\sqrt n\,\tilde \rho_j$; and it is the latter's asymptotic standard normality that leads to the chi-squared distribution when squaring and summing.
Therefore we conclude that, in a pure time-series model, the Box-Pierce Q and the Ljung-Box Q statistics cannot be said to have an asymptotic chi-square distribution, so the test loses its asymptotic justification.
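This failure is visible in a small Monte Carlo. The following R sketch (all settings are illustrative assumptions) compares the spread of $\sqrt n\, \hat\rho_1$ computed from AR(1) residuals with the unit standard deviation that the chi-square approximation presumes:

# Simulate AR(1) data, fit the model, and collect sqrt(n) * rho_hat_1
set.seed(42)
n <- 1000; reps <- 1000; beta <- 0.5
stat <- replicate(reps, {
  y <- arima.sim(model = list(ar = beta), n = n)
  fit <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
  r <- residuals(fit)
  rho1 <- sum(r[-1] * r[-n]) / sum(r^2)  # lag-1 residual autocorrelation
  sqrt(n) * rho1
})
sd(stat)  # noticeably below 1, so n * rho1^2 is not chi-square(1)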
This happens because the right-hand-side variable (here, the lag of the dependent variable) is by construction not strictly exogenous with respect to the error term, and we have found that such strict exogeneity is required for the BP/LB Q-statistic to have the postulated asymptotic distribution.
Here the right-hand-side variable is only "predetermined", and the Breusch-Godfrey test is the valid choice in this case. (For the full set of conditions required for an asymptotically valid test, see Hayashi (2000), pp. 146-149.)
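For illustration, here is a minimal R sketch of the Breusch-Godfrey alternative (the simulation setup and the OLS formulation of the AR(1) are my assumptions; bgtest is from the lmtest package):

library(lmtest)
set.seed(1)
n <- 500
y <- arima.sim(model = list(ar = 0.5), n = n)  # AR(1) data
# Fit the AR(1) by OLS regression of y_t on y_{t-1}
dat <- data.frame(y = y[-1], ylag = y[-n])
fit <- lm(y ~ ylag, data = dat)
# Breusch-Godfrey test for residual autocorrelation up to order 4;
# it remains asymptotically valid with merely predetermined regressors
bgtest(fit, order = 4)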
The null hypothesis of the Ljung-Box test is that the autocorrelations (for the chosen lags) in the population from which the sample is taken are all zero. (See this thread for some more details on the test and the distribution of its statistic under the null.)
If your p-value is below your chosen significance level, you reject the null hypothesis in favour of the alternative that at least one of the autocorrelations is nonzero in the population.
When applying the test to model residuals, you hope not to reject, since rejection suggests the model assumptions are violated (we typically assume the errors are not autocorrelated).
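In R this check is commonly run with base R's Box.test, reducing the degrees of freedom by the number of estimated ARMA coefficients via fitdf (the simulated setup below is an illustrative assumption; but see the caveat at the end of this answer):

set.seed(1)
x <- arima.sim(model = list(ar = 0.7), n = 300)
fit <- arima(x, order = c(1, 0, 0))
# fitdf = 1 because one AR coefficient was estimated
Box.test(residuals(fit), lag = 20, type = "Ljung-Box", fitdf = 1)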
What does a p-value of 0 imply in a Ljung-Box statistic?
When applied to model residuals, it suggests the model errors are autocorrelated, and thus you might not trust the model output (if the model is built under the assumption of non-autocorrelated errors, as it normally is).
Edit: as indicated by Michael Chernick, a p-value would not be exactly zero, as that would only happen if the test statistic were infinitely large. But the p-value can be close enough to zero for a sufficiently large statistic that the software rounds it to zero.
But read this thread, which warns against using the Ljung-Box test on residuals from ARMA models.
Best Answer
So you have applied the Ljung-Box test to your original data, first with the option squared.residuals = FALSE and then with squared.residuals = TRUE. In both cases you found the test statistic exceeds the critical value, hence you reject the null hypothesis. The null hypothesis in the first case is "there is no autocorrelation up to lag 20" and in the second case "there is no autocorrelation in the squares up to lag 20". The first result thus suggests the presence of autocorrelation, and the second suggests the presence of autoregressive conditional heteroskedasticity (ARCH effects). That is how you interpret the test results.
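For reference, the same two checks can be reproduced with base R's Box.test by testing the series and its squares (a sketch; x stands in for your data, and lag 20 matches the question):

Box.test(x,   lag = 20, type = "Ljung-Box")  # H0: no autocorrelation up to lag 20
Box.test(x^2, lag = 20, type = "Ljung-Box")  # H0: no autocorrelation in the squares (ARCH check)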