If you have read Burnham & Anderson's monograph, you know just why they discourage AIC(c)-based model selection: because they subscribe to the theory of tapering effect sizes. In a nutshell, they posit that everything has an effect - it's just that most effects are pretty small (sort of a "long tail"). Thus, an AIC(c)-selected model may be more parsimonious, but it will be systematically too small (the bias-variance trade-off). Therefore they recommend averaging models.
This is also the reason why statistical significance and p values are not en vogue in the Burnham & Anderson worldview. Tapering effect sizes are another way of saying that the true coefficients are almost always nonzero, just perhaps very small. Thus, the null hypothesis is already false a priori. P values pose a question that we already know the answer to.
Thus, if you follow B&A's philosophy far enough that you do AICc-based model averaging, it seems a bit incongruous to also discuss p values and/or "marginal significance".
Now, one possibility would be to simply discuss "averaged coefficients" and their CIs, without even discussing whether CIs contain zero. Conversely, if you are in a field that deifies p values (like psychology), it may make more sense to disregard these implications of B&A in the interest of talking in a way your readers will understand, rather than follow strict AICc purity.
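For concreteness, here is what AICc-weighted averaging of a coefficient looks like in practice. The AICc values and coefficient estimates below are made up purely for illustration (not from any real fit):

```python
import numpy as np

# Hypothetical AICc values and one shared coefficient from three candidate
# models (illustrative numbers only, not from any real analysis).
aicc = np.array([102.3, 103.1, 107.8])
beta = np.array([0.42, 0.35, 0.10])  # the same predictor's coefficient in each model

# Akaike weights: w_i proportional to exp(-delta_i / 2),
# where delta_i = AICc_i - min(AICc)
delta = aicc - aicc.min()
w = np.exp(-delta / 2)
w /= w.sum()

# Model-averaged coefficient: weighted average across the candidate models
beta_avg = np.sum(w * beta)
print(w.round(3), round(beta_avg, 3))  # → [0.577 0.387 0.037] 0.381
```

One could then report `beta_avg` with a CI, without ever framing the result in terms of a null hypothesis.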
(Anyway, my impression is that AICc and B&A have more of a following among non-statisticians, especially ecologists. So the nuances we are discussing here may already be far away from your readership's main interests.)
When the number of observations is large, the Akaike Information Criterion (AIC) and the small-sample corrected Akaike Information Criterion (AICc) become extremely similar, because AICc converges to AIC. Therefore we gain (or lose) almost nothing by switching between the two criteria. I suggest keeping AICc for consistency throughout an analysis.
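A quick numeric sketch of how fast the two criteria merge: the only difference between them is the correction term $\frac{2p(p+1)}{n-p-1}$, which vanishes as $n$ grows (here with $p = 5$ as an arbitrary example):

```python
# AICc - AIC = 2p(p+1)/(n - p - 1): shrinks toward 0 as n grows.
p = 5
for n in (20, 40, 100, 1000):
    correction = 2 * p * (p + 1) / (n - p - 1)
    print(f"n={n:5d}  AICc - AIC = {correction:.4f}")
# → 4.2857, 1.7647, 0.6383, 0.0604
```

At $n = 1000$ the two criteria differ by far less than the conventional "meaningful difference" of 2 AIC units.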
Some further discussion: AIC expresses the relative expected Kullback–Leibler (KL) information $I$ between competing models. Assuming our model's density is $f_M$ and the true data-generating density is $g$, the KL information can be expressed as:
$$I(g,f_M) = \int g(x) \log(\frac{g(x)}{f_M(x;\theta)})dx$$
Notice that this is very much like a likelihood ratio test; if $f_M$ and $g$ are the same, the ratio $\frac{g(x)}{f_M(x;\theta)}$ equals 1, so its logarithm equals 0. We can immediately re-write the above as:
$$I(g,f_M) = \int g(x) \log g(x)dx - \int g(x) \log f_M(x;\theta)dx$$
and realise that the first term is constant, so we only care about:
$$ - \int g(x) \log f_M(x;\theta)dx$$
Now, what Akaike did was: 1. realise that, while $g$ is unknown, we do have observations from $g$, namely $X_1, X_2, \dots, X_n$, so:
$$ - \int g(x) \log f_M(x;\theta)dx \approx -\frac{1}{n}\sum_i^n \log(f_M(X_i;\theta))$$
(which is simply the negative log-likelihood for model $M$, divided by $n$) and 2. realise that this is an over-fitted estimate of the expected log-likelihood, as we use the same data both to estimate $\theta$ and to evaluate the fit. Without going into further gory details, the bias is asymptotically equal to $\frac{p}{n}$, where $p$ is the number of parameters estimated by $M$. So what we actually care about is:
$$ -\frac{1}{n} \sum_{i=1}^n \log(f_M(X_i;\theta)) + \frac{p}{n}$$
If we multiply this by $2n$, we get the AIC for model $M$:
\begin{align}
AIC(M) &= -2\sum_{i=1}^n \log(f_M(X_i;\theta)) + 2p \\
&= -2 l + 2p
\end{align}
So the AIC equals minus twice the maximized log-likelihood plus twice the number of estimated parameters. Hurvich and Tsai, in Regression and time series model selection in small samples (1989), further showed that this corrected estimate is still biased when $n$ is not large relative to $p$. Their correction term is $\frac{2p(p+1)}{n - p - 1}$, and this leads to the AICc formula:
\begin{align}
AICc(M) = -2 l + 2p + \frac{2p(p+1)}{n -p -1}
\end{align}
That is why AICc (the second-order AIC) is advocated when the sample size is relatively low; clearly, as $\frac{n}{p}$ gets large, this latter correction term tends to 0. Burnham and Anderson, in Model Selection and Multimodel Inference (2002), suggest using AICc when the ratio between the sample size $n$ and the number of parameters $p$ in the largest candidate model is small ($<40$), but realistically any difference between AIC and AICc will be negligible as $n$ gets large (e.g. $>100$). I have found Takezawa's Learning Regression Analysis by Simulation (2014), Chapter 5, "Akaike’s Information Criterion (AIC) and the Third Variance", a great resource on the matter too.
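The $\frac{p}{n}$ optimism argued above can also be checked by simulation. The Gaussian toy setup below (sample size, seed, and number of replications) is my own choice, not from any of the references: we fit a Gaussian by ML ($p = 2$ parameters, mean and variance) and compare the in-sample average log-likelihood with its value on fresh draws from the same $g$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 50, 2, 20000

def avg_loglik(x, mu, var):
    # average Gaussian log-density of x under N(mu, var)
    return np.mean(-0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mu) ** 2 / var)

gaps = []
for _ in range(reps):
    x = rng.standard_normal(n)           # training sample from g = N(0, 1)
    mu, var = x.mean(), x.var()          # ML estimates, fitted on x
    x_new = rng.standard_normal(10 * n)  # fresh draws from the same g
    gaps.append(avg_loglik(x, mu, var) - avg_loglik(x_new, mu, var))

# The in-sample estimate is optimistic by roughly p/n = 0.04
print(np.mean(gaps), p / n)
```

The Monte Carlo gap comes out close to $p/n$, which is exactly the penalty AIC adds back.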
Best Answer
You need to use the likelihood for the whole sample, from first principles, based on $$ \log L \sim -\frac12 ({\bf y}-\mathbf{\mu})'\Sigma(\theta)^{-1}({\bf y}-\mathbf{\mu}) - \frac12 \log |\Sigma(\theta)| $$ where ${\bf y}\in \mathbf{R}^{135} $ is your whole sample vector and $\Sigma(\theta)$ is the model-implied covariance matrix of your ARMA($p,q$) process. God only knows what your naive AIC calculation for i.i.d. data is actually doing; it is out of context and has little value here.
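To make this concrete, here is a minimal sketch for the simplest case, a zero-mean stationary AR(1), where $\Sigma(\theta)$ has the closed form $\Sigma_{ij} = \frac{\sigma^2}{1-\phi^2}\,\phi^{|i-j|}$. The parameter values and the simulated series are illustrative only (and the sketch ignores the initial-condition subtlety of drawing $y_0$ from the stationary distribution):

```python
import numpy as np

rng = np.random.default_rng(1)
n, phi, sigma = 135, 0.6, 1.0

# simulate an AR(1) path (illustrative data, y[0] fixed at 0)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + sigma * rng.standard_normal()

# model-implied covariance: Sigma_ij = sigma^2 / (1 - phi^2) * phi^|i-j|
idx = np.arange(n)
Sigma = sigma**2 / (1 - phi**2) * phi ** np.abs(idx[:, None] - idx[None, :])

# log L up to the additive constant -n/2 * log(2*pi)
sign, logdet = np.linalg.slogdet(Sigma)
loglik = -0.5 * y @ np.linalg.solve(Sigma, y) - 0.5 * logdet

k = 2  # parameters estimated in this sketch: phi and sigma^2
aic = -2 * loglik + 2 * k
print(round(loglik, 2), round(aic, 2))
```

This is the whole-sample likelihood the AIC should be built from, rather than a formula that treats the 135 observations as i.i.d.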