AIC scores are only comparable between models fitted to exactly the same data. Even within the same dataset you cannot compare AIC scores if, say, one model is fitted on 70 records and another on 69 (because of a missing value in one of the variables).
Couldn't you combine all the datasets into one single large dataset? Then you could compare your different models using AIC scores without any problems.
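A minimal sketch of the complete-case approach in Python (the file and column names are hypothetical), using pandas and statsmodels: drop rows with missing values in the union of all candidate predictors first, so every model is fitted on exactly the same records and the AIC scores are directly comparable.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("data.csv")                      # hypothetical dataset
cols = ["y", "x1", "x2"]                          # union of variables used by any candidate model
complete = df.dropna(subset=cols)                 # same n records for every model

m1 = smf.ols("y ~ x1", data=complete).fit()       # smaller model
m2 = smf.ols("y ~ x1 + x2", data=complete).fit()  # larger model

print(m1.aic, m2.aic)                             # fitted on the same data, so comparable
```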
The form of $AICc$ given by
$$
AICc = AIC + \frac{2k(k+1)}{n-k-1}
$$
was proposed by
Hurvich, C. M.; Tsai, C.-L. (1989), "Regression and time series model selection in small samples", Biometrika 76: 297–307
specifically for a linear regression model with normally distributed errors. For different models, a different correction will need to be derived.
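This correction is trivial to compute once $AIC$, $k$ and $n$ are known; a minimal sketch:

```python
def aicc(aic: float, k: int, n: int) -> float:
    """Small-sample corrected AIC of Hurvich & Tsai (1989).

    k is the number of estimated parameters (including the error
    variance) and n the number of observations; requires n > k + 1.
    """
    if n <= k + 1:
        raise ValueError("AICc is undefined for n <= k + 1")
    return aic + 2 * k * (k + 1) / (n - k - 1)
```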
These derivations are often difficult, and the resulting correction can be challenging to calculate. For instance,
Hurvich, C. M.; Simonoff, J. S.; Tsai, C.-L. (1998), "Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion", Journal of the Royal Statistical Society, Series B 60(2): 271-293
propose a correction for nonparametric regression models which takes the form
$$
AICc = -2\ln(L) + n^2\int_0^1(1-t)^{r/2-2}\prod_{j=1}^{r}(1-t+2d_j)^{-1/2}\,dt + n\int_0^{\infty}\sum_{i=1}^n\frac{c_{ii}}{1+2d_it}\prod_{j=1}^n(1+2d_jt)^{-1/2}\,dt
$$
I will not go into the details here, as they are largely beside the point, but I wanted to illustrate the complexity involved: actually calculating this value requires an eigen-analysis and numerical integration.
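To give a feel for what that involves, here is a rough sketch of evaluating just the first integral term above with scipy, assuming the $d_j$ are the eigenvalues produced by that eigen-analysis; the values below are made up purely for illustration.

```python
import numpy as np
from scipy.integrate import quad

d = np.array([0.9, 0.5, 0.2, 0.05])  # hypothetical eigenvalues d_j, so r = 4
r, n = len(d), 50                     # n is the sample size

def integrand(t):
    # (1-t)^(r/2-2) * prod_j (1 - t + 2 d_j)^(-1/2)
    return (1 - t) ** (r / 2 - 2) * np.prod((1 - t + 2 * d) ** -0.5)

first_term, abserr = quad(integrand, 0.0, 1.0)  # numerical quadrature on [0, 1]
print(n**2 * first_term)                        # the n^2 * integral term of AICc
```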
For reasons like this, many authors such as
Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.), Springer-Verlag, ISBN 0-387-95364-7
suggest using the form
$$
AICc = AIC + \frac{2k(k+1)}{n-k-1}
$$
regardless of model. Even Hurvich et al. (1998), despite deriving their complicated $AICc$ for nonparametric regression, ultimately conclude that you might as well use the much simpler version derived for linear regression.
Generally this advice seems to work well, giving practically useful results. However, there are circumstances, such as the one you've highlighted, where it doesn't. You would need to find an appropriate $AICc$ for k-means, derive one yourself, or simply use $AIC$, which is more generally applicable.
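A quick numerical illustration of why the correction matters in small samples: the extra penalty $2k(k+1)/(n-k-1)$ explodes as $k$ approaches $n$, whereas plain $AIC$'s penalty $2k$ only grows linearly.

```python
n = 30
for k in (2, 5, 10, 20, 25, 28):
    extra = 2 * k * (k + 1) / (n - k - 1)  # the AICc correction term
    print(f"k={k:2d}  AIC penalty 2k={2 * k:3d}  AICc extra penalty={extra:8.1f}")
```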
Best Answer
I wish I understood the paper better, but if you look at equation 3 of Hurvich & Tsai (1989), they define the AIC itself as: $$ \textrm{AIC} = n(\log\hat{\sigma}^2 + 1) + 2\left(m + 1\right) $$
This naïvely implies $k = m+1$, in which case the Hurvich & Tsai form and the post-1999 Anderson et al. form are actually one and the same, since $$ k = m+1 \implies \frac{2(m+1)(m+2)}{n-m-2} \equiv \frac{2k(k+1)}{n-k-1} $$
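A quick symbolic check of that identity with sympy:

```python
import sympy as sp

m, n = sp.symbols("m n", positive=True)
k = m + 1  # Hurvich & Tsai's m + 1 parameters, identified with k

lhs = 2 * (m + 1) * (m + 2) / (n - m - 2)  # Hurvich & Tsai correction
rhs = 2 * k * (k + 1) / (n - k - 1)        # post-1999 form with k = m + 1

assert sp.simplify(lhs - rhs) == 0  # identical expressions
```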
Edit - (Cavanaugh 1997)
See Cavanaugh (1997) (pdf), specifically page 203, where in the derivation he sets $k = p+1$; it is as @Glen_b said: $k$ includes the error variance and $p$ does not.
Reference: Cavanaugh, J. E. (1997), "Unifying the derivations for the Akaike and corrected Akaike information criteria", Statistics & Probability Letters, 33: 201-208