In line with what Macro suggested, I think the term you are looking for is a performance measure.
Though it is not a safe way to assess predictive power, it is a very useful way to compare the fitting quality of various models.
An example would be the Mean Absolute Percentage Error (MAPE), but many others can easily be found.
Suppose you use set A with model A to describe the number of holes in a road, and set B with model B to describe the number of people in a country. Of course you cannot say that one model is better than the other, but you can at least see which model provides a more accurate description of its own data.
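As a rough sketch (not part of the original answer), MAPE can be computed directly with numpy; the observed and fitted values below are made up purely for illustration:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.
    Assumes y_true contains no zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical fitted values from two unrelated models / data sets
holes_actual, holes_pred = [3, 7, 2, 5], [4, 6, 2, 5]          # model A on set A
people_actual, people_pred = [1.2e6, 3.4e6], [1.0e6, 3.6e6]    # model B on set B

print(mape(holes_actual, holes_pred))    # percentage error of model A
print(mape(people_actual, people_pred))  # percentage error of model B
```

Because the error is expressed as a percentage of the observed values, the two numbers are on a comparable scale even though the data sets are completely different.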
You cannot use likelihood-based statistics like AIC to compare across models with different likelihood functions - the underlying formulas are different. In linear regression the likelihood is the normal density; in Poisson regression it is the Poisson probability mass function. That difference will probably account for more of the gap in AIC than any difference in fit.
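A minimal sketch of this point, assuming Python with statsmodels (the simulated data and variable names are my own, chosen only for illustration): the two AIC values below are built from different log-likelihoods (Gaussian vs Poisson), so the gap between them largely reflects the choice of likelihood rather than fit quality.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))   # hypothetical count outcome
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()                                   # Gaussian likelihood
pois = sm.GLM(y, X, family=sm.families.Poisson()).fit()    # Poisson likelihood

# The AICs come from different likelihood functions, so comparing them
# directly does not tell you which model describes the data better.
print("OLS AIC:    ", ols.aic)
print("Poisson AIC:", pois.aic)
```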
Before you even decide to use a linear model, you need to check that its residuals are approximately normally distributed (looking at the distribution of the outcome variable can serve as a rough proxy, though keep in mind it is not the same thing). If they are not normally distributed, or at least close enough by eye, then you cannot use a normal linear regression model for hypothesis testing.
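One way to run that check in practice, assuming Python with statsmodels, scipy, and matplotlib (the data below are simulated purely for illustration):

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical predictor x and count-like outcome y
rng = np.random.default_rng(1)
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
X = sm.add_constant(x)

resid = sm.OLS(y, X).fit().resid

# Formal test (very sensitive in large samples) plus a visual check
print(stats.shapiro(resid))      # Shapiro-Wilk test of normality
sm.qqplot(resid, line="s")       # QQ plot against the normal distribution
plt.show()
```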
Assuming that it is approximately normal, I would take two broad approaches to choosing the model to report.
1) Predicted outcomes. Estimate the predicted outcomes of each model and compare them. Does the linear model have better predictive ability? You may want to do this in a cross-validation framework, where you "train" your model on part of your data and use the rest for prediction (see the sketch after this list).
2) Intuitive interpretation of coefficients. Poisson coefficients can be complicated to understand - they represent a proportional (multiplicative) change in the expected count rather than an additive change in y. Depending on your context this may be more or less useful. Sometimes it is worth sacrificing fit if your model can be more easily interpreted by the end user - for example, some researchers are willing to avoid the complexity of logit and probit models for the easier-to-interpret coefficients of a linear probability model, even though the LPM has plenty of drawbacks. Think about who your audience is, what your context is, what your research question is, etc., as you make these decisions.
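A minimal cross-validation sketch for approach 1), assuming Python with statsmodels and scikit-learn; the simulated data and the RMSE metric are my own choices for illustration, not prescribed by the original answer:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.model_selection import KFold

# Hypothetical count data
rng = np.random.default_rng(2)
x = rng.uniform(0, 2, size=300)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
X = sm.add_constant(x)

def cv_rmse(fit_fn, X, y, k=5):
    """Out-of-sample RMSE averaged over k folds."""
    errs = []
    for train, test in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        pred = fit_fn(y[train], X[train]).predict(X[test])
        errs.append(np.sqrt(np.mean((y[test] - pred) ** 2)))
    return np.mean(errs)

linear = lambda y, X: sm.OLS(y, X).fit()
poisson = lambda y, X: sm.GLM(y, X, family=sm.families.Poisson()).fit()

print("linear CV RMSE :", cv_rmse(linear, X, y))
print("Poisson CV RMSE:", cv_rmse(poisson, X, y))
```

For approach 2), exponentiating a Poisson coefficient (e.g. np.exp(poisson(y, X).params[1]) in the sketch above) gives the multiplicative change in the expected count per one-unit increase in x, which is often the easiest way to present it.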
EDIT: I forgot to add this paper, which gives a good comparison across a range of different count models and may be helpful.
Best Answer
Yes: model selection criteria, such as the BIC, the AIC, or the minimum description length (MDL) criterion, are commonly used in the literature to compare models based on their goodness of fit (penalized for their complexity, i.e. for their number of free parameters).
Here, since the negative binomial distribution has two parameters (instead of only one for the Poisson distribution), it will be penalized more heavily by the AIC and the BIC than the Poisson.
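A sketch of such a comparison, assuming Python with statsmodels (the overdispersed count data below are simulated purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical overdispersed count data with mean exp(0.5 + 0.8 x)
rng = np.random.default_rng(3)
x = rng.uniform(0, 2, size=500)
mu = np.exp(0.5 + 0.8 * x)
y = rng.negative_binomial(n=2, p=2 / (2 + mu))   # mean mu, overdispersed
X = sm.add_constant(x)

pois = sm.Poisson(y, X).fit(disp=0)
nb = sm.NegativeBinomial(y, X).fit(disp=0)       # one extra dispersion parameter

# Lower AIC/BIC is better; the negative binomial model pays a penalty
# for its extra parameter, so it only "wins" if the gain in likelihood
# outweighs that penalty.
print("Poisson AIC/BIC:", pois.aic, pois.bic)
print("NegBin  AIC/BIC:", nb.aic, nb.bic)
```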
However, the validity of these criteria relies on some strong assumptions that you will need to verify and justify. For instance, using the BIC requires that your data are i.i.d., that you have enough of them, and that you have correctly obtained the maximum likelihood estimates of the models' parameters.
An interesting reference is Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261-304.