Solved – Fit power law for distributions with zeroes

distributions, fitting, power law

I am pretty new to statistics and have some data that I think may follow a power-law distribution. However, it includes zeroes. I understand that mathematically zeroes can't work, but conceptually, would the point of a power law be violated if there are some zeroes in the data? If I were to shift the data by adding 1, would that be fine, or would that throw off everything?

I am sure that this is a very newbie question, but that's what I am, and I would love some help in understanding how power laws work. If the distribution can't be power-law, how can I plot it (in R) to best show its characteristics? Would a log-log plot work?

Thank you in advance.

Best Answer

I take it that "this is a very newbie question" refers to this one of your several questions:

"...but conceptually, would the point of a power law be violated if there are some zeroes in the data?"

No. The concept remains valid, because the same class of distributions can be applied to data with or without zeros. You may be interested in reading more about the Tweedie class of distributions.

For example, the well-known Taylor's law says that the variance is proportional to a power of the mean. Taylor's law is mathematically identical to the variance-to-mean power law that characterizes the Tweedie distributions: for any random variable that follows a Tweedie distribution, the variance is related to the mean by a power law. Because such a random variable can be discrete, continuous, or a mixture of both, the power-law concept applies equally to data that are counts (Poisson), reals (normal), positive reals (gamma), or non-negative reals with a positive probability mass at zero (compound Poisson–gamma).
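To make the variance-to-mean relationship concrete, here is a minimal sketch in Python (the same steps carry over to R). It simulates Poisson counts at several means and estimates the exponent b in variance = a * mean^b as the slope of log(variance) against log(mean). The helper names `poisson_draw` and `taylor_exponent`, the group means, and the sample sizes are my own assumptions for illustration. For Poisson data the slope should come out close to 1.

```python
import math
import random
import statistics

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def taylor_exponent(groups):
    """Least-squares slope of log(variance) on log(mean) across groups."""
    xs = [math.log(statistics.mean(g)) for g in groups]
    ys = [math.log(statistics.variance(g)) for g in groups]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Simulated data (an assumption, not the asker's data): six groups of counts.
rng = random.Random(1)
group_means = (1, 2, 4, 8, 16, 32)
groups = [[poisson_draw(rng, lam) for _ in range(2000)] for lam in group_means]

# The lowest-mean group contains plenty of zeros, yet the power law still holds.
zero_share = sum(v == 0 for v in groups[0]) / len(groups[0])
exponent = taylor_exponent(groups)
print(f"share of zeros in the lowest-mean group: {zero_share:.2f}")
print(f"estimated variance-to-mean exponent: {exponent:.2f}")
```

With real data you would replace the simulated groups with your own groups of counts (for example, one group per site or per time window). Each group needs a non-zero mean and variance for the logarithms to be defined.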

Given your "there are some zeros in the data" and your comment "yes, my values are counts", a simple Poisson model may work. If it does not, e.g. because there are too few or too many zeros, you may try the Neyman Type A distribution (this R package manual mentions it in the context of the Tweedie class of distributions).
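A rough way to judge whether the zeros are "too few or too many" for a simple Poisson is to compare the observed share of zeros with exp(-mean), which is the probability of a zero under a Poisson with the same mean, and to look at the variance-to-mean ratio, which is about 1 for Poisson data. The sketch below is in Python; the function name `zero_check` is hypothetical and the counts are made up for illustration. It is an informal diagnostic, not a formal test.

```python
import math
import statistics

def zero_check(counts):
    """Compare observed zeros with what a Poisson with the same mean implies."""
    mean = statistics.mean(counts)
    return {
        "mean": mean,
        "observed_zero_share": sum(c == 0 for c in counts) / len(counts),
        "poisson_zero_share": math.exp(-mean),  # P(X = 0) for Poisson(mean)
        "dispersion_index": statistics.variance(counts) / mean,  # about 1 for Poisson
    }

# Made-up counts, for illustration only (not from the question).
counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 5, 0, 0, 3, 0, 0, 7, 0, 1, 0]
result = zero_check(counts)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

If the observed share of zeros is well above the Poisson share and the dispersion index is well above 1, that points towards a distribution with extra mass at zero or extra dispersion rather than a simple Poisson.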

I hope some of the above helps.