Goodness of Fit: How to Measure a Trendline to a Power Law

Tags: goodness of fit, power law

I have some data to which I am trying to fit a trendline. I believe the data to follow a power law, and so have plotted the data on log-log axes looking for a straight line. This has resulted in an (almost) straight line and so in Excel I have added a trendline for a power law. Being a stats newb, my question is, what is now the best way for me to go from "well the line looks like it fits pretty well" to "numeric property $x$ proves that this graph is fitted appropriately by a power law"?

In Excel I can get an r-squared value, though given my limited knowledge of statistics, I don't even know whether this is actually appropriate under my specific circumstances. I have included an image below showing the plot of the data I am working with in Excel. I have a little bit of experience with R, so if my analysis is being limited by my tools, I am open to suggestions on how to go about improving it using R.

[Image: Excel plot of the data]

Best Answer

See Aaron Clauset's page, Power-law Distributions in Empirical Data, which has links to code for fitting power laws (Matlab, R, Python, C++) as well as the paper by Clauset, Shalizi, and Newman, which you should read first.

You might want to read Clauset's and Shalizi's blog posts on the paper first.

A summary of the last link could be:

Lots of distributions give you straight-ish lines on a log-log plot.

Abusing linear regression makes the baby Gauss cry. Fitting a line to your log-log plot by least squares is a bad idea.

Use maximum likelihood to estimate the scaling exponent.

Use goodness of fit to estimate where the scaling region begins.

Use a goodness-of-fit test to check goodness of fit.

Use Vuong's test to check alternatives, and be prepared for disappointment.
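To make the middle steps of that summary concrete, here is a minimal Python sketch of the maximum-likelihood and scaling-region steps for continuous data. It assumes the standard continuous-case estimator $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$ with approximate standard error $(\hat{\alpha} - 1)/\sqrt{n}$, and it picks $x_{\min}$ by minimizing the Kolmogorov-Smirnov distance between the tail data and the fitted power law. All function names (`sample_power_law`, `fit_alpha`, `ks_distance`, `choose_xmin`) and the `min_tail` cutoff are hypothetical choices for illustration, not taken from the code linked on Clauset's page, and the data are synthetic.

```python
import math
import random


def sample_power_law(n, alpha, xmin, seed=0):
    """Draw n synthetic values from a continuous power law (inverse-transform sampling)."""
    rng = random.Random(seed)
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(xs, xmin):
    """Maximum-likelihood exponent and approximate standard error, using only x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above xmin")
    alpha = 1.0 + len(tail) / log_sum
    return alpha, (alpha - 1.0) / math.sqrt(len(tail))


def ks_distance(xs, xmin, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted power-law CDF."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    d = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and just after the step at x.
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return d


def choose_xmin(xs, min_tail=100):
    """Try each observed value as xmin; keep the fit with the smallest KS distance.

    min_tail is an assumed safeguard so the exponent is never fitted to a handful of points.
    """
    data = sorted(xs)
    best = None
    for i in range(len(data) - min_tail + 1):
        if i > 0 and data[i] == data[i - 1]:
            continue  # duplicate value: this candidate xmin was already tried
        tail = data[i:]
        alpha, stderr = fit_alpha(tail, tail[0])
        d = ks_distance(tail, tail[0], alpha)
        if best is None or d < best["ks"]:
            best = {"xmin": tail[0], "alpha": alpha, "stderr": stderr,
                    "ks": d, "n_tail": len(tail)}
    return best


if __name__ == "__main__":
    data = sample_power_law(1000, alpha=2.5, xmin=1.0)
    print(choose_xmin(data))
```

The KS distance reported here is only a descriptive number. Turning it into a goodness-of-fit p-value requires a resampling procedure, and comparing against alternatives such as the lognormal requires a likelihood-ratio (Vuong) test; neither is shown in this sketch.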
