Goodness of Fit: How to Measure a Trendline to a Power Law

goodness-of-fit, power-law

I have some data to which I am trying to fit a trendline. I believe the data to follow a power law, and so have plotted the data on log-log axes looking for a straight line. This has resulted in an (almost) straight line and so in Excel I have added a trendline for a power law. Being a stats newb, my question is, what is now the best way for me to go from "well the line looks like it fits pretty well" to "numeric property $x$ proves that this graph is fitted appropriately by a power law"?

In Excel I can get an r-squared value, though given my limited knowledge of statistics, I don't even know whether this is actually appropriate under my specific circumstances. I have included an image below showing the plot of the data I am working with in Excel. I have a little bit of experience with R, so if my analysis is being limited by my tools, I am open to suggestions on how to go about improving it using R.

[Image: Excel plot of the data]

Best Answer

See Aaron Clauset's page:

which has links to code for fitting power laws (Matlab, R, Python, C++) as well as a paper by Clauset, Shalizi, and Newman you should read first.

You might want to read Clauset's and Shalizi's blog posts on the paper first:

A summary of the last link could be:

  • Lots of distributions give you straight-ish lines on a log-log plot.

  • Abusing linear regression makes the baby Gauss cry.
    Fitting a line to your log-log plot by least squares is a bad idea.

  • Use maximum likelihood to estimate the scaling exponent.
  • Use goodness of fit to estimate where the scaling region begins.
  • Use a goodness-of-fit test to check goodness of fit.
  • Use Vuong's test to check alternatives, and be prepared for disappointment.
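
To make the maximum-likelihood and goodness-of-fit points concrete, here is a minimal Python sketch (standard library only). It is not the code from Clauset's page. It assumes continuous data and an already known lower cutoff `xmin`; the function names and the synthetic data are made up for illustration. For a continuous power law the maximum-likelihood estimate of the exponent is $\hat\alpha = 1 + n / \sum_i \ln(x_i / x_{\min})$, and the Kolmogorov-Smirnov (KS) distance measures how far the empirical CDF of the tail is from the fitted one.

```python
import math
import random


def sample_power_law(n, alpha, xmin, rng):
    """Draw n samples from a continuous power law p(x) ~ x^(-alpha), x >= xmin.

    Uses inverse-transform sampling; only used here to make synthetic test data.
    """
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(data, xmin):
    """Maximum-likelihood estimate of the exponent, using only points >= xmin."""
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    log_sum = sum(math.log(x / xmin) for x in tail)
    alpha = 1.0 + n / log_sum
    stderr = (alpha - 1.0) / math.sqrt(n)  # large-sample standard error
    return alpha, stderr, n


def ks_distance(data, xmin, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted CDF."""
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    d = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return d


# Synthetic data with a known exponent (hypothetical values for illustration).
rng = random.Random(0)
data = sample_power_law(5000, alpha=2.5, xmin=1.0, rng=rng)

alpha_hat, stderr, n_tail = fit_alpha(data, xmin=1.0)
ks = ks_distance(data, xmin=1.0, alpha=alpha_hat)
print(f"alpha = {alpha_hat:.3f} +/- {stderr:.3f} (n = {n_tail}), KS distance = {ks:.4f}")
```

With real data `xmin` is usually unknown. The approach summarised above is to repeat this fit over candidate cutoffs and keep the one with the smallest KS distance, then judge the fit by comparing that distance against fits to synthetic power-law samples. A small KS distance on its own is not a p-value.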