Solved – Modeling the shape of the power function

power lawr

In cognitive psychology, it has been argued that the learning curve of a skill follows the power function, in which the practice of the skill yields progressively smaller decrements in error. I want to test this hypothesis with the data I have, and to do this I need to fit a power law function to the data.

I heard that the power function can be fit in R by lm(log(y) ~ log(x)). An answer to this post, however, suggests using glm(y ~ log(x), family = gaussian(link = log)), and indeed the resulting fit prefers the glm approach (EDIT: See comments. This is not true.).

# Define a function to compute R-squared in linear space 
> rsq <- function(data, fit) {
+ ssres <- sum((data - fit)^2)
+ sstot <- sum((data - mean(data))^2)
+ rsq <- 1 - (ssres / sstot)
+ return(rsq)
}

# generate data and fit a power function with lm() and glm()
> set.seed(10)
> lc <- (1:10)^(-1/2) * exp(rnorm(10, sd=.25)) # Edited
> model.lm <- lm(log(lc) ~ log(1:10))
> model.glm <- glm(lc ~ log(1:10), family = gaussian(link = log))
> fit.lm <- exp(fitted(model.lm))
> fit.glm <- fitted(model.glm)

# draw a graph with fitted lines
> plot(1:10, lc, , xlab = "Practice", ylab = "Error rate", las = 1)
> lines(1:10, fit.lm, col = "red")
> lines(1:10, fit.glm, col = "blue")
> legend("topright", c("original data", "lm(log(y) ~ log(x))", 
+ "glm(y ~ log(x), family = gaussian(link = log))"), 
+ pch = c(1, NA, NA), lty = c(NA, 1, 1), col = c("black", "red", "blue"))

enter image description here

# compute R-squared values for both models
# (EDIT: See comments. These are not good measures to use.)
> rsq(lc, fit.lm)
[1] 0.9194631
> rsq(lc, fit.glm)
[1] 0.9205448

From the $R^2$ values, it seems that the glm model has a better fit. When the procedure was repeated 100 times, it was always the case that the glm model has a higher $R^2$ value than the lm model (EDIT: See comments. $R^2$ is not a good measure for this purpose).

At this point I have two questions.

  • What is the difference between the lm model and the glm model above? I thought a generalized linear model with the log link function is practically the same as regressing log(y) in general linear models.
  • I'm trying to compare the fit of the power function against other models (e.g., linear models as in lm(y ~ x)). To do this, I calculate the fitted values in linear space and compare the $R^2$ values computed in linear space, just like I did to compare lm and glm models above. Is this a proper way to compare this type of models?

My final question concerns the estimation of the four-parameter power law function given in this paper (p.186; I modified the notation a little).

$$
E(Y_N) = A + B(N + E)^{-β}
$$
where

  • $E(Y_N)$ = expected value of $Y$ on practice trial $N$
  • $A$ = expected value of $Y$ after learning is completed (i.e., asymptote)
  • $E$ = the amount of prior learning before $N = 0$

I have $N$ and $Y$ (1:10 and lc respectively in the code above), and want to estimate $A$, $B$, $E$, and $-β$ from the data. In the example above, I assumed $A$ = $E$ = 0 mostly because I wasn't sure how to estimate four parameters concurrently. While this isn't a huge problem, I would like to estimate $A$ and $E$ from the data as well if possible. So my final question:

  • Is there a way to estimate the four parameters in R? If not, is there a way to estimate three parameters, excluding $A$ from the above (i.e., assuming $A$ = 0)?

The paper I linked above says the authors used the simplex algorithm to estimate the parameters. But I am familiar with neither the algorithm nor the simplex function in R.

I'm aware of Clauset et al's work and the poweRlaw package in R. I believe both of them are dedicated to modeling the distribution of one categorical variable, but neither models the curve of the power law.

Any help is appreciated.

Best Answer

You can to use AIC/BIC values to compare the model fits to each other instead of R-squared values. The smaller AIC value is the better fit. It can sometimes be helpful to plot the AIC values as the difference from the bet fit model because the units of AIC are massive and arbitrary (I think), so it is the relative difference in AIC that you want to show; however it is helpful to have a table of AIC values as supplementary information.

I cant offer any advice on how to get AIC values in R as that is what I was looking for when I came across your post.