In cognitive psychology, it has been argued that the learning curve of a skill follows the power function, in which the practice of the skill yields progressively smaller decrements in error. I want to test this hypothesis with the data I have, and to do this I need to fit a power law function to the data.
I heard that the power function can be fit in R by `lm(log(y) ~ log(x))`. An answer to this post, however, suggests using `glm(y ~ log(x), family = gaussian(link = log))`, and indeed the resulting fit prefers the `glm` approach (EDIT: See comments. This is not true.).
# Define a function to compute R-squared in linear space
> rsq <- function(data, fit) {
+ ssres <- sum((data - fit)^2)
+ sstot <- sum((data - mean(data))^2)
+ rsq <- 1 - (ssres / sstot)
+ return(rsq)
+ }
# generate data and fit a power function with lm() and glm()
> set.seed(10)
> lc <- (1:10)^(-1/2) * exp(rnorm(10, sd=.25)) # Edited
> model.lm <- lm(log(lc) ~ log(1:10))
> model.glm <- glm(lc ~ log(1:10), family = gaussian(link = log))
> fit.lm <- exp(fitted(model.lm))
> fit.glm <- fitted(model.glm)
# draw a graph with fitted lines
> plot(1:10, lc, xlab = "Practice", ylab = "Error rate", las = 1)
> lines(1:10, fit.lm, col = "red")
> lines(1:10, fit.glm, col = "blue")
> legend("topright", c("original data", "lm(log(y) ~ log(x))",
+ "glm(y ~ log(x), family = gaussian(link = log))"),
+ pch = c(1, NA, NA), lty = c(NA, 1, 1), col = c("black", "red", "blue"))
# compute R-squared values for both models
# (EDIT: See comments. These are not good measures to use.)
> rsq(lc, fit.lm)
[1] 0.9194631
> rsq(lc, fit.glm)
[1] 0.9205448
From the $R^2$ values, it seems that the `glm` model has a better fit (EDIT: See comments. $R^2$ is not a good measure for this purpose). When the procedure was repeated 100 times, the `glm` model always had a higher $R^2$ value than the `lm` model.
At this point I have two questions.
- What is the difference between the `lm` model and the `glm` model above? I thought a generalized linear model with a log link function is practically the same as regressing log(y) in a general linear model.
- I'm trying to compare the fit of the power function against other models (e.g., linear models as in `lm(y ~ x)`). To do this, I calculate the fitted values in linear space and compare the $R^2$ values computed in linear space, just as I did to compare the `lm` and `glm` models above. Is this a proper way to compare models of this type?
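As an illustration of that comparison, here is a minimal sketch (assuming `lc` and the `rsq()` function defined above) that fits a straight line and computes its linear-space $R^2$:

```r
# Fit a linear model to the same data and compute R-squared in linear space,
# so it can be compared against the rsq() values of the power-function fits
x <- 1:10
model.lin <- lm(lc ~ x)
rsq(lc, fitted(model.lin))
```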
My final question concerns the estimation of the four-parameter power law function given in this paper (p.186; I modified the notation a little).
$$
E(Y_N) = A + B(N + E)^{-\beta}
$$
where

- $E(Y_N)$ = expected value of $Y$ on practice trial $N$
- $A$ = expected value of $Y$ after learning is completed (i.e., asymptote)
- $B$ = the scale of the curve (i.e., the amount of change in $Y$ over the course of learning)
- $\beta$ = the learning rate
- $E$ = the amount of prior learning before $N = 0$
I have $N$ and $Y$ (`1:10` and `lc`, respectively, in the code above), and want to estimate $A$, $B$, $E$, and $\beta$ from the data. In the example above, I assumed $A = E = 0$, mostly because I wasn't sure how to estimate four parameters concurrently. While this isn't a huge problem, I would like to estimate $A$ and $E$ from the data as well if possible. So my final question:
- Is there a way to estimate the four parameters in R? If not, is there a way to estimate three parameters, excluding $A$ from the above (i.e., assuming $A$ = 0)?
The paper I linked above says the authors used the simplex algorithm to estimate the parameters, but I am familiar with neither the algorithm nor the `simplex` function in R.
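One common option in R for this kind of problem is nonlinear least squares via `nls()`. A minimal sketch (not from the original post; the starting values are guesses, and the fit may fail to converge for some data or start values), assuming `lc` from the code above:

```r
# Fit the four-parameter power law E(Y_N) = A + B*(N + E)^(-beta) by
# nonlinear least squares. algorithm = "port" allows box constraints,
# used here to keep all four parameters non-negative.
N <- 1:10
fit.nls <- nls(lc ~ A + B * (N + E)^(-beta),
               start = list(A = 0.1, B = 1, E = 0.1, beta = 0.5),
               lower = c(A = 0, B = 0, E = 0, beta = 0),
               algorithm = "port")
coef(fit.nls)  # estimates of A, B, E, and beta
```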
I'm aware of Clauset et al.'s work and the `poweRlaw` package in R. I believe both are dedicated to modeling the distribution of a single variable; neither fits a power-law curve relating two variables.
Any help is appreciated.
Best Answer
You can use AIC/BIC values to compare the model fits to each other instead of R-squared values; the smaller the AIC, the better the fit. It can sometimes be helpful to plot the AIC values as differences from the best-fit model, because the absolute magnitude of AIC is large and arbitrary (I think) and it is the relative difference in AIC that you want to show; a table of the raw AIC values is still helpful as supplementary information.
I can't offer any advice on how to get AIC values in R, as that is what I was looking for when I came across your post.
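For reference, a minimal sketch of one way to do this with R's built-in `AIC()` function (assuming the `model.lm` and `model.glm` objects from the question; note that `model.lm` was fit to `log(lc)` while `model.glm` was fit to `lc`, so their AIC values are not directly comparable without adjusting for the log transformation):

```r
# Extract AIC for each fitted model. Because model.lm was fit to log(lc),
# its AIC is on the log scale; adding the Jacobian term 2*sum(log(lc))
# puts it on the scale of lc before comparing with model.glm.
AIC(model.glm)
AIC(model.lm) + 2 * sum(log(lc))
```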