Solved – Interpreting circular-linear regression coefficient

Tags: circular-statistics, r, regression

I'm trying to use the circular package in R to perform regression of a circular response variable on a linear predictor, and I do not understand the coefficient value I'm getting. I've spent considerable time searching in vain for an explanation that I can understand, so I'm hoping somebody here may be able to help.

Here's an example:

library(circular)

# simulate data
x <- 1:100
set.seed(123)
y <- circular(seq(0, pi, pi/99) + rnorm(100, 0, .1))

# fit model
m <- lm.circular(y, x, type="c-l", init=0)

> coef(m)
[1] 0.02234385

I don't understand this coefficient of 0.02 — I would expect the slope of the regression line to be very close to pi/100, as it is in garden variety linear regression:

> coef(lm(y~x))[2]
         x
0.03198437

Does the circular regression coefficient not represent the change in response angle per unit change in the predictor variable? Perhaps the coefficient needs to be transformed via some link function to be interpretable in radians? Or am I thinking about this all wrong? Thanks for any help you can offer.

Best Answer

See the documentation:

help(lm.circular)

"If type=="c-l" or lm.circular.cl is called directly, this function implements the homoscedastic version of the maximum likelihood regression model proposed by Fisher and Lee (1992). The model assumes that a circular response variable theta has a von Mises distribution with concentration parameter kappa, and mean direction related to a vector of linear predictor variables according to the relationship: mu + 2*atan(beta'*x), where mu and beta are unknown parameters, beta being a vector of regression coefficients. The function uses Green's (1984) iteratively reweighted least squares algorithm to perform the maximum likelihood estimation of kappa, mu, and beta. Standard errors of the estimates of kappa, mu, and beta are estimated via large-sample asymptotic variances using the information matrix. An estimated circular standard error of the estimate of mu is then obtained according to Fisher and Lewis (1983, Example 1)."
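So the coefficient enters through the link 2*atan(beta*x), not linearly, and its effect in radians per unit x depends on x: the derivative of 2*atan(beta*x) is 2*beta/(1 + (beta*x)^2), which is roughly 2*beta near x = 0 and shrinks as beta*x grows. A quick sketch using the coefficient from the question (`slope_at` is just an illustrative helper, not part of the package):

```r
# The Fisher-Lee model relates the mean direction to x via
# mu + 2*atan(beta*x), so beta is not itself a slope in radians.
# The implied slope at a given x is the derivative
# d/dx [2*atan(beta*x)] = 2*beta / (1 + (beta*x)^2).
beta <- 0.02234385  # coefficient reported by lm.circular above

slope_at <- function(x, beta) 2 * beta / (1 + (beta * x)^2)

slope_at(c(0, 1, 50, 100), beta)
# near x = 0 the implied slope is about 2*beta (~0.045);
# it shrinks toward zero as beta*x grows
```

This is why the single number 0.022 cannot be compared directly with the lm slope of 0.032: the circular model's slope in radians varies over the range of x.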

Thus you should compare it with a fit of that model rather than with a straight line:

> nls(y~a+2*atan(b*x),start=c(a=0.06337,b=0.022344),data=list(x=x,y=y))
Nonlinear regression model
  model: y ~ a + 2 * atan(b * x)
   data: list(x = x, y = y)
      a       b 
0.07112 0.02231 
 residual sum-of-squares: 12.36

Number of iterations to convergence: 12 
Achieved convergence tolerance: 5.838e-06

This nls fit does not use the same distribution for the residual terms (it assumes Gaussian rather than von Mises errors), but it recovers essentially the same coefficients.
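Note that the starting values passed to nls above appear to be the lm.circular estimates themselves (b = 0.022344 matches coef(m); a = 0.06337 is presumably m$mu, the fitted mean direction). You can check the correspondence directly by rebuilding the mean curve mu + 2*atan(beta*x) from each set of estimates and comparing them (a sketch using only the values quoted above):

```r
x <- 1:100

# Mean curve implied by the circular fit
# (mu assumed taken from m$mu, beta from coef(m), as quoted above)
circ_curve <- 0.06337 + 2 * atan(0.02234385 * x)

# Mean curve implied by the nls fit
nls_curve <- 0.07112 + 2 * atan(0.02231 * x)

max(abs(circ_curve - nls_curve))
# well under 0.01 radians everywhere: the two fitted curves
# nearly coincide over the observed range of x
```

The parameter estimates differ slightly because the two fits optimize different likelihoods, but the fitted mean directions are practically identical.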


Clearly you simplified your posted problem to make it easier to understand.

Could you add your real case? (to spice up the question)