Solved – Interpreting circular-linear regression coefficient

Tags: circular-statistics, r, regression

I'm trying to use the circular package in R to perform regression of a circular response variable on a linear predictor, and I do not understand the coefficient value I'm getting. I've spent considerable time searching in vain for an explanation that I can understand, so I'm hoping somebody here may be able to help.

Here's an example:

library(circular)

# simulate data
x <- 1:100
set.seed(123)
y <- circular(seq(0, pi, pi/99) + rnorm(100, 0, .1))

# fit model
m <- lm.circular(y, x, type="c-l", init=0)

> coef(m)
[1] 0.02234385

I don't understand this coefficient of 0.02 — I would expect the slope of the regression line to be very close to pi/100, as it is in garden variety linear regression:

> coef(lm(y~x))[2]
         x
0.03198437

Does the circular regression coefficient not represent the change in response angle per unit change in the predictor variable? Perhaps the coefficient needs to be transformed via some link function to be interpretable in radians? Or am I thinking about this all wrong? Thanks for any help you can offer.

Best Answer

See the documentation:

help(lm.circular)

"If type=="c-l" or lm.circular.cl is called directly, this function implements the homoscedastic version of the maximum likelihood regression model proposed by Fisher and Lee (1992). The model assumes that a circular response variable theta has a von Mises distribution with concentration parameter kappa, and mean direction related to a vector of linear predictor variables according to the relationship: mu + 2*atan(beta'*x), where mu and beta are unknown parameters, beta being a vector of regression coefficients. The function uses Green's (1984) iteratively reweighted least squares algorithm to perform the maximum likelihood estimation of kappa, mu, and beta. Standard errors of the estimates of kappa, mu, and beta are estimated via large-sample asymptotic variances using the information matrix. An estimated circular standard error of the estimate of mu is then obtained according to Fisher and Lewis (1983, Example 1)."
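So the coefficient enters through the link 2*atan(beta*x), not linearly, and its effect in radians per unit x depends on x: the derivative of 2*atan(beta*x) is 2*beta/(1 + (beta*x)^2), which is roughly 2*beta near x = 0 and shrinks as beta*x grows. A quick sketch using the coefficient from the question (`slope_at` is just an illustrative helper, not part of the package):

```r
# The Fisher-Lee model relates the mean direction to x via
# mu + 2*atan(beta*x), so beta is not itself a slope in radians.
# The implied slope at a given x is the derivative
# d/dx [2*atan(beta*x)] = 2*beta / (1 + (beta*x)^2).
beta <- 0.02234385  # coefficient reported by lm.circular above

slope_at <- function(x, beta) 2 * beta / (1 + (beta * x)^2)

slope_at(c(0, 1, 50, 100), beta)
# near x = 0 the implied slope is about 2*beta (~0.045);
# it shrinks toward zero as beta*x grows
```

This is why the single number 0.022 cannot be compared directly with the lm slope of 0.032: the circular model's slope in radians varies over the range of x.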

Thus you should compare it with a fit of that model rather than with a straight line:

> nls(y~a+2*atan(b*x),start=c(a=0.06337,b=0.022344),data=list(x=x,y=y))
Nonlinear regression model
  model: y ~ a + 2 * atan(b * x)
   data: list(x = x, y = y)
      a       b 
0.07112 0.02231 
 residual sum-of-squares: 12.36

Number of iterations to convergence: 12 
Achieved convergence tolerance: 5.838e-06

This nls fit does not use the same distribution for the residual terms (it assumes Gaussian rather than von Mises errors), but it recovers essentially the same coefficients.
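Note that the starting values passed to nls above appear to be the lm.circular estimates themselves (b = 0.022344 matches coef(m); a = 0.06337 is presumably m$mu, the fitted mean direction). You can check the correspondence directly by rebuilding the mean curve mu + 2*atan(beta*x) from each set of estimates and comparing them (a sketch using only the values quoted above):

```r
x <- 1:100

# Mean curve implied by the circular fit
# (mu assumed taken from m$mu, beta from coef(m), as quoted above)
circ_curve <- 0.06337 + 2 * atan(0.02234385 * x)

# Mean curve implied by the nls fit
nls_curve <- 0.07112 + 2 * atan(0.02231 * x)

max(abs(circ_curve - nls_curve))
# well under 0.01 radians everywhere: the two fitted curves
# nearly coincide over the observed range of x
```

The parameter estimates differ slightly because the two fits optimize different likelihoods, but the fitted mean directions are practically identical.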


Clearly you simplified your posted problem to make it easier to understand.

Could you add your real case? (to spice up the question)