Linear Regression – Predicting Magnitude from Angle in Linear Regression Models

circular statistics

I have some data where I want to see if the magnitude of an effect depends on the direction. (Analogous to asking if wind speed depends on wind direction, for example.) In the circular package there is lm.circular, which fits regressions involving circular data. However, it models a circular dependent variable, while what I need is for the dependent variable to be 'linear' (i.e. non-circular) and the predictor to be circular.

I've seen one answer on Cross Validated that basically suggests looking at some cardinal directions. This could work, or I could use a bootstrap or confidence-interval approach, but I was wondering if there is an existing test for this type of problem.

Best Answer

Here, we want to predict a linear dependent variable from a circular independent variable. There are several ways to approach this. The main thing to check is whether the relation between your dependent variable (let's say $Y$) and the circular predictor (say $\theta$) has a sinusoidal shape. This is often the case, but not necessarily. Below is an example of simulated data with this shape.

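# Angles in radians, wrapped onto [0, 2*pi)
th  <- rnorm(100, 1, 4) %% (2*pi)
# Residual noise
err <- rnorm(100, mean = 0, sd = 0.8)
# Intercept
icp <- 10

# Coefficients of the cosine and sine components
bc <- 2
bs <- 3

y   <- icp + bc * cos(th) + bs * sin(th) + err

plot(th, y)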

Sinusoidal relationship between $\theta$ and $Y$.
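
To see why a cosine term plus a sine term describes a single sinusoid with an arbitrary peak direction, the simulated model can be rewritten in amplitude-phase form. The symbols $\beta_0$, $\beta_c$, $\beta_s$, $A$ and $\phi$ are notation introduced here for illustration (corresponding to `icp`, `bc` and `bs` in the code); they are not part of the original answer.

$$Y = \beta_0 + \beta_c \cos\theta + \beta_s \sin\theta + \varepsilon = \beta_0 + A \cos(\theta - \phi) + \varepsilon, \qquad A = \sqrt{\beta_c^2 + \beta_s^2}, \qquad \phi = \operatorname{atan2}(\beta_s, \beta_c).$$

With $\beta_c = 2$ and $\beta_s = 3$ as in the simulation, this gives $A = \sqrt{13} \approx 3.61$ and $\phi \approx 0.98$ rad, so $Y$ should peak near $\theta \approx 0.98$.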

If the data roughly have this shape, a good simple model is obtained by splitting the circular predictor $\theta$ into a sine and a cosine component and running a regular linear regression on these two components, in this case by:

lm(y ~ cos(th) + sin(th))

>Call:
>lm(formula = y ~ cos(th) + sin(th))
>
>Coefficients:
>(Intercept)      cos(th)      sin(th)  
>      10.12         2.04         2.95 
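
For the original question of whether the magnitude depends on direction at all, one possible approach is to compare this sinusoidal model against an intercept-only model with an F-test. The following is a minimal sketch under the same simulation settings; the seed and the object names (`fit0`, `fit1`, `cmp`, `amp`, `peak`) are assumptions made for illustration, and the test is only meaningful if the sinusoidal shape is a reasonable description of the data.

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible

# Simulate data as in the example above
th <- rnorm(100, 1, 4) %% (2 * pi)
y  <- 10 + 2 * cos(th) + 3 * sin(th) + rnorm(100, mean = 0, sd = 0.8)

fit0 <- lm(y ~ 1)                  # magnitude does not depend on direction
fit1 <- lm(y ~ cos(th) + sin(th))  # sinusoidal dependence on direction

# F-test of H0: both trigonometric coefficients are zero
cmp <- anova(fit0, fit1)
print(cmp)

# Amplitude and peak direction implied by the fitted coefficients
b    <- coef(fit1)
amp  <- sqrt(b[["cos(th)"]]^2 + b[["sin(th)"]]^2)
peak <- atan2(b[["sin(th)"]], b[["cos(th)"]]) %% (2 * pi)
```

A small p-value in the second row of `cmp` suggests that the response varies with direction; `amp` then estimates how strongly, and `peak` estimates the direction (in radians) at which the response is largest.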

Of course, this can be done for multiple predictors as well. A good introduction to this may be found in Pewsey, Neuhäuser & Ruxton (2013), Circular Statistics in R.

If the relationship is not a simple sinusoid, we may add higher-order terms as in a Fourier regression. This can only be recommended when the relationship clearly departs from the sinusoidal form, because higher-order Fourier regression introduces, IIRC, a large number of difficult-to-interpret parameters.
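
As a rough sketch of what adding one further Fourier term could look like, the example below simulates a hypothetical relationship with a second harmonic and uses a nested-model F-test to check whether the extra terms are worth their added parameters. The data-generating values, the seed and the names `fit1`, `fit2` and `cmp` are assumptions chosen for illustration only.

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible

th <- runif(200, 0, 2 * pi)
# Hypothetical relationship that includes a second harmonic
y  <- 10 + 2 * cos(th) + 3 * sin(th) + 1.5 * cos(2 * th) +
  rnorm(200, mean = 0, sd = 0.8)

# First-order (plain sinusoidal) model
fit1 <- lm(y ~ cos(th) + sin(th))
# Second-order model: adds cos(2*th) and sin(2*th)
fit2 <- lm(y ~ cos(th) + sin(th) + cos(2 * th) + sin(2 * th))

# Nested-model F-test: do the second-order terms improve the fit?
cmp <- anova(fit1, fit2)
print(cmp)
```

Each additional harmonic costs two more coefficients, which is why it seems sensible to add them only when a comparison like this, or a plot of the residuals against `th`, indicates that the simple sinusoid is inadequate.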
