Adding additional terms to a piecewise regression

piecewise linear, r, regression, segmented regression

I am exploring the probability of flight in a seabird (1 = flight, 0 = no flight) using binomial logistic regression. My predictors are distance to a disturbance (continuous), hour of the day (continuous), site (factor), season (factor), sea state (dichotomous), and group size (dichotomous). I have explored the use of piecewise regression in relation to the distance to a disturbance, as this variable spans a large range (out to 74 km) and it is implausible that the disturbance affects flight at the largest distances.

When the model is fit with distance to a disturbance as the only predictor, the R package 'segmented' identifies a breakpoint at 3.9 km. The slope up to this distance is negative and statistically significant, while the slope beyond 3.9 km is estimated to be 0 and is non-significant.

I would now like to add further terms to the model sequentially, to see whether each one reduces the deviance. Can a term be added just to the section before or after the break? I cannot seem to find any information in the literature regarding this.

My question is: can I do this? Or do I need to split the data into two chunks, before and after the breakpoint, and explore additional terms that way?

Also, the motivation for this analysis is mainly to find and identify the breakpoint. Instead of adding in terms after I assess the breakpoint, should I explore the breakpoint within the model including all the terms? Would this find the break in the data in relation to the other terms, or does the algorithm completely ignore the other terms in the model when searching for a break in the distance to disturbance variable?

Thanks,

Best Answer

Can a term be added just to the section before or after the break?

Sure. You just use an interaction term with the dummy variable for the break (0 before the break, 1 after the break): variable*dummy. But are you sure you want to do this? You have the domain knowledge, but it is rather like including an interaction without the main effect. Most would use variable + variable*dummy. This does make interpretation harder, but it usually seems reasonable to assume there is some effect before the breakpoint.
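To make the dummy-variable idea concrete, here is a minimal Python sketch of the columns involved for a single observation. The helper `piecewise_columns` and the column names are hypothetical and not part of any package; the 3.9 km breakpoint comes from the question, and the covariate value (an hour of the day) is made up for illustration.

```python
def piecewise_columns(distance_km, covariate, breakpoint_km):
    """Build segment-specific columns for one observation.

    Hypothetical helper for illustration; it is not part of any package.
    """
    # Dummy for the break: 0 at or before the breakpoint, 1 after it.
    after = 1 if distance_km > breakpoint_km else 0
    return {
        "covariate": covariate,                         # main effect, both segments
        "covariate_x_after": covariate * after,         # extra effect after the break only
        "covariate_x_before": covariate * (1 - after),  # effect before the break only
    }


# Breakpoint of 3.9 km is taken from the question; the covariate value is made up.
near = piecewise_columns(distance_km=2.0, covariate=14.0, breakpoint_km=3.9)
far = piecewise_columns(distance_km=10.0, covariate=14.0, breakpoint_km=3.9)
print(near)
print(far)
```

Note that the three columns are collinear (the main effect equals the sum of the other two), so a model would use either the main effect plus one interaction column, or a single segment-specific column on its own.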

Instead of adding in terms after I assess the breakpoint, should I explore the breakpoint within the model including all the terms?

As far as I remember, the segmented package allows you to do both. In my mind, both are reasonable. If it is easy, I would prefer the latter, but it often doesn't matter.
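For reference, a model with one breakpoint in distance and the other terms included can be written as below. The notation is my own assumption rather than anything taken from the package documentation: d is distance to the disturbance, ψ is the breakpoint, and z collects the other predictors (hour, site, season, sea state, group size).

```latex
\operatorname{logit} P(\text{flight} = 1)
  = \beta_0 + \beta_1 d + \beta_2 (d - \psi)_+ + \mathbf{z}^\top \boldsymbol{\gamma},
\qquad (d - \psi)_+ = \max(0,\, d - \psi)
```

The slope is β1 before the break and β1 + β2 after it. When ψ is estimated within this model, the break describes the distance relationship after adjusting for the terms in z, so it need not coincide exactly with the 3.9 km found using distance alone.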

I have explored the use of piecewise regression in relation to the distance to a disturbance

I like piecewise regression. But if you're trying to incorporate non-linearity in your model, some prefer generalized additive models (GAMs) or restricted cubic splines. If this is for publication, I'd look at what the standard is in your field.
