Modeling mortgage prepayment behaviour using survival analysis

cox-model, recurrent-events, survival

I am studying prepayment behaviour on mortgage loans where partial prepayments (curtailments) are very frequent and more or less normal behaviour, as there are no penalties for them.

As a result of partial prepayments, the final maturity of the contract can change (though it does not have to). Because partial prepayments are recurrent events, this poses a problem: the maturity should in principle decrease as more partial prepayments occur. So the number of recurrent events increases with time, while the final maturity of the contract decreases as a result of the prepayments.

The event of interest is the time to termination of the contract or at least the time to a decrease in maturity of the contract (if the contract is still not closed).

So we face an increasing number of prepayment events but a decreasing time to maturity. Hence the time to event itself shrinks as time passes, which is technically problematic. So I thought of using only the last change of maturity, conditional on the event actually happening. If there is no event, the observation is censored.

I am concerned with the following:
1. Does this violate the general premises of survival analysis?
2. How should censored observations be treated, given that they will necessarily have a shorter observed time than the uncensored ones? This makes it hard to define the risk set, whose hazard rate will inevitably be zero at the beginning and will rise only gradually much later on.

I would very much appreciate any suggestions on how else to tackle this type of problem.

Best Answer

I suggest looking into models for recurrent events with a dependent terminal event. The terminal event (termination of the contract) acts as dependent censoring for the recurrent-event process, so the usual assumption of "independent censoring" does not hold.
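To make the setup concrete, here is a minimal sketch of how one loan's history could be laid out for such a model, in the counting-process style of one row per (start, stop] interval with a flag for the recurrent event and a flag for the terminal event. The names `Interval`, `to_intervals`, `prepay` and `terminated` are made up for illustration, and the sketch assumes distinct prepayment times that fall strictly before the end of follow-up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interval:
    loan_id: str
    start: float     # months since origination at which the interval opens
    stop: float      # months since origination at which the interval closes
    prepay: int      # 1 if a partial prepayment occurs at `stop`, else 0
    terminated: int  # 1 if the contract terminates at `stop`, else 0


def to_intervals(loan_id: str, prepay_times: List[float],
                 end_time: float, terminated: bool) -> List[Interval]:
    """Split one loan's history into consecutive (start, stop] intervals.

    Each partial prepayment closes an interval with prepay=1. The last
    interval runs to the end of follow-up and carries the terminal flag;
    terminated=False means the loan was still open (censored) at end_time.
    """
    rows = []
    start = 0.0
    for t in sorted(prepay_times):
        # Hypothetical simplification: no ties, nothing at or after end_time.
        if not start < t < end_time:
            raise ValueError("prepayment times must be distinct and inside (0, end_time)")
        rows.append(Interval(loan_id, start, t, 1, 0))
        start = t
    rows.append(Interval(loan_id, start, end_time, 0, int(terminated)))
    return rows


if __name__ == "__main__":
    # Loan A: three curtailments, contract closed at month 48.
    # Loan B: one curtailment, still open when observation stops at month 36.
    for row in to_intervals("A", [6, 14, 30], 48, True) + to_intervals("B", [10], 36, False):
        print(row)
```

In this layout the clock keeps running from origination across all intervals, so repeated prepayments no longer make the "time to event" shrink; a loan that is still open simply ends with `terminated = 0`.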

The basic idea in tackling this is to use random effects (frailty). The recurrent events then have an intensity (hazard) that depends on a subject-specific random effect $z_i$:

$$ h_i(t|z_i) = z_i h_0(t) \exp(\beta'x_i) $$

and the terminal event has hazard

$$ \lambda_i(t|z_i) = z_i^\alpha \lambda_0(t) \exp(\gamma'x_i) $$

where $h_0$ and $\lambda_0$ are baseline hazards, $x_i$ are covariates, and $\beta$ and $\gamma$ are regression coefficients. The exponent $\alpha$ links the two processes: when $\alpha<0$, a high rate of recurrent events is associated with a lower rate of terminal events; when $\alpha>0$, a high rate of recurrent events is associated with a higher rate of terminal events; and when $\alpha=0$, the terminal hazard does not depend on $z_i$ at all.
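To get a feel for what the sign of $\alpha$ does, the two hazards can be simulated directly. The sketch below is only an illustration under assumptions that are not part of the answer: a gamma frailty with mean 1 and variance `theta`, constant baseline hazards `h0` and `lam0`, no covariates, and administrative censoring at `tau` months. All function names and parameter values are invented for the example.

```python
import random


def simulate_loan(rng, alpha, theta=0.5, h0=0.2, lam0=0.02, tau=120.0):
    """Simulate one loan under the joint frailty model (illustrative values)."""
    # Assumed frailty: gamma with shape 1/theta and scale theta (mean 1, variance theta).
    z = rng.gammavariate(1.0 / theta, theta)
    # Terminal event time with constant hazard lam0 * z**alpha.
    d = rng.expovariate(lam0 * z ** alpha)
    end = min(d, tau)  # follow-up stops at termination or at tau
    # Partial prepayments: Poisson process with constant rate h0 * z on (0, end].
    n_prepay = 0
    t = rng.expovariate(h0 * z)
    while t <= end:
        n_prepay += 1
        t += rng.expovariate(h0 * z)
    return z, n_prepay, end, d <= tau


def summarize(alpha, n=4000, seed=1):
    """Events per month of exposure, for the low- and high-frailty halves."""
    rng = random.Random(seed)
    loans = sorted((simulate_loan(rng, alpha) for _ in range(n)), key=lambda r: r[0])
    halves = {"low_z": loans[: n // 2], "high_z": loans[n // 2:]}
    out = {}
    for name, group in halves.items():
        exposure = sum(r[2] for r in group)
        prepay_rate = sum(r[1] for r in group) / exposure
        terminal_rate = sum(r[3] for r in group) / exposure
        out[name] = (prepay_rate, terminal_rate)
    return out


if __name__ == "__main__":
    for alpha in (-1.0, 0.0, 1.0):
        print(alpha, summarize(alpha))
```

The loans are split by the simulated frailty, which is unobserved in real data, purely to make the dependence visible. The high-frailty half always prepays more often; its termination rate should come out higher than the low-frailty half when `alpha` is positive and lower when `alpha` is negative.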

This is implemented in the R package frailtypack, which is available on CRAN. There you can specify $h_0$ and $\lambda_0$ in several forms (splines, constant, Weibull, etc). Perhaps this might give you a starting point.

For reading, a good starting point would be Section 6.6 of The Statistical Analysis of Recurrent Events by Cook and Lawless. Good luck!
