Solved – Predict sales levels with decision trees

cart forecasting r regression seasonality

I need to build a model that uses climate variables (temperature, rainfall) to predict monthly sales of a certain product over a six-month horizon. The data has strong seasonality, and a standard regression model would work fine. The problem is that the historical data will not be updated, meaning that newly observed data points will not be incorporated into the model.

What's a good way to solve this? What if I split the sales data into levels (say 'WEAK', 'NORMAL', 'HIGH', 'VERY HIGH') and then use a regression tree? Is there any 'danger' in doing this?

For a standard regression model, how do I deal with the seasonality if the new points will not be incorporated?

I'm using R, thanks!

Best Answer

Given that you have monthly data, you can model the seasonality using dummy variables, e.g.:

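# Simulated example: 48 months of random sales, with month stored as a factor
foo <- data.frame(sales = rnorm(48, 10, 10),
                  month = factor(rep(month.abb, 4), levels = month.abb))
# lm() expands the month factor into dummy variables (January is the baseline)
model <- lm(sales ~ month, data = foo)
# Predicted sales for a future December
predict(model, newdata = data.frame(month = "Dec"))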
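
To connect this to the six-month horizon in the question, here is a minimal sketch that reuses the same kind of month-dummy model to predict the next six months in one call. It assumes the (simulated) history ends in December, so the next six months are January to June; the object names `next_six` and `pred` are made up for illustration. Because the coefficients are fixed once the model is fitted, the forecasts do not depend on any new observations being fed back in.

```r
set.seed(123)  # make the simulated data reproducible

# Simulated history: 4 years of monthly sales (random, for illustration only)
foo <- data.frame(sales = rnorm(48, 10, 10),
                  month = factor(rep(month.abb, 4), levels = month.abb))
model <- lm(sales ~ month, data = foo)

# Assumption: the history ends in December, so the horizon is Jan to Jun
next_six <- data.frame(month = factor(month.abb[1:6], levels = month.abb))

# Point forecasts with 95% prediction intervals (columns fit, lwr, upr)
pred <- predict(model, newdata = next_six, interval = "prediction")
print(cbind(next_six, pred))
```

With a pure month-dummy model each forecast is simply the historical mean for that calendar month, so the prediction intervals do not widen as the horizon grows.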

However, a more common approach would be to use seasonal exponential smoothing. See, e.g., here.
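
As a rough illustration of seasonal exponential smoothing, the sketch below uses `HoltWinters()` from base R's `stats` package on simulated data. The series, its start date, and the choice of an additive seasonal component without a trend (`beta = FALSE`) are all assumptions made for the example, not something taken from the question.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated monthly sales with a yearly cycle (made-up data)
t_idx <- 1:48
sales <- 100 + 20 * sin(2 * pi * t_idx / 12) + rnorm(48, sd = 5)
sales_ts <- ts(sales, start = c(2010, 1), frequency = 12)

# Additive seasonal exponential smoothing; beta = FALSE drops the trend term
fit <- HoltWinters(sales_ts, beta = FALSE, seasonal = "additive")

# Forecast the next six months from the end of the series
fc <- predict(fit, n.ahead = 6)
print(fc)
```

The smoothing parameters are estimated once from the history, so the six-month forecast can be produced without feeding in new observations, although the forecasts would normally improve if the model were refitted as data arrive.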

And I think that your weather data will be completely useless: with monthly data, temperature and rainfall will be collinear with the month dummies, and weather simply varies too much within months. In particular, a six-month-ahead sales forecast would also need six-month-ahead forecasts of temperature and rainfall, and those are unlikely to be any better than "it will be June, so it will probably be warmer than today".
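
One way to check the collinearity point on your own data is to regress the weather variable on the month dummies and look at the R². The sketch below does this on a simulated temperature series; the seasonal shape and the noise level are assumptions chosen for illustration, so the numbers say nothing about real weather.

```r
set.seed(42)  # make the simulated data reproducible

month <- factor(rep(month.abb, 4), levels = month.abb)

# Hypothetical monthly mean temperature: a yearly cycle plus small noise
temp <- 15 + 10 * sin(2 * pi * (as.integer(month) - 4) / 12) +
  rnorm(48, sd = 1.5)

# Share of the temperature variance explained by the month dummies alone
r2_temp <- summary(lm(temp ~ month))$r.squared

# Variance inflation factor for temp in a model that also has month dummies
vif_temp <- 1 / (1 - r2_temp)

print(c(r2 = r2_temp, vif = vif_temp))
```

An R² close to 1 means the month dummies already carry almost everything the temperature variable could tell the model, which leaves little independent variation for estimating a separate temperature effect.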
