Categorical Data – Is Hour of Day a Categorical Variable?

categorical data, circular statistics

Is "hour of the day" where the value can be 0, 1, 2, …, 23 a categorical variable? I would be tempted to say no, since 5, for example, is 'closer' to 4 or 6 than it is to 3 or 7.

On the other hand, there is the discontinuity between 23 and 0.

So is it generally considered categorical or not? Note that 'hour' is one of the independent variables, not the variable I'm trying to predict.

Best Answer

Depending on what you want to model, hours (and many other attributes, such as seasons) are actually ordinal cyclic variables. Seasons can be treated as more or less categorical, while hours can also be modelled as continuous.

However, feeding hours into your model in a form that ignores their cyclicity will not be fruitful. Instead, apply a transformation that encodes the cycle. For hours, one option is a trigonometric encoding:

xhr = sin(2*pi*hr/24)
yhr = cos(2*pi*hr/24)

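Thus you would use xhr and yhr for modelling instead of hr. Both are needed, because the sine alone maps distinct hours (for example 1 and 11) to the same value. See this post for example: Use of circular predictors in linear regression.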
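To make the effect of the transformation concrete, here is a minimal Python sketch. The helper names `encode_hour` and `circle_distance` are made up for this illustration and are not from any library. It checks that, in the (xhr, yhr) feature space, 23 and 0 end up exactly as close as 4 and 5, so the discontinuity at midnight disappears.

```python
import math

def encode_hour(hr, period=24):
    """Map an hour onto the unit circle as (xhr, yhr) = (sin, cos)."""
    angle = 2 * math.pi * hr / period
    return math.sin(angle), math.cos(angle)

def circle_distance(h1, h2):
    """Euclidean distance between two hours in the (xhr, yhr) space."""
    x1, y1 = encode_hour(h1)
    x2, y2 = encode_hour(h2)
    return math.hypot(x1 - x2, y1 - y2)

# Hours that are adjacent on the clock are equally close,
# including the pair that wraps around midnight.
d_wrap = circle_distance(23, 0)
d_adjacent = circle_distance(4, 5)

# Hours on opposite sides of the clock are maximally far apart.
d_opposite = circle_distance(0, 12)

print(d_wrap, d_adjacent, d_opposite)
```

In a real model you would add the two encoded columns as predictors in place of the raw hour. Whether that is enough depends on the shape of the daily pattern; a single sine/cosine pair can only represent one smooth peak and one trough per day.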