Solved – Mix of categorical and continuous data in neural network

machine learning, neural networks, time series

Given a shallow or deep neural network, how would one go about using both continuous numerical input features and categorical features?

For example, given a network that receives a set of 100 continuous numerical values between 0 and 1 representing monetary value, how would I also include a time component? I would suspect one would have to discretize/translate intraday hours and minutes, e.g. 21:35 into bins of say 1 hour. This would yield a one-hot vector that I would then append to my input data that flows into the network. Would this be a valid approach?

Best Answer

Barring exotic cases, NNs operate on floats. Period. Big floats, small floats, 16/32/64 bit floats - but floats all the same. So yes, you do have to encode your data somehow to floats - how you do that will depend on what you're trying to do.
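To make the one-hot approach from the question concrete, here is a minimal sketch in plain Python. The function names (`one_hot_hour`, `build_input`) and the choice of 24 one-hour bins are assumptions for illustration only; the 100 continuous values are placeholders.

```python
def one_hot_hour(time_str, n_bins=24):
    """Turn an 'HH:MM' string into a one-hot list of floats (hypothetical helper)."""
    hours, minutes = map(int, time_str.split(":"))
    # Map minutes since midnight onto n_bins equal-width bins.
    bin_index = (hours * 60 + minutes) * n_bins // (24 * 60)
    vec = [0.0] * n_bins
    vec[bin_index] = 1.0
    return vec


def build_input(continuous_values, time_str):
    """Concatenate the continuous features with the one-hot time vector."""
    return [float(v) for v in continuous_values] + one_hot_hour(time_str)


# 100 placeholder monetary values in [0, 1] plus the time 21:35.
features = build_input([0.25] * 100, "21:35")
```

With one-hour bins, the resulting input has 124 entries: the 100 continuous values followed by 24 time slots, of which only the slot for hour 21 is set.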

Now, for time, I can think of two use cases.

The first is that you want to account for the time of day itself but don't care about higher-level trends (day on day, week on week, etc.); that is, you assume those stay the same.

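In this case, you can get a perfectly continuous encoding by passing the time of day through a periodic function, like a sine over the fraction of the day elapsed, where 0.5 is noon, or whatever. Note that a single sine maps two different times of day to the same value, so a sine and cosine pair is commonly used to keep every time distinct.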
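A minimal sketch of such a periodic encoding, assuming an 'HH:MM' input string and a sine/cosine pair (the function name `encode_time_of_day` is hypothetical, and the pair is one common choice rather than the only option):

```python
import math


def encode_time_of_day(time_str):
    """Encode 'HH:MM' as a point on the unit circle (hypothetical helper)."""
    hours, minutes = map(int, time_str.split(":"))
    # Fraction of the day elapsed: 0.0 is midnight, 0.5 is noon.
    fraction = (hours * 60 + minutes) / (24 * 60)
    angle = 2 * math.pi * fraction
    return math.sin(angle), math.cos(angle)


late = encode_time_of_day("23:59")
early = encode_time_of_day("00:01")
# Distance between two times that are two minutes apart across midnight.
gap = math.dist(late, early)
```

The point of the circular encoding is that 23:59 and 00:01 end up close together, whereas a raw hour value or a one-hot bin would place them at opposite ends of the range.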

The second is that you're doing time series modelling. In this case you want a sliding window over all time steps, which is exactly what 1D convolutions give you when the dimension in question is time.
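To illustrate the sliding-window idea, here is a rough plain-Python sketch of what a single 1D convolution filter computes over a time series. The function name `conv1d` and the example values are assumptions; in a real network the kernel weights would be learned, and a framework layer would handle batching, channels, and padding.

```python
def conv1d(series, kernel):
    """Slide the kernel over the series and take a weighted sum at each step.

    This is the unflipped form (cross-correlation) that deep learning
    frameworks usually call a convolution; no padding, stride 1.
    """
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


# A window of two time steps with equal weights acts as a moving average.
prices = [1.0, 2.0, 3.0, 4.0]
smoothed = conv1d(prices, [0.5, 0.5])
```

Each output value depends only on a fixed-size window of consecutive time steps, and the same weights are reused at every position along the series.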
