I'm replicating the following article, Financial Time Series Prediction using Deep Learning, and I'm stuck with data normalization. In chapter 5.1, in the second paragraph, the last sentence claims: "Each input sequence was filtered by five taps long, moving uniform averaging, and then normalized by reducing the mean, and dividing by its standard deviation"
I have several questions, specifically under section [5.1] on page 10:
1) What do they mean by "Each input sequence was filtered by five taps long, moving uniform averaging"? I do not completely understand this.
2) "…then normalized by reducing the mean, and dividing by its standard deviation". How do they normalize nonstationary, trending price data? They train an ANN on SPY ETF minute prices over the 2001-2013 period and use 60 lags to predict price trends, so how do they compute the mean and std? My guess is that they compute the mean and std for each sample (i.e., over its 60 lags) and then normalize each sample sequence individually. If that's how they do it, then how should the test data be normalized?
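To make my interpretation concrete, here is a small sketch of what I think they might be doing (this is my assumption, not something the paper confirms): smooth each 60-lag window with a 5-tap uniform moving average, then z-score the window with its own mean and std, so a test window is treated exactly the same way without needing any training-set statistics.

```python
import numpy as np

def preprocess_window(prices):
    """Smooth a price window with a 5-tap uniform moving average,
    then normalize it with its own mean and standard deviation.
    This is my guess at the paper's preprocessing, not a confirmed recipe."""
    prices = np.asarray(prices, dtype=float)
    kernel = np.ones(5) / 5.0                             # five taps, uniform weights
    smoothed = np.convolve(prices, kernel, mode="valid")  # length shrinks by 4
    return (smoothed - smoothed.mean()) / smoothed.std()

window = np.linspace(100.0, 101.0, 60)  # fake 60-lag minute prices, purely illustrative
out = preprocess_window(window)
print(out.shape)                        # (56,)
```

Under this reading, the "how to normalize test data" question goes away, since every window (train or test) carries its own normalization constants.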
Best Answer
From what I can gather, they're taking each day's data (~510 minutes), dropping just about everything that doesn't conform, then chunking each day up into roughly 300 units per day's trading hours (~1.7 minutes of data per sample), and retaining the closing price from the last 60 minutes, all before the quoted line... context can be helpful... so if things are adding up, then maybe a tap is about 102 seconds; if so, five of 'em would be about 8.5 minutes of data.

However, I could be totally wrong... I've tried to parse that paragraph and can totally see how, by the time the author is up to the quoted line, it's very much a dash of this or that. Not to pinch too hard at their ego, but considering how well referenced the surrounding text is, it's totally understandable to be a bit lost with that sentence.
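The back-of-envelope arithmetic above can be sanity-checked in a few lines (the ~510 minutes and ~300 units are my guesses from the paper's setup, not figures it states outright):

```python
# If ~510 minutes of trading data get chunked into ~300 samples,
# each sample spans ~1.7 minutes, one "tap" is ~102 seconds,
# and a 5-tap filter covers ~8.5 minutes. All inputs are assumptions.
day_minutes = 510                                      # assumed span of one day's data
samples_per_day = 300                                  # assumed chunk count
minutes_per_sample = day_minutes / samples_per_day     # 1.7
tap_seconds = day_minutes * 60 / samples_per_day       # 102.0
five_taps_minutes = 5 * day_minutes / samples_per_day  # 8.5
print(minutes_per_sample, tap_seconds, five_taps_minutes)
```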
It may also be worth reading up on the standard error (a.k.a. the standard deviation of the sampling distribution of the sample mean!), which may help in sorting out what the authors were alluding to.

As to how they are normalizing their test data: it's stated more than once in the linked-to paper that they are using raw inputs to a network that outputs probability of movement, direction, and magnitude... well, I think that's what they were getting at... They get deeper into the pre-processing in section [3.3] Preprocessing on page 6, where this network's architecture is described in more detail; also see figures 1 and 2 on that page, as those show how the data flows through.

I may have to come back to this and take a second crack at dissecting their research, but hopefully some of this was able to gain ya some traction.