Using ANNs
If the inputs are all of the same length, you don't need an RNN. You can simply train an ANN with a fixed input length, which is easier and faster.
If you fit the ANN on (many) samples of weight, height, and past performance at time $t_i$, and the data of a given horse $h_k$ later changes at time $t_{i+1}$, you can just run the new data through the ANN and get a prediction. This works because the structure of the data is still the same, i.e. weight, height, etc.
Using RNNs
An RNN assumes a time series as input, and the data of a horse is not a time series: there is, for example, no temporal relation between the weight and the past performance. There could, however, be a causal relation.
You could still train it on the result list, since that is a time series. The input could then be, e.g., the past $n$ performances of a given horse $h_i$, and the output its expected performance in the next race:
$\text{input} = \{t_0, t_1, t_2, t_3, ..., t_{n-1}, t_n\}$
$\text{output} = \{t_{n+1}\}$
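A minimal sketch of how such input/output pairs could be cut out of one horse's result list. The function name `make_windows`, the window length `n`, and the finishing times are assumptions made for illustration, not something from the question:

```python
def make_windows(series, n):
    """Split a series into (past n values, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - n):
        inputs.append(series[start:start + n])  # n consecutive past results
        targets.append(series[start + n])       # the result that follows them
    return inputs, targets


# Hypothetical finishing times (seconds) of one horse, oldest first.
finish_times = [61.2, 60.8, 62.1, 59.9, 60.4, 60.0]

X, y = make_windows(finish_times, n=3)
for past, nxt in zip(X, y):
    print(past, "->", nxt)
```

Each row of `X` is one input sequence for the RNN and the matching entry of `y` is the value it should learn to predict.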
Getting multidimensional output
So for your example
INPUT(race.length, race.condition, ..., horse1, horse2, horse3)
OUTPUT(horse1.time, horse2.time, horse3.time)
I don't think you'd have to change much. The ANN now has 3 output nodes, and you train it the same way as the one above. If you have a set of input/output pairs, you can use those directly for training.
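To make the "3 output nodes" concrete, here is a rough sketch of the forward pass of a small network with one hidden layer, written in plain Python. The layer sizes, the random untrained weights, and the numeric encoding of the race are all assumptions; it only illustrates that one input vector yields one predicted time per horse, not how well a trained model would do:

```python
import random


def dense(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_inputs, n_hidden, n_horses = 5, 8, 3  # assumed sizes
w1, b1 = init_layer(n_inputs, n_hidden, rng)
w2, b2 = init_layer(n_hidden, n_horses, rng)


def predict(features):
    hidden = relu(dense(features, w1, b1))
    return dense(hidden, w2, b2)  # linear output layer, one node per horse


# Hypothetical encoding of (race.length, race.condition, horse1, horse2, horse3).
race = [1.2, 0.0, 0.3, 0.7, 0.5]
times = predict(race)
print(len(times))  # one value per horse
```

Training would then adjust `w1`, `b1`, `w2`, `b2` so that the three outputs match the recorded times, for example by minimising the mean squared error over all three nodes.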
Combining ANN and RNN
You could also combine both methods. First you'd train the ANN on the horse data and the RNN on the performance data.
Then you could e.g. add the output of your RNN (prediction of the next performance based on the past performances) as input of the ANN together with the current data of the horse.
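The wiring between the two models can be sketched as follows. Here `mean_of_past` is only a stand-in for a trained RNN, and the feature values are made up; the point is just that the RNN's prediction becomes one more entry in the ANN's input vector:

```python
def build_ann_input(horse_features, past_performances, rnn_predict):
    """Append the RNN's next-performance prediction to the static horse data."""
    predicted_next = rnn_predict(past_performances)
    return list(horse_features) + [predicted_next]


def mean_of_past(performances):
    """Placeholder for a trained RNN: just averages the past results."""
    return sum(performances) / len(performances)


horse = [480.0, 1.6]       # hypothetical weight (kg) and height (m)
past = [60.0, 61.0, 62.0]  # hypothetical past finishing times (s)

ann_input = build_ann_input(horse, past, mean_of_past)
print(ann_input)  # [480.0, 1.6, 61.0]
```

Note that the ANN then has one more input node than before, so it has to be trained on vectors built this way.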
I also had this question before. At a high level, in (samples, time steps, features):
samples is the number of sequences in your data set, i.e. how many rows it has;
time steps is the number of sequential steps in each sample, which are fed to the model (e.g. an LSTM) one after another;
features is the number of values (columns) observed at each time step.
For me, a better example for understanding this comes from NLP. Suppose you have one sentence to process. Then samples is 1, meaning there is 1 sentence to read. time steps is the number of words in that sentence: you feed the sentence in word by word until the model has read all the words and has the whole context of the sentence. features is the dimension of each word, because in word embeddings such as word2vec or GloVe each word is represented by a vector with multiple dimensions.
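The sentence example can be written out with nested lists. The 4-dimensional vectors below are invented for illustration (real word2vec or GloVe vectors are much longer), and `shape` is a small helper defined here, not a library function:

```python
def shape(nested):
    """Return the dimensions of a rectangular, non-empty nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical 4-dimensional embeddings for three words.
embeddings = {
    "horses": [0.1, 0.3, -0.2, 0.5],
    "run": [0.7, -0.1, 0.0, 0.2],
    "fast": [-0.4, 0.6, 0.1, 0.9],
}

sentence = ["horses", "run", "fast"]

# One sample (the sentence), one time step per word, one feature per embedding dimension.
batch = [[embeddings[word] for word in sentence]]
print(shape(batch))  # (1, 3, 4) = (samples, time steps, features)
```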
The input_shape parameter in Keras is only (time_steps, num_features); the samples (batch) dimension is left out. For more, you can refer to this.
That's basically how I understand it; I hope this makes it clear for you.
Best Answer
As you have mentioned, RNNs and LSTMs are meant to capture time dependency in time-series data. Thus, feeding in an input with only one time step does not make sense (unless one is using a stateful LSTM, which is a different story).
Here is an example: we have a product and we want to forecast its sales from historical data. We then choose the number of time steps on which each prediction is based, for instance: given 7 days of sales, predict the sales of the 8th day. Thus, the input would be of shape (N, ts, 1), with ts = 7 here, and the output would be of shape (N, 1). N is the total number of samples you have (so each sample has 7 days of sales as input and the sales of the next day as output). I am not sure which tutorials you are referring to, but this one might have a better example.
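As a sketch of how the sales series could be arranged into those shapes, assuming a single feature (the daily sales figure) and made-up numbers; `make_supervised` is a hypothetical helper name:

```python
def make_supervised(series, ts):
    """Build inputs of shape (N, ts, 1) and targets of shape (N, 1) as nested lists."""
    X, y = [], []
    for start in range(len(series) - ts):
        window = series[start:start + ts]
        X.append([[value] for value in window])  # ts time steps, 1 feature each
        y.append([series[start + ts]])           # sales on the following day
    return X, y


# Hypothetical daily sales for 10 consecutive days.
daily_sales = [12, 15, 14, 13, 17, 20, 18, 16, 19, 21]

X, y = make_supervised(daily_sales, ts=7)
print(len(X), len(X[0]), len(X[0][0]))  # 3 7 1  -> (N, ts, 1)
print(len(y), len(y[0]))                # 3 1    -> (N, 1)
```

With 10 days of history and ts = 7 there are N = 3 samples; in practice these lists would be converted to arrays before being passed to the model.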