Solved – Using an RNN/LSTM to generate sequences with a unique output

lstm, machine-learning, neural-networks, predictive-models, stochastic-processes

I'm trying to train an LSTM recurrent neural network where my data consists of sequences of animal migration data: ((latitude, longitude), (ocean temperature at that point)) pairs. Basically, what I want to do is, after training the LSTM on a bunch of these sequences, give it an initial data pair and have it predict the entire migration path as a sequence of such pairs.

The problem is, I don't want the LSTM to output the temperature at any point. I want the basic step to be feeding in a ((latitude, longitude), (ocean temperature at that point)) pair and having the LSTM output only a (latitude, longitude). I would then pair that point with the actual temperature at that location, add it to the current sequence, and feed it back into the RNN.
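In pseudocode, the loop I have in mind looks roughly like this (model and lookup_temperature are just placeholder names):

# hypothetical generation loop: the LSTM predicts only (lat, lon); the real
# temperature at that point is looked up and appended before the next step
sequence = [(lat0, lon0, temp0)]             # the initial data pair
for _ in range(path_length):
    lat, lon = model.predict(sequence)       # LSTM outputs (lat, lon) only
    temp = lookup_temperature(lat, lon)      # pair it with the real temperature
    sequence.append((lat, lon, temp))        # grow the sequence and repeat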

So the problem is not feeding in a sequence of increasing (variable) size, but training the model. If I train the LSTM on sequences of ((latitude, longitude), (ocean temperature at that point)) pairs, how will it know to output only the latitude and longitude? Can I just have it output the full triple, temperature included, and then discard the temperature? I feel like that would waste a lot of computing time, and my dataset is quite large.

Any advice or criticism is welcome.

Best Answer

If you like, you could do this by writing a special processing function for this gist I wrote:

https://gist.github.com/CharlieCodex/f494b27698157ec9a802bc231d8dcf31

import tensorflow as tf


def self_feeding_rnn(cell, seqlen, Hin, Xin, processing=tf.identity):
    '''Unroll `cell` by feeding its output at each step back in as the next input.
       Outputs are passed through `processing`. It is up to the caller to ensure that the
       processed outputs have a suitable shape to be fed back in as inputs.'''
    # the TensorArray collects one [ BATCHSIZE, VECLEN ] output per step,
    # SEQLEN of them in total
    buffer = tf.TensorArray(dtype=tf.float32, size=seqlen)
    initial_state = (0, Hin, Xin, buffer)
    condition = lambda i, *_: i < seqlen
    def do_time_step(i, state, xo, ta):
        Yt, Ht = cell(xo, state)
        Yro = processing(Yt)
        return (1+i, Ht, Yro, ta.write(i, Yro))

    _, Hout, _, final_ta = tf.while_loop(condition, do_time_step, initial_state)

    ta_stack = final_ta.stack()
    # stack() is time-major, [ SEQLEN, BATCHSIZE, VECLEN ]; transpose to the
    # batch-major layout that dynamic_rnn produces
    Yo = tf.transpose(ta_stack, perm=(1, 0, 2))
    return Yo, Hout

If your code is something like:

# how your network might work:
W = tf.get_variable('W', shape=(state_size, 3), ... )
B = tf.get_variable('B', shape=(3,), ... )
Yo, Ho = tf.nn.dynamic_rnn(cell, inputs, initial_state=state)
# project every output to a ( lat lon temp ) 3-vector
# (tensordot applies W across the time dimension of Yo)
predictions = tf.tensordot(Yo, W, axes=1) + B
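For training, the LSTM doesn't need to "know" anything special: you simply compute the loss on the (lat, lon) columns only, so the temperature column never enters the objective. A minimal sketch, assuming a hypothetical targets tensor of shape [batch, seqlen, 2] holding the true coordinates:

# loss over latitude and longitude only; the predicted temperature column
# (predictions[..., 2]) is ignored by the objective
loss = tf.reduce_mean(tf.squared_difference(predictions[..., :2], targets))
train_op = tf.train.AdamOptimizer().minimize(loss)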

You could use the gist as:

# using self_feeding_rnn
from magic import temperature_sampler


def process_yt(yt):
    p = tf.matmul(yt, W) + B
    # look up the real temperature at the predicted (lat, lon);
    # assumes the sampler returns one scalar per example
    real_temp = temperature_sampler[p[..., 0], p[..., 1]]
    # remove the predicted final element (temp) and append the real temp
    return tf.concat((p[..., :-1], real_temp[..., None]), axis=-1)

# seed: the first (lat, lon, temp) input; initial_state: the cell's starting state
Yo, Ho = self_feeding_rnn(cell, seqlen, initial_state, seed, processing=process_yt)
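Note that the self-feeding loop above is what you would run at generation time; during training, where the true next point is known at every step, you would presumably stick with tf.nn.dynamic_rnn on the full (teacher-forced) sequences as sketched earlier.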

This reduces the crux of your problem to getting the temperature data into a format TensorFlow can understand (some sort of 2D sampler). I have no experience working with such things, but in the worst case you can just round your lat, lon to integers and look the temperature up in a constant array (built with tf.constant, not a np.ndarray, so that you can index it with tensors).
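Here is a minimal sketch of that worst case, assuming the temperatures have been pre-gridded into a hypothetical [180, 360] array at one-degree resolution (temp_grid and sample_temperature are made-up names):

import numpy as np
import tensorflow as tf

temp_grid = tf.constant(np.zeros((180, 360)), dtype=tf.float32)  # placeholder grid

def sample_temperature(lat, lon):
    # shift and round continuous coordinates to integer grid indices
    i = tf.clip_by_value(tf.cast(tf.round(lat + 90.0), tf.int32), 0, 179)
    j = tf.clip_by_value(tf.cast(tf.round(lon + 180.0), tf.int32), 0, 359)
    # gather one temperature per (lat, lon) pair; works on batched tensors
    return tf.gather_nd(temp_grid, tf.stack((i, j), axis=-1))

process_yt could then call sample_temperature(p[..., 0], p[..., 1]) in place of the magic temperature_sampler.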

If you are still working on this, I would love to help; feel free to ask me any questions!
