Solved – Explaining the distributions in TensorFlow's TensorBoard

backpropagation, deep learning, gan, machine learning, tensorflow

I'm trying to train a cGAN network in TensorFlow and have all the summaries of the Discriminator, but I'm having difficulty understanding what they mean…

There are currently 5 layers in the Discriminator, and their distributions look like this:

[TensorBoard distribution plots for each layer: Fifth Layer, Fourth Layer, Third Layer, Second Layer, First Layer]

I'm using the LSGAN loss to train the network, and the 5 conv layers are activated by LeakyReLU, but I feel as if the gradients from Layer 5 aren't being propagated back properly; it looks as if the gradients are diminishing…
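For reference, this is roughly how the gradient summaries could be wired up in TensorFlow 1.x. It's a minimal sketch rather than my exact code: the layer sizes, scope name, and the loss term shown (real-image half of the LSGAN loss only) are placeholders:

```python
import tensorflow as tf

def discriminator(x):
    # Toy stand-in for the 5-layer conv Discriminator with LeakyReLU.
    with tf.variable_scope("discriminator"):
        h = x
        for i, filters in enumerate([64, 128, 256, 512]):
            h = tf.layers.conv2d(h, filters, 4, strides=2, padding="same",
                                 activation=tf.nn.leaky_relu,
                                 name="conv%d" % (i + 1))
        return tf.layers.conv2d(h, 1, 4, strides=1, padding="valid",
                                name="conv5")

images = tf.placeholder(tf.float32, [None, 64, 64, 3])
logits = discriminator(images)

# LSGAN loss term for real images (target 1); the fake-image term
# would be analogous with target 0.
d_loss = tf.reduce_mean(tf.square(logits - 1.0))

disc_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                              scope="discriminator")
optimizer = tf.train.AdamOptimizer(learning_rate=2e-4, beta1=0.5)
grads_and_vars = optimizer.compute_gradients(d_loss, var_list=disc_vars)

for grad, var in grads_and_vars:
    if grad is not None:
        # One histogram per variable produces the per-layer plots
        # in TensorBoard's Distributions tab.
        tf.summary.histogram(var.op.name + "/gradient", grad)

train_op = optimizer.apply_gradients(grads_and_vars)
merged = tf.summary.merge_all()
```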

Do the images show this or am I mistaking the distributions for something else?

Best Answer

Your gradients are very strong in all layers during the first steps. It seems that your model learns very fast at first (normal with gradient-based learning), and afterwards only the last layer keeps learning in order to fix the remaining issues of your model.

So if your model doesn't learn, you may have a vanishing gradient issue (the conv stack is too deep or the learning rate is too low). Or it may be an issue with your dataset, if the model is able to learn all of it (overfitting) within a few steps. One way to check for vanishing gradients numerically, rather than eyeballing the histograms, is to also log per-layer gradient norms, as in the sketch below.
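A minimal sketch, assuming `grads_and_vars` is the list returned by `optimizer.compute_gradients` as in your summary setup:

```python
# Assumes grads_and_vars comes from
# optimizer.compute_gradients(d_loss, var_list=disc_vars).
for grad, var in grads_and_vars:
    if grad is not None:
        # A per-layer scalar makes vanishing gradients obvious:
        # the curves for early layers collapse toward zero while
        # the last layer's norm stays large.
        tf.summary.scalar(var.op.name + "/grad_norm", tf.norm(grad))

# Single global norm as a sanity check.
tf.summary.scalar("global_grad_norm",
                  tf.global_norm([g for g, _ in grads_and_vars
                                  if g is not None]))
```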

Maybe check your batch size too: if the batch is too large, the gradient gets averaged over too many examples, so the updates become very smooth and may barely change after the first steps.
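To see why, note that the gradient of a mean loss is the mean of the per-example gradients, so a huge batch smooths the estimate. A toy numerical illustration with purely hypothetical per-example gradient values:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-example gradient components for one weight.
per_example_grads = rng.normal(size=10_000)

# Small batches give noisy gradient estimates; very large batches
# give nearly identical estimates step after step.
print(per_example_grads[:16].mean())    # noisy
print(per_example_grads[:4096].mean())  # much smoother, near the true mean
```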

I'm still new to deep learning, but based on my limited experience these seem to be the most likely causes.
