Solved – Activation function in the mean and variance layer in VAE

Tags: autoencoders, machine learning, neural networks

I have come across several different implementations of VAEs and I am confused: should I apply an activation function, or even batch normalization, to the mean and variance layers?

For example, the most common version applies no activation function to the mean layer (see the sketch below).
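Since the code from the post isn't reproduced here, this is a minimal PyTorch-style sketch of that common pattern; the layer sizes and names are illustrative, not taken from the original:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    # Illustrative sizes, not from the original post.
    def __init__(self, in_dim=784, hidden=400, latent=20):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # Common pattern: plain linear heads, no activation on either one.
        self.mu = nn.Linear(hidden, latent)       # mean may be any real number
        self.logvar = nn.Linear(hidden, latent)   # log-variance may be any real number

    def forward(self, x):
        h = self.backbone(x)
        return self.mu(h), self.logvar(h)
```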

However, I also looked at another example (attached as a screenshot) that applied batch norm and ReLU to the mean layer.
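The screenshot itself isn't available, but a variant like the one described would look roughly like this; it is a hypothetical reconstruction, as the actual sizes and structure in the image are unknown:

```python
import torch.nn as nn

# Hypothetical reconstruction of the variant in the screenshot.
hidden, latent = 400, 20
mu_head = nn.Sequential(
    nn.Linear(hidden, latent),
    nn.BatchNorm1d(latent),   # batch norm as the encoder's last layer
    nn.ReLU(),                # forces the predicted mean to be non-negative
)
```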

Which is correct? Many thanks!

Best Answer

Batch norm as the last layer of the encoder isn't technically wrong, but it is likely to be a bad idea (in general, never use batch norm as the last layer). You can see in the GitHub link referenced that the results from that model were pretty poor, likely because of this.

ReLU for the std/variance could be valid, if you decided to predict the std/variance directly instead of the log-variance.
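To make the two conventions concrete, here is a hedged sketch (the function names are illustrative):

```python
import torch
import torch.nn.functional as F

# Convention 1 (most common): the head predicts the log-variance, so its
# output is unconstrained and needs no activation.
def sample_from_logvar(mu, logvar):
    std = torch.exp(0.5 * logvar)   # exp() makes std positive by construction
    eps = torch.randn_like(std)
    return mu + eps * std           # reparameterization trick

# Convention 2: the head predicts the std directly, so it must be constrained
# to be non-negative, e.g. with ReLU (softplus would avoid exact zeros).
def sample_from_std(mu, raw_std):
    std = F.relu(raw_std)
    eps = torch.randn_like(std)
    return mu + eps * std
```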

ReLU for the mean is wrong: the mean of the latent Gaussian must be free to take negative values, and ReLU clips them to zero.

Of course, that's not to say the code in the image you attached is wrong, since there isn't enough context to tell.