I've been reading this article about implementing a VAE with normalizing flows. What is not clear to me is which parameters are actually optimized in this approach. Should I optimize only the parameters of the flow part and not compute gradients of the loss function with respect to the weights of the encoder and decoder? If yes, why? How are the encoder and decoder used in this context?
Autoencoders and Variational Bayes – Which Parameters Are Updated in VAE with Normalizing Flow?
autoencoders, normalizing-flow, variational-bayes
Best Answer
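You optimize the loss with respect to both $\theta$ and $\phi$, which together include the parameters of the decoder, the encoder, and the flow model. So no, the encoder and decoder are not frozen: the flow only transforms the encoder's sample before it reaches the decoder, and all three are trained jointly.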
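To see why every group receives gradients, it can help to write the objective out. The bound below is a sketch in the style of the Rezende and Mohamed paper; the notation ($q_0$ for the encoder's base distribution, $f_k$ for the $k$-th flow step, $K$ for the number of steps) is assumed here for illustration and may differ from the blog post's.

$$
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_0(z_0 \mid x)}\left[ \log p_\theta(x \mid z_K) + \log p(z_K) - \log q_0(z_0 \mid x) + \sum_{k=1}^{K} \log \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right| \right], \qquad z_K = f_K \circ \cdots \circ f_1(z_0)
$$

The decoder weights enter through $p_\theta(x \mid z_K)$, the encoder weights through $q_0$, and the flow weights through each $f_k$ and its Jacobian term, so the gradient with respect to each group is generally nonzero.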
The source code in the blog post you linked to answers the question, as would a more academic reference on variational autoencoders, such as the Rezende and Mohamed paper that the blog post cites. In the future, papers like that might be a better reference.
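As a rough sanity check of the claim that all three parts are trained, here is a minimal sketch of a one-dimensional toy model with a single planar flow step. Everything in it is an assumption made for illustration and is not taken from the blog post: the parameter names (`enc_w`, `enc_s`, `flow_u`, `flow_w`, `flow_b`, `dec_w`), the linear encoder and decoder, and the use of finite differences in place of automatic differentiation.

```python
import math

LOG_2PI = math.log(2 * math.pi)

def negative_elbo(params, x, eps):
    """Single-sample negative ELBO for a hypothetical 1-D VAE with one planar flow."""
    # Encoder: mean and log-std of the base posterior q0(z0 | x).
    mu = params["enc_w"] * x
    log_sigma = params["enc_s"] * x
    z0 = mu + math.exp(log_sigma) * eps  # reparameterization trick
    log_q0 = -0.5 * LOG_2PI - log_sigma - 0.5 * eps ** 2

    # Flow: planar step z1 = z0 + u * tanh(w * z0 + b) and its log-Jacobian.
    t = math.tanh(params["flow_w"] * z0 + params["flow_b"])
    z1 = z0 + params["flow_u"] * t
    log_det = math.log(abs(1.0 + params["flow_u"] * params["flow_w"] * (1.0 - t ** 2)))

    # Decoder: unit-variance Gaussian likelihood with mean dec_w * z1.
    log_lik = -0.5 * LOG_2PI - 0.5 * (x - params["dec_w"] * z1) ** 2
    # Standard normal prior on the transformed latent z1.
    log_prior = -0.5 * LOG_2PI - 0.5 * z1 ** 2

    return -(log_lik + log_prior - log_q0 + log_det)

def finite_difference_grad(params, x, eps, h=1e-6):
    """Central-difference gradient of the loss with respect to every parameter."""
    grads = {}
    for name in params:
        up, down = dict(params), dict(params)
        up[name] += h
        down[name] -= h
        grads[name] = (negative_elbo(up, x, eps) - negative_elbo(down, x, eps)) / (2 * h)
    return grads

# Arbitrary illustrative values; eps is held fixed so the check is deterministic.
params = {"enc_w": 0.5, "enc_s": -0.2,
          "flow_u": 0.3, "flow_w": 0.7, "flow_b": 0.1,
          "dec_w": 0.8}
grads = finite_difference_grad(params, x=1.5, eps=0.4)
for name, value in grads.items():
    print(f"{name}: {value:+.4f}")
```

At these values every entry of `grads` is nonzero, for the encoder, the flow, and the decoder alike. In a real implementation you would pass all three parameter groups to a single optimizer and let backpropagation compute these gradients.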