This answer states that we cannot back-propagate through a random node. So, in the case of VAEs, you have the reparametrisation trick, which moves the source of randomness to a separate variable, distinct from $z$ (the latent vector), so that you can now differentiate through $z$. Similarly, this question states that we cannot differentiate a random sampling operation.
Why exactly is this the case? Why is randomness a problem when differentiating and back-propagating? I think this should be made explicit and clear.
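To make the reparametrisation trick mentioned in the question concrete, here is a minimal sketch in plain Python. It is not taken from any of the linked posts, and everything in it is an assumption chosen for illustration: a scalar latent $z \sim \mathcal{N}(\mu, \sigma^2)$, a toy objective $f(z) = z^2$, and the hypothetical helper name `reparam_grads`. Writing $z = \mu + \sigma \epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$ makes $z$ a deterministic function of $\mu$ and $\sigma$ once $\epsilon$ is drawn, so the chain rule applies. For this toy objective, $\mathbb{E}[z^2] = \mu^2 + \sigma^2$, so the exact gradients are $2\mu$ and $2\sigma$.

```python
import random


def reparam_grads(mu, sigma, n_samples=100_000, seed=0):
    """Monte Carlo gradients of E[f(z)] w.r.t. mu and sigma, with f(z) = z**2.

    Hypothetical helper for illustration only. The randomness lives in eps,
    which does not depend on mu or sigma, so z is differentiable in both.
    """
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise drawn independently of the parameters
        z = mu + sigma * eps        # deterministic, differentiable transform
        df_dz = 2.0 * z             # derivative of the toy objective f(z) = z**2
        grad_mu += df_dz * 1.0      # chain rule: dz/dmu = 1
        grad_sigma += df_dz * eps   # chain rule: dz/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples


g_mu, g_sigma = reparam_grads(mu=1.5, sigma=0.8)
print(g_mu, g_sigma)  # should land close to the exact values 2*mu and 2*sigma
```

If $z$ were drawn directly with `rng.gauss(mu, sigma)`, the code would contain no expression relating the returned number to `mu` or `sigma`, which is one way to see why the chain rule has nothing to work with at a sampling node.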
Best Answer
Gregory Gundersen wrote a blog post about this in 2018. He explicitly answers the questions:
The following excerpt should answer your questions: