Solved – Architecture of autoencoders


Ordinary autoencoder architectures (not variational autoencoders, stacked denoising autoencoders, etc.) seem to only have three layers: the input, the hidden/code, and the output/reconstruction. Are there any examples of papers which used architectures consisting of multiple hidden layers? If not, what are the theoretical justifications for only using one hidden layer in an autoencoder?

Best Answer

Are there any examples of papers which used architectures consisting of multiple hidden layers?

Yes, e.g. look for "deep autoencoders" a.k.a. "stacked autoencoders", such as {1}:

[Figure: deep (stacked) autoencoder architecture from {1}]

Hugo Larochelle has a video on it: Neural networks [7.6] : Deep learning - deep autoencoder

Geoffrey Hinton also has a video on it: Lecture 15.2 — Deep autoencoders [Neural Networks for Machine Learning]


Examples of deep autoencoders which don't make use of pretraining: http://ufldl.stanford.edu/wiki/index.php/Stacked_Autoencoders
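For concreteness, here is a minimal sketch of such a deep autoencoder with several hidden layers, trained end-to-end without any pretraining. It uses PyTorch, and the layer sizes and hyperparameters are illustrative assumptions, not taken from the references:

```python
# Minimal sketch of a deep (stacked) autoencoder trained end-to-end,
# i.e. without layer-wise pretraining. Sizes/hyperparameters are illustrative.
import torch
import torch.nn as nn

class DeepAutoencoder(nn.Module):
    def __init__(self, sizes=(784, 256, 64, 32)):
        super().__init__()
        # Encoder: 784 -> 256 -> 64 -> 32 (several hidden layers, not just one)
        enc = []
        for d_in, d_out in zip(sizes[:-1], sizes[1:]):
            enc += [nn.Linear(d_in, d_out), nn.ReLU()]
        self.encoder = nn.Sequential(*enc[:-1])  # linear code layer (no ReLU)
        # Decoder mirrors the encoder: 32 -> 64 -> 256 -> 784
        dec = []
        for d_in, d_out in zip(sizes[::-1][:-1], sizes[::-1][1:]):
            dec += [nn.Linear(d_in, d_out), nn.ReLU()]
        dec[-1] = nn.Sigmoid()  # outputs in [0, 1] for pixel reconstruction
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DeepAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(128, 784)          # placeholder batch; use real data in practice
for _ in range(10):               # a few gradient steps for illustration
    opt.zero_grad()
    loss = loss_fn(model(x), x)   # reconstruction loss against the input itself
    loss.backward()
    opt.step()
```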

One good way to initialize the parameters of a stacked autoencoder is greedy layer-wise training.

E.g., {2} uses a stacked autoencoder with greedy layer-wise training.
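As a rough sketch of how greedy layer-wise training works (again with made-up sizes and hyperparameters, not the exact procedure of {2}): each layer is first trained as a shallow autoencoder on the codes produced by the previously trained layer, and the trained encoders are then stacked, typically followed by end-to-end fine-tuning:

```python
# Minimal sketch of greedy layer-wise pretraining of a stacked autoencoder.
# Sizes, step counts and learning rates are illustrative assumptions.
import torch
import torch.nn as nn

def pretrain_layer(data, d_in, d_out, steps=100, lr=1e-3):
    """Train one shallow autoencoder (d_in -> d_out -> d_in); return its encoder."""
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.Sigmoid())
    dec = nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(enc(data)), data)
        loss.backward()
        opt.step()
    return enc

x = torch.rand(512, 784)              # placeholder data; replace with real inputs
sizes = [784, 256, 64]                # illustrative layer sizes

encoders, features = [], x
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = pretrain_layer(features, d_in, d_out)
    encoders.append(enc)
    with torch.no_grad():
        features = enc(features)      # codes of this layer feed the next one

# Stack the greedily pretrained encoders; fine-tuning the full stack
# (with a mirrored decoder) would normally follow.
stacked_encoder = nn.Sequential(*encoders)
codes = stacked_encoder(x)            # shape: (512, 64)
```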

Note that one can use autoencoders fancier than fully connected feedforward neural networks, e.g. {3}.
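One common example of such a fancier architecture is a convolutional autoencoder (whether {3} uses exactly this design is not claimed here). A minimal sketch, assuming 28x28 grayscale inputs:

```python
# Minimal sketch of a convolutional autoencoder for 28x28 grayscale images.
# Channel counts and kernel sizes are illustrative assumptions.
import torch
import torch.nn as nn

conv_autoencoder = nn.Sequential(
    # Encoder: 1x28x28 -> 16x14x14 -> 32x7x7
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    # Decoder: transposed convolutions mirror the encoder back to 1x28x28
    nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                       padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                       padding=1, output_padding=1), nn.Sigmoid(),
)

x = torch.rand(8, 1, 28, 28)          # placeholder image batch
print(conv_autoencoder(x).shape)      # torch.Size([8, 1, 28, 28])
```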


References: