Neural Networks – VAE Loss Doesn’t Converge to Zero: Sampling Instances from Trained Latent Space

autoencoders, neural networks, variational

I aim to use a variational autoencoder (VAE) as a generative model. Does this only make sense if the reconstruction loss converges towards zero?

On the project I'm working on, the loss decreases but at some point it stops approaching 0.

Epoch 1 of 1000
100%|██████████| 506/506 [00:07<00:00, 63.36it/s]
Train Loss: 31.2318
Epoch 2 of 1000
100%|██████████| 506/506 [00:07<00:00, 63.57it/s]
Train Loss: 19.9676
Epoch 3 of 1000
100%|██████████| 506/506 [00:08<00:00, 61.88it/s]
Train Loss: 19.0511
Epoch 4 of 1000
100%|██████████| 506/506 [00:08<00:00, 63.07it/s]
Train Loss: 18.5793
Epoch 5 of 1000
100%|██████████| 506/506 [00:08<00:00, 62.91it/s]
Train Loss: 18.1751

And it won't improve much beyond that.

So the network is learning, just perhaps not as much as it should. Despite this, does it still make sense to sample new instances from the trained latent space?
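For what it's worth, sampling new instances only requires drawing latent vectors from the prior and passing them through the decoder; nothing in that procedure depends on the loss having reached zero. A minimal sketch, where the linear `decode` function is a hypothetical stand-in for your trained decoder network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained decoder: maps a 2-D latent
# vector to a 4-D observation. In practice this is your decoder net.
W = rng.normal(size=(4, 2))
b = rng.normal(size=4)

def decode(z):
    return W @ z + b

# Sample new instances: draw z from the standard normal prior N(0, I),
# the same prior the KL term pushes the approximate posterior towards,
# then decode each draw.
z_samples = rng.standard_normal(size=(5, 2))
new_instances = np.array([decode(z) for z in z_samples])
print(new_instances.shape)
```

The quality of the samples depends on how well the decoder has learned, not on whether the loss is literally zero.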

Best Answer

There's no reason to believe that the loss must converge to 0. In fact, a loss near zero would suggest you're overfitting: the capacity of your model is so high that it is memorizing training instances.
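There is also a structural reason the total VAE loss stays above zero: it is the sum of a reconstruction term and a KL divergence between the approximate posterior and the prior, and the KL term is strictly positive whenever the encoder actually encodes information. A quick check using the standard closed form for diagonal Gaussians (the expression used in the usual VAE loss):

```python
import numpy as np

def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form:
    0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2)."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)

# A posterior identical to the prior pays no KL cost...
print(kl_diag_gaussian(np.zeros(2), np.zeros(2)))

# ...but any posterior that encodes information (mu != 0 or sigma != 1)
# pays a strictly positive cost, so the total loss stays above zero
# even if the reconstruction term were driven to zero.
print(kl_diag_gaussian(np.array([1.0, -0.5]), np.array([-0.2, 0.1])))
```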

For the sake of analogy, consider an ordinary autoencoder whose latent dimension is smaller than the feature space. You can minimize the reconstruction error, but never drive it to 0 (assuming the data don't actually lie on a lower-dimensional manifold). The autoencoder is still usable and of value.
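This analogy can be made concrete: a linear autoencoder with an optimal bottleneck is equivalent to projecting onto the top principal components, and that projection leaves an irreducible reconstruction error whenever the data have variance outside the subspace. A small synthetic demonstration (toy data, not from the question):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 500 points in 10-D with variance in every direction,
# i.e. they do NOT lie on any lower-dimensional manifold.
X = rng.standard_normal(size=(500, 10))
X -= X.mean(axis=0)

# Optimal linear "autoencoder" with a 3-D bottleneck: project onto the
# top-3 principal directions (PCA via SVD), then reconstruct.
k = 3
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = X @ Vt[:k].T @ Vt[:k]

# The reconstruction MSE is strictly positive: the bottleneck must
# discard the variance in the 10 - k remaining directions.
mse = np.mean((X - X_hat) ** 2)
print(mse)
```

No amount of training makes this error zero; the model is still a perfectly useful compressor, just as a VAE with non-zero loss is still a useful generator.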