How to Understand ‘Nonlinear’ in Nonlinear Dimensionality Reduction?

dimensionality-reduction, manifold-learning, pattern-recognition, pca, terminology

I am trying to understand the difference between linear dimensionality reduction methods (e.g., PCA) and nonlinear ones (e.g., Isomap).

I cannot quite understand what (non)linearity implies in this context. I read on Wikipedia that

By comparison, if PCA (a linear dimensionality reduction algorithm) is
used to reduce this same dataset into two dimensions, the resulting
values are not so well organized. This demonstrates that the
high-dimensional vectors (each representing a letter 'A') that sample
this manifold vary in a non-linear manner.

What does

the high-dimensional vectors (each representing a letter 'A') that
sample this manifold vary in a non-linear manner.

mean? Or, more broadly, how should I understand (non)linearity in this context?

Best Answer

Dimensionality reduction means that you map each high-dimensional vector to a low-dimensional vector; in other words, you represent (replace) each high-dimensional vector with a low-dimensional one.

Linear dimensionality reduction means that the components of the low-dimensional vector are given by linear functions of the components of the corresponding high-dimensional vector. For example, in the case of reduction to two dimensions, we have:

[x1, x2, ..., xn] ->  [f1(x1, x2, ..., xn), f2(x1, x2, ..., xn)]

If f1 and f2 are linear functions, i.e., weighted sums of the form f(x1, x2, ..., xn) = a1*x1 + a2*x2 + ... + an*xn (as in PCA), we have a linear dimensionality reduction; if at least one of them is nonlinear (as in Isomap), we have a nonlinear dimensionality reduction. In the Wikipedia example, "vary in a non-linear manner" means that no such linear map can lay the sampled points out well in two dimensions.
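
To make the distinction concrete, here is a minimal sketch (it assumes NumPy and scikit-learn; the swiss-roll dataset and the parameter values are illustrative choices, not part of the original answer). It verifies that PCA's two output coordinates really are linear functions of the inputs, exactly the f1 and f2 above, while Isomap's are not:

    import numpy as np
    from sklearn.datasets import make_swiss_roll
    from sklearn.decomposition import PCA
    from sklearn.manifold import Isomap

    # 3-D points sampled from a 2-D manifold (a rolled-up sheet)
    X, _ = make_swiss_roll(n_samples=500, random_state=0)

    # Linear reduction: PCA. Each output coordinate is a weighted
    # sum of the input coordinates.
    pca = PCA(n_components=2)
    Z_pca = pca.fit_transform(X)

    # Reproduce PCA's output by hand as a single linear map:
    # f(x) = W (x - mean), so f1 and f2 are linear in x1, x2, x3.
    Z_manual = (X - pca.mean_) @ pca.components_.T
    print(np.allclose(Z_pca, Z_manual))  # True: the map is linear

    # Nonlinear reduction: Isomap. The embedding is computed from
    # geodesic distances along the manifold; there is no weight
    # matrix W (and no mean shift) that turns X into Z_iso.
    Z_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

The check prints True for PCA because its transform is literally a matrix multiplication of the centered data. No analogous weight matrix exists for the Isomap embedding, which is why Isomap can "unroll" the manifold while PCA cannot.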