Solved – Implementing fancy PCA augmentation

Tags: deep-learning, eigenvalues, image-processing, pca

I am really struggling to implement the fancy PCA augmentation method described in this paper. Here is what I believe I must do (correct me if I am wrong):

1) Create a matrix where the first column contains all the red pixel data, the second column all the green pixel data, and the third column all the blue pixel data from every image in the dataset.

2) Calculate the mean of every column and subtract it from the respective column.

3) Normalise the data to between 0 and 1? (Is this necessary, given that all values are already between 0 and 255?)

4) Apply PCA, i.e. build the 3×3 covariance matrix and compute its 3 eigenvectors and eigenvalues.

5) Then add eigenVec1 * a1 * eigenVal1 + eigenVec2 * a2 * eigenVal2 + eigenVec3 * a3 * eigenVal3 to the RGB channels of every image, where each a_i is sampled from a Gaussian with mean 0 and standard deviation 0.1. (A code sketch of all five steps follows below.)
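To make this concrete, here is a minimal NumPy sketch of how I would implement steps 1–5 (the (N, H, W, 3) float image layout and the function names are my own assumptions):

    import numpy as np

    def fit_colour_pca(images):
        # Step 1: one row per pixel, one column per colour channel (R, G, B).
        pixels = images.reshape(-1, 3)
        # Step 2: centre each channel on its mean.
        pixels = pixels - pixels.mean(axis=0)
        # Step 4: 3x3 covariance matrix and its eigendecomposition.
        cov = np.cov(pixels, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues >= 0 up to round-off
        return eigvals, eigvecs

    def fancy_pca_augment(image, eigvals, eigvecs, std=0.1, rng=np.random):
        # Step 5: sample a_1, a_2, a_3 once for this image.
        a = rng.normal(0.0, std, size=3)
        # shift = sum_j eigvec_j * a_j * eigval_j -- a single RGB offset.
        shift = eigvecs @ (a * eigvals)
        return image + shift  # broadcasts the offset over height and width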

But judging from this code,

    colour_channel_weights = np.array([-0.0148366, -0.01253134, -0.01040762], dtype='float32')

(from link), the colour channel weights are very small, and multiplying them by a random number less than 1 makes them even smaller. So wouldn't the augmentation have almost no effect on the original data (a perturbation of well under 1%)?
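To put numbers on my worry (a back-of-the-envelope check, assuming the data is scaled to [0, 1]): taking those weights at face value and a ~ N(0, 0.1), the typical per-channel shift would be

    w = np.abs(np.array([-0.0148366, -0.01253134, -0.01040762]))
    print(w * 0.1)        # ~0.0010 to 0.0015
    print(w * 0.1 * 255)  # ~0.27 to 0.38, i.e. well under one intensity level

which is what makes me suspect I am misunderstanding something.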

Am I on the right track here?

Best Answer

1-3) It depends on how you normalize your image dataset. I guess you will want to use the same normalization for the PCA.

4) Correct.

5) Correct. However, you could experiment with different std's to control the magnitude of the perturbation. Also note that the $a_j$'s are only sampled once for transforming a given image!
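In code, the point is that each image gets exactly one fresh draw of $(a_1, a_2, a_3)$, applied uniformly to all of its pixels. A sketch, reusing the eigvals/eigvecs from your step 4 (train_images is a placeholder name):

    rng = np.random.default_rng(0)
    for image in train_images:
        a = rng.normal(0.0, 0.1, size=3)   # one draw per image, NOT per pixel
        shift = eigvecs @ (a * eigvals)    # a single RGB offset
        augmented = image + shift          # same offset at every pixel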

Whatever the colour_channel_weights are, they are definitely not the eigenvalues. Indeed, the eigenvalues of a covariance matrix are always non-negative!
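You can verify the non-negativity numerically on toy data:

    import numpy as np
    pixels = np.random.default_rng(0).random((10_000, 3))  # fake RGB values
    cov = np.cov(pixels, rowvar=False)
    print(np.linalg.eigvalsh(cov))  # all >= 0: covariance matrices are positive semi-definite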

To get a feeling for the magnitude of the perturbations constructed in this way, assume for a moment that the colour channels are independently distributed (over the training data). In this case, the eigenvectors are just the coordinate axes of the RGB channels and the eigenvalues their respective variances. The perturbations $p_R, p_B, p_G$ of the red, blue, and green channels, respectively, are then given by \begin{align} p_R&=a_1 \cdot(\text{variance of red channel}), \\ p_B&=a_2 \cdot(\text{variance of blue channel}), \\ p_G&=a_3 \cdot(\text{variance of green channel}). \end{align}
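A quick synthetic check of that special case (toy data with independent channels of variance 0.01, 0.04, 0.09, not real image statistics):

    rng = np.random.default_rng(0)
    pixels = rng.normal(0.0, [0.1, 0.2, 0.3], size=(100_000, 3))  # independent channels
    eigvals, eigvecs = np.linalg.eigh(np.cov(pixels, rowvar=False))
    # eigvecs is (up to sign) the identity and eigvals are close to the
    # per-channel variances, so each channel's perturbation is just
    # a_j times its own variance:
    a = rng.normal(0.0, 0.1, size=3)
    print(eigvecs @ (a * eigvals))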