Solved – Algorithm to identify similar frames in a video

algorithmsimage processing

Looking at the frames of a video, we can see that many frames are almost identical. Is there any algorithm to identify these frames, so I can delete them all but one?

Best Answer

Lets assume that the frames of a video can be represented as an N x D matrix with N pixels and D features (RGB, Alpha, etc). The objective then is to devise a way of measuring similarity between such matrices, and then to set a threshold on the scalar output value from these distance methods such that matrices whose similarity exceeds the value are combined.

Since there are likely to be many frames, you may wish to use a computationally efficient distance metric. So, somewhat trivially you could treat your features, e.g. RGB, as distances along independent axis in D dimensional space. The distance along each dimension being equal to the value of that feature. From there, you could calculate the Euclidean distance between pixels from each frame in the feature space (or use cosine similarity etc, see here. Sum over the pixels to get a measure of distance between frames.

Now, this strategy is somewhat naiive as it assumes to some degree that neighbouring pixels are decorrelated, or that there isn't a better low dimensional representation of the frame from which you might calculate your distance matrices (i.e., with a M x D matrix, where M << N). Probably there are efficient state of the art methods which you could try, but best to start with something simple.

Rather than thresholding, which is somewhat unprincipled, you could cluster the data based on their distances from one another using something like k-means, and compute cluster averages as the reduced set of frames. You might also use PCA to find the direction in D dimensional space which is the most informative, and then only use that direction as your feature for computing distances.

Related Question