Solved – Solving a scrambled image puzzle with a genetic algorithm

genetic algorithms, optimization

I want to solve a puzzle such as this one:

[scrambled image]

by using a genetic algorithm. When the number of pieces grows, and maybe some are rotated, the number of combinations becomes overwhelming. I am hoping that a genetic algorithm will be a lot faster than an exhaustive search. As a fitness function, I can detect pieces that don't fit with each other using a gradient filter.
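
Roughly what I have in mind for that fitness term, as a minimal sketch: pieces are assumed to be 2-D grayscale NumPy arrays, the arrangement a flat list of piece indices, and the names (`seam_mismatch`, `fitness`) are just for illustration.

```python
import numpy as np

def seam_mismatch(left, right):
    """Squared intensity jump across the vertical seam between two pieces.

    A large jump (a steep gradient across the seam) suggests the pieces
    do not belong next to each other.
    """
    diff = right[:, 0].astype(float) - left[:, -1].astype(float)
    return float(np.sum(diff ** 2))

def fitness(arrangement, pieces, rows, cols):
    """Total mismatch over all horizontal and vertical seams; lower is better."""
    grid = np.array(arrangement).reshape(rows, cols)
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:   # seam with the piece to the right
                total += seam_mismatch(pieces[grid[r, c]], pieces[grid[r, c + 1]])
            if r + 1 < rows:   # seam with the piece below (transpose reuses the helper)
                total += seam_mismatch(pieces[grid[r, c]].T, pieces[grid[r + 1, c]].T)
    return total
```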

What crossover operator would you say is suitable for this type of problem? I have previously tried simulated annealing and observed that pieces often end up as each other's correct neighbors, but in the wrong place in the image as a whole. Is there a crossover operator that can move such correctly assembled groups?

Are there any other considerations regarding the configuration of a genetic algorithm for this problem that come to mind?

Best Answer

Problem statement:

Given:

You have many subimages of an image. They are all squares of the same size in pixels. No connectivity information is provided in the metadata or file ordering.

Assuming:

We are going to assume that when two edges match, the error between them is zero, or has an absolute value sufficiently smaller than all other "matching errors" that it can be clamped to zero.

We are also going to assume that no two non-contacting edges are identical.

We are going to assume that the images are not a trivial case, for example one where they are all uniformly the same color.

We are going to assume that they are all facing the proper way, that they have not been "flipped".

Find:

The proper placement of edges such that the original (whole) image is re-assembled. Failing that, assemble as much of the whole image as possible.

My (first) Approach:

Make a hash of edges:

  • For each edge, make a unique subimage-and-edge identifier
  • Convert the edge's pixel intensity values into a vector
  • Assemble these into a table (hash), as sketched below
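
A minimal sketch of building that table, assuming the subimages are already loaded as 2-D grayscale NumPy arrays; the `(piece_id, edge_name)` keys are just one possible identifier scheme:

```python
import numpy as np

def edge_vectors(piece):
    """The four edges of a square piece as 1-D intensity vectors.

    Top/bottom edges are read left to right, left/right edges top to bottom.
    """
    return {
        "top": piece[0, :].astype(float),
        "right": piece[:, -1].astype(float),
        "bottom": piece[-1, :].astype(float),
        "left": piece[:, 0].astype(float),
    }

def build_edge_table(pieces):
    """Map a unique (piece_id, edge_name) identifier to its edge intensity vector."""
    table = {}
    for piece_id, piece in enumerate(pieces):
        for edge_name, vec in edge_vectors(piece).items():
            table[(piece_id, edge_name)] = vec
    return table
```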

Pre-processing the intensity-value vectors might improve performance (a sketch follows this list):

  • treat each edge as a single vector in a higher-dimensional space
  • use PCA to center and scale intensity values
  • consider using PCA to reduce dimension if the rank of the eigenvalue matrix is less than its size (i.e. some eigenvalues are effectively zero)
  • consider data whitening to concentrate variance toward the "front" of the vectors, so the error computation can be truncated early if it gets too large
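
One way to do that pre-processing, sketched with scikit-learn; `whiten=True` pushes most of the variance into the leading components (which is what later lets an error computation bail out early), and passing a fraction as `n_components` keeps only as many components as needed to explain that share of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

def preprocess_edges(edge_table, variance_to_keep=0.99):
    """Center, scale, whiten and (possibly) reduce the dimension of all edge vectors."""
    keys = list(edge_table)
    X = np.stack([edge_table[k] for k in keys])            # one row per edge
    pca = PCA(n_components=variance_to_keep, whiten=True)  # drops near-zero-eigenvalue directions
    Z = pca.fit_transform(X)
    return {k: Z[i] for i, k in enumerate(keys)}, pca
```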

Search for edges:

  • Pick a random image
  • Pick a random edge on the image
  • Get its vector from the table
  • while the match error is not zero, and the number of matches already tried for this picking of this edge is less than half the available set:
    • pick a random, unmatched edge of an unfitted sub-image
    • measure the pixel-by-pixel intensity difference and store it
  • sort the hash by the stored error measure, from smallest magnitude to largest
  • pick the top edge (other than this edge) off the hash
  • while the match error is not zero, and the number of matches already tried for this picking of this edge is less than half the available set:
    • pick a random, unmatched edge of an unfitted sub-image
    • measure the pixel-by-pixel intensity difference ...
  • we might want to uniformly sub-sample the remaining unmatched hash elements at this point (a rough sketch of this search loop follows the list)
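
A rough sketch of that inner loop, assuming the edge table built above and a set `unmatched` of edge keys not yet paired; `find_match` and `tolerance` are my own illustrative names, and the bookkeeping of which subimages are already fitted is left out:

```python
import random
import numpy as np

def find_match(edge_key, edge_table, unmatched, tolerance=0.0):
    """Probe up to half of the unmatched edges at random and keep the best candidate.

    Returns (error, candidate_key) for the smallest error seen, stopping as soon
    as an error at or below `tolerance` ("zero" under the stated assumptions) appears.
    """
    target = edge_table[edge_key]
    candidates = [k for k in unmatched if k[0] != edge_key[0]]  # never match a piece to itself
    random.shuffle(candidates)
    limit = max(1, len(candidates) // 2)   # the quit-early: try at most half the available set
    scored = []
    for key in candidates[:limit]:
        err = float(np.sum((target - edge_table[key]) ** 2))
        scored.append((err, key))
        if err <= tolerance:
            break
    scored.sort()                          # smallest stored error first
    return scored[0] if scored else None
```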

Thoughts:

  • This early quit will keep outer perimeter edges from being tried against everything when they have no match. It will also reduce the search space by half. I don't know if it is proper to call this "adding a poor man's greedy to the algorithm".
  • Pairing up the smaller errors every other trip through is likely to put similar-looking edges next to each other and try to match them. This should improve the likelihood of matching quickly using already-computed data.
  • Data whitening and premature exit of the comparison when the error exceeds a certain value can also accelerate the process by 10x or better (see the sketch after this list). PCA and dimension reduction can do the same. Mileage will depend on the ratio of perimeter cells to total cells in your subimages and on the entropy of the edge intensity values.
  • If the data is non-pristine, so that matching edges have nonzero error, then an approach like this might assemble more of the image in less time than an exhaustive-search method.
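
The premature-exit comparison mentioned above only pays off because whitening front-loads the variance; a sketch, with `bail_out` a threshold you would have to tune:

```python
def edge_error_truncated(a, b, bail_out):
    """Accumulate squared differences, abandoning the comparison once the running
    total exceeds bail_out; after whitening, most bad pairings are rejected
    within the first few components."""
    total = 0.0
    for pa, pb in zip(a, b):
        total += (pa - pb) ** 2
        if total > bail_out:
            break              # already worse than any plausible match
    return total
```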

As someone who has played many delightful games of dots and boxes with my younger siblings, cousins, nephews and nieces, my intuition tells me that there are at least two parts to the game. The first is where you partition the general space into sub-blocks that are discontiguous in a way that gives you enough choice to force your opponent into low-capture-size partitions while taking the high-capture-size partitions yourself. The second is actually taking those partitions and capturing as many individual boxes as possible. This problem is that game in reverse.

After some portion of the edge-matching process is complete, say 60% of the total number of edges has been set, the transition to "endgame" can occur. In terms of the "puzzle" metaphor, it is the point where you have big enough chunks and you look to tie them together. Focusing on connecting edge pieces will let you combine networks, and one or two edge matches can lock down many remaining image locations relative to the first network. Once the nodes are "connected enough" in terms of random subnets, it becomes prudent to look only at net-edges and try to relate them to other net-edges.
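
If it helps, here is one way that net-edge restriction could be expressed, assuming each net is a set of piece ids and `matched_edges` holds the (piece_id, edge_name) keys already paired; the names are mine, for illustration only:

```python
def net_boundary_edges(nets, matched_edges):
    """Edges still worth probing in the endgame: unmatched edges of pieces that
    already belong to some assembled fragment (the perimeter of each net)."""
    boundary = []
    for net in nets:
        for piece_id in net:
            for edge_name in ("top", "right", "bottom", "left"):
                key = (piece_id, edge_name)
                if key not in matched_edges:
                    boundary.append(key)
    return boundary
```

Once, say, 60% of the edges have been set, candidate pairs would be drawn only from this boundary list rather than from the full hash.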