(EDIT: I've completely rewritten this first algorithm for clarity)
If an approximate answer is good enough, I think there is a simple algorithm. First, consider the exact version of your problem: ExactMix(C1, C2) = 0.9 * C1 + 0.1 * C2.
We can represent any color as a unique linear combination of the basic colors:
target = a * color1 + b * color2 + c * color3
A slight modification that simplifies things is to change from (R,G,B) coordinates to (x,y,z) coordinates via
(R,G,B) = x * color1 + y * color2 + (1 + z - x - y) * color3
In (x,y,z) coordinates, we have
color1 = (1,0,0)
color2 = (0,1,0)
color3 = (0,0,0)
and ExactMix has the same form in (x,y,z) coordinates as it does in (R,G,B) coordinates.
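As a rough sketch of this change of coordinates, the (x,y,z) values of a color can be found by solving a 3x3 linear system, here with Cramer's rule. It assumes the three basic colors are linearly independent as RGB vectors; the names `to_xyz`, `from_xyz` and `det3` are mine and purely illustrative.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def from_xyz(xyz, color1, color2, color3):
    """(R,G,B) = x*color1 + y*color2 + (1 + z - x - y)*color3."""
    x, y, z = xyz
    w = 1 + z - x - y
    return tuple(x * p + y * q + w * r
                 for p, q, r in zip(color1, color2, color3))


def to_xyz(rgb, color1, color2, color3):
    """Invert from_xyz by solving
    rgb - color3 = x*(color1 - color3) + y*(color2 - color3) + z*color3.
    """
    u = [p - r for p, r in zip(color1, color3)]
    v = [q - r for q, r in zip(color2, color3)]
    t = [c - r for c, r in zip(rgb, color3)]
    d = det3(u, v, color3)
    if d == 0:
        # Assumption violated: the basic colors do not span RGB space.
        raise ValueError("basic colors must be linearly independent")
    return (det3(t, v, color3) / d,
            det3(u, t, color3) / d,
            det3(u, v, t) / d)
```

With these helpers, `to_xyz` maps the three basic colors to (1,0,0), (0,1,0) and (0,0,0), and a 0.9/0.1 blend of two colors maps to the same 0.9/0.1 blend of their (x,y,z) coordinates.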
The thing to note is that given any three colors in the z=0 plane, you can use ExactMix to form arbitrarily good approximations to any color within the triangle they define. The algorithm is essentially a binary search: if the color T lies in the triangle PQR, then let S=ExactMix(P,Q). Then T lies either in SQR or in PSR.
By iterating this algorithm (and rotating which edge we split with ExactMix), we get an infinite sequence of triangles whose corners converge to T.
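A minimal sketch of this subdivision, assuming we already know the barycentric weights (a, b, c) of T with respect to the starting triangle (in the z=0 plane these are just (x, y, 1-x-y) for P=color1, Q=color2, R=color3). Since S = 0.9*P + 0.1*Q, the test "T lies in SQR" reduces to b >= a/9. The function name `refine` and the use of `Fraction` for exact arithmetic are my own choices.

```python
from fractions import Fraction

W = Fraction(9, 10)  # ExactMix(C1, C2) = 0.9*C1 + 0.1*C2


def exact_mix(c1, c2):
    return tuple(W * u + (1 - W) * v for u, v in zip(c1, c2))


def refine(triangle, weights, steps):
    """Shrink the triangle (P, Q, R) around the target.

    `weights` are the barycentric weights (a, b, c) of the target, i.e.
    target = a*P + b*Q + c*R with a + b + c = 1 and all weights >= 0.
    Returns the final triangle and the target's weights within it.
    """
    P, Q, R = triangle
    a, b, c = (Fraction(w) for w in weights)
    for _ in range(steps):
        S = exact_mix(P, Q)
        if 9 * b >= a:
            # Target lies in triangle S, Q, R: replace P by S.
            P, a, b = S, a / W, b - a / 9
        else:
            # Target lies in triangle P, S, R: replace Q by S.
            Q, a, b = S, a - 9 * b, 10 * b
        # Rotate so that a different edge is split next time.
        P, Q, R = Q, R, P
        a, b, c = b, c, a
    return (P, Q, R), (a, b, c)
```

Each step costs one ExactMix call, and the weights stay non-negative and sum to 1, so the target never leaves the current triangle.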
If we are so inclined, we might consider splitting the segment PQ into three segments, P-ExactMix(P,Q), ExactMix(P,Q)-ExactMix(Q,P), and ExactMix(Q,P)-Q, rather than just two, and choose which of the three triangles T lies within.
Now, Mix introduces round-off error. We can still use the same binary search algorithm to get the (x,y) coordinates approximately right, but doing so will also spontaneously decrement the z coordinate.
If our target z coordinate is very negative, we can use the same binary search algorithm to get the (x,y) coordinates right, then keep doing calculations until round-off error gives us the correct z coordinate. (This might take some cleverness to avoid getting stuck in a situation where there is no round-off error.)
If our target z coordinate is nearly zero, we can still run the algorithm and hope for the best, getting something that is at least close to our target color.
When implementing the inexact version, it is probably best to redo the coordinate change every time you pick a new triangle. That is, rather than always using the (x,y,z) coordinates defined by the initial three colors, you instead compute a new set of (x,y,z) coordinates based on your current set of three colors.
For an exact answer, I expect exhaustive brute force to be feasible if done efficiently: a one-time calculation to compute every color you can possibly produce.
The natural data structure for the results is one that stores, for every color, one of the following three facts:
- I do not know how to make this color
- This is a basic color
- This color is produced by mix(color1, color2)
Then keep a second list of every color ever produced. Initialize it with your three basic colors, then execute the following algorithm:
- i = 0
- while i < length(array of generated colors)
  - for j = 0 to i
    - c = mix(color[i], color[j])
    - if c is marked as "I don't know how to make":
      - Append c to the list of generated colors
      - Mark c as produced by mix(color[i], color[j])
  - for j = 0 to i-1
    - c = mix(color[j], color[i])
    - if c is marked as "I don't know how to make":
      - Append c to the list of generated colors
      - Mark c as produced by mix(color[j], color[i])
  - i = i + 1
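The loop above can be sketched in Python as follows. The real `mix` is not specified here, so the version below (a 90/10 blend with each channel rounded down) is only a hypothetical stand-in; a dictionary plays the role of the "how was this color made" table, with absence from the dictionary meaning "I do not know how to make this color".

```python
def mix(c1, c2):
    # Hypothetical stand-in for the real Mix: 90% of c1 plus 10% of c2,
    # with each integer channel rounded down.
    return tuple((9 * u + v) // 10 for u, v in zip(c1, c2))


def enumerate_reachable(basic_colors, mix):
    """Return {color: None} for basic colors and {color: (c1, c2)} for
    every other reachable color, where mix(c1, c2) == color."""
    recipe = {c: None for c in basic_colors}
    generated = list(basic_colors)
    i = 0
    while i < len(generated):
        ci = generated[i]
        # generated may grow inside this loop, but only indices <= i are
        # read here; newer colors are handled when i reaches them.
        for j in range(i + 1):
            cj = generated[j]
            for pair in ((ci, cj), (cj, ci)):
                c = mix(*pair)
                if c not in recipe:
                    generated.append(c)
                    recipe[c] = pair
        i += 1
    return recipe
```

Calling `enumerate_reachable([(9, 0, 0), (0, 9, 0), (0, 0, 9)], mix)` on a deliberately tiny color space finishes quickly; at full 8-bit depth a pure-Python loop like this would be far too slow, which is why a compiled language is suggested.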
Efficient computation of the mix function is essential. You can probably get some advantage from the fact that your inner loop is over j with i fixed (and the compiler might recognize that for you). Also, you will need an efficient data structure for the main data. For example, you could store the data as follows:
- A 256x256x256 array of pair<color,color>
- A 256x256x256 bit array to represent which entries are "do not know how to make"
- A list of basic colors
The point of the bit array is that it will fit in cache (or nearly so), so that accessing it should be fast. Note that in the above algorithm you don't need the list of basic colors, so you could completely omit that part for your calculation.
It may or may not be slightly better to replace the first array with an array of 'pointers', and keep a second, flat array of all the pairs ever stored. This should give a more compact data structure, since a large fraction of colors will never be produced, and it may be faster to create because it causes less cache pollution.
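To illustrate the bit array, here is a small sketch that packs an 8-bit (r, g, b) triple into a 24-bit index and keeps one bit per color in a 2 MiB `bytearray`. The class name `ColorBitSet` is hypothetical, and the sketch stores the complementary flag ("this color is known") rather than "do not know how to make", which carries the same information.

```python
class ColorBitSet:
    """One bit per 24-bit color: 2**24 bits = 2 MiB."""

    def __init__(self):
        self.bits = bytearray((1 << 24) // 8)

    @staticmethod
    def index(r, g, b):
        # Pack three 8-bit channels into a single 24-bit integer.
        return (r << 16) | (g << 8) | b

    def add(self, r, g, b):
        i = self.index(r, g, b)
        self.bits[i >> 3] |= 1 << (i & 7)

    def __contains__(self, rgb):
        i = self.index(*rgb)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))
```

The same packed index could also address the larger table of pairs, so both structures share one addressing scheme.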
If every color were reachable, then this algorithm would require 2^48 steps. Even if you could get the time of the inner loop down to a dozen ticks on a 1 GHz computer, it would take about 39 days (2^48 * 12 / 10^9 is roughly 3.4 million seconds).
However, a large fraction of the colors are not reachable from the basic colors, and the algorithm designed above is meant to calculate only the entries that are reachable. I don't have the tools handy to estimate the fraction easily, but I fully expect the calculation to take only a few days with a very well optimized inner loop. A C/C++ implementation (or Fortran, I suppose) is probably a requirement for this calculation (unless it happens to be even smaller than I expect it to be).
Also, you could parallelize the algorithm, so as to use multiple cores on your CPU or multiple computers. It will be a little tricky to do so, as there are two obstacles:
- You have to ensure that you iterate over all pairs of generated colors: so you have to combine the lists produced by different threads intelligently
- You have to devise a scheme in which threads work independently for a while and then synchronize their results. If multiple threads try to update a shared data structure, you will get killed by the synchronization costs (or get corrupt data if you forgo synchronization).
Having read the linked paper, I now understand a lot more of what you are doing here than what you have explained. And I think I can help solve your problem.
The authors perform the conversion from RYB to RGB via a trilinear interpolation. In essence, they provide explicit values of a mapping from RYB to RGB on the corners of an RYB cube—i.e. every point $(r,y,b)$ where $r$, $y$, and $b$ are all $1$ or $0$—and they linearly interpolate along the three axes everywhere else.
Let us call that mapping $f: \textrm{RYB} \rightarrow \textrm{RGB}$, where $f$ takes an $(r,y,b)$ triplet to an $(R,G,B)$ triplet. (I will use lowercase for RYB and uppercase for RGB throughout.) The interpolation is defined by the following facts:
$$\begin{align}
\textrm{RYB}&\rightarrow\textrm{RGB}\\
f(0,0,0)&=(1,1,1)\\
f(0,0,1)&=(0.163, 0.373, 0.6)\\
f(0,1,0)&=(1,1,0)\\
f(0,1,1)&=(0, 0.66, 0.2)\\
f(1,0,0)&=(1,0,0)\\
f(1,0,1)&=(0.5, 0, 0.5)\\
f(1,1,0)&=(1,.5,0)\\
f(1,1,1)&=(0.2, 0.094, 0.0)\\
f(r,y,b)&=f(0,0,0)(1-r)(1-y)(1-b)+f(0,0,1)(1-r)(1-y)b\\
&\;+f(0,1,0)(1-r)y(1-b)+f(1,0,0)r(1-y)(1-b)\\
&\;+f(0,1,1)(1-r)yb+f(1,0,1)r(1-y)b\\
&\;+f(1,1,0)ry(1-b)+f(1,1,1)ryb
\end{align}$$
where each $f(i,j,k)$ with $i,j,k\in\{0,1\}$ is the value at a corner of the RYB cube.
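As a sketch, the interpolation can be written as a generic trilinear blend over a table of corner values. The names `CORNERS`, `trilinear` and `ryb_to_rgb` are mine; the red+blue corner is taken to be violet, (0.5, 0, 0.5), which is the value I believe the paper uses.

```python
# Corner values of f: RYB cube corner -> RGB color.
CORNERS = {
    (0, 0, 0): (1.0, 1.0, 1.0),      # white
    (0, 0, 1): (0.163, 0.373, 0.6),  # blue
    (0, 1, 0): (1.0, 1.0, 0.0),      # yellow
    (0, 1, 1): (0.0, 0.66, 0.2),     # green
    (1, 0, 0): (1.0, 0.0, 0.0),      # red
    (1, 0, 1): (0.5, 0.0, 0.5),      # violet (assumed value, see above)
    (1, 1, 0): (1.0, 0.5, 0.0),      # orange
    (1, 1, 1): (0.2, 0.094, 0.0),    # black
}


def trilinear(corners, u, v, w):
    """Trilinearly interpolate a table of 3-vectors given at the eight
    corners of the unit cube."""
    out = [0.0, 0.0, 0.0]
    for (i, j, k), value in corners.items():
        # Each corner is weighted by u or (1 - u), etc., per axis.
        weight = ((u if i else 1 - u)
                  * (v if j else 1 - v)
                  * (w if k else 1 - w))
        for n in range(3):
            out[n] += weight * value[n]
    return tuple(out)


def ryb_to_rgb(r, y, b):
    return trilinear(CORNERS, r, y, b)
```

Because `trilinear` only needs a table of corner values, the same function could later evaluate an RGB-to-RYB table once its eight entries are known.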
You now want to solve the opposite problem. You want a function $f^{-1}: \textrm{RGB} \rightarrow \textrm{RYB}$ which takes a triplet $(R,G,B)$ to a triplet $(r,y,b)$. It seems to me that an easier problem to solve is to go through the same process as the authors of the linked paper did: find the RYB values of all the colors at the corners of an RGB cube and interpolate between them. This will give you a function $F: \textrm{RGB} \rightarrow \textrm{RYB}$ which might not be exactly equal to $f^{-1}$ but will hopefully be close enough for what you need.
$$\begin{align}
\textrm{RGB}&\rightarrow\textrm{RYB}\\
F(0,0,0)&=?\\
F(0,0,1)&=?\\
F(0,1,0)&=?\\
F(0,1,1)&=?\\
F(1,0,0)&=(1,0,0)\\
F(1,0,1)&=?\\
F(1,1,0)&=(0,1,0)\\
F(1,1,1)&=(0,0,0)\\
F(R,G,B)&=F(0,0,0)(1-R)(1-G)(1-B)+F(0,0,1)(1-R)(1-G)B\\
&\;+F(0,1,0)(1-R)G(1-B)+F(1,0,0)R(1-G)(1-B)\\
&\;+F(0,1,1)(1-R)GB+F(1,0,1)R(1-G)B\\
&\;+F(1,1,0)RG(1-B)+F(1,1,1)RGB.
\end{align}$$
The task that remains is to fill in those blanks. I would recommend coding up $f$ in something like Mathematica and, for each corner $(R_c,G_c,B_c)$ of the RGB cube, finding the values $(r,y,b)$ where $f(r,y,b)=(R_c,G_c,B_c)$. Then set $F(R_c,G_c,B_c)=(r,y,b)$ for the values you just found. Good luck!
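If Mathematica is not at hand, a crude numerical search can stand in for its solver. The sketch below is my own substitute, not the method from the paper: it minimizes the squared distance between $f(r,y,b)$ and the target by a shrinking grid search inside the unit cube. The corner table repeats the values of $f$, again assuming violet, (0.5, 0, 0.5), for the red+blue corner. Note that some RGB corners cannot be hit exactly: the only corner value with $R=0$ is green, so $f$ never produces pure black, and the search then returns the closest point it finds with a non-zero error.

```python
from itertools import product

CORNERS = {
    (0, 0, 0): (1.0, 1.0, 1.0), (0, 0, 1): (0.163, 0.373, 0.6),
    (0, 1, 0): (1.0, 1.0, 0.0), (0, 1, 1): (0.0, 0.66, 0.2),
    (1, 0, 0): (1.0, 0.0, 0.0), (1, 0, 1): (0.5, 0.0, 0.5),  # assumed violet
    (1, 1, 0): (1.0, 0.5, 0.0), (1, 1, 1): (0.2, 0.094, 0.0),
}


def ryb_to_rgb(r, y, b):
    """Trilinear interpolation of the corner table (the map f)."""
    out = [0.0, 0.0, 0.0]
    for (i, j, k), value in CORNERS.items():
        weight = ((r if i else 1 - r) * (y if j else 1 - y)
                  * (b if k else 1 - b))
        for n in range(3):
            out[n] += weight * value[n]
    return tuple(out)


def invert(target_rgb, steps=30):
    """Approximate (r, y, b) with f(r, y, b) close to target_rgb.

    Returns the best point found and its squared error.
    """
    def error(p):
        return sum((a - t) ** 2
                   for a, t in zip(ryb_to_rgb(*p), target_rgb))

    best, step = (0.5, 0.5, 0.5), 0.5
    for _ in range(steps):
        # Try the 27 neighbours at the current step size, clamped to the
        # cube; the zero offset is included, so the error never grows.
        candidates = [
            tuple(min(1.0, max(0.0, c + d * step))
                  for c, d in zip(best, offsets))
            for offsets in product((-1, 0, 1), repeat=3)
        ]
        best = min(candidates, key=error)
        step /= 2
    return best, error(best)
```

For example, `invert((1.0, 1.0, 0.0))` recovers the yellow corner of the RYB cube, consistent with $F(1,1,0)=(0,1,0)$. A search like this can stall in a local minimum, so any value it produces is worth checking by plugging it back into $f$.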
To plot the histograms, try something like this:
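A minimal Python sketch of the counting step, assuming an 8-bit grayscale image stored as a list of rows. The names `histogram` and `text_plot` and the text rendering are my own illustration, not the code this answer originally used; the counts can just as well be handed to any plotting library.

```python
def histogram(image, levels=256):
    """Count how many pixels take each value in 0..levels-1."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def text_plot(counts, width=50):
    """Render the non-empty bins as rows of '#' characters."""
    peak = max(counts) or 1
    lines = []
    for value, n in enumerate(counts):
        if n:
            bar = "#" * max(1, n * width // peak)
            lines.append(f"{value:3d} | {bar}")
    return "\n".join(lines)
```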
To estimate the probability that a pixel is within some range of values, find the number of pixels that are within that range and divide it by the total number of pixels:
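As a sketch of that estimate in Python, again assuming the image is a list of rows. The `vec` helper here is my guess at the role of the helper mentioned in the text, namely flattening the image into one flat list of pixels; `probability_in_range` is a hypothetical name.

```python
def vec(image):
    # Assumed behaviour: flatten a 2-D image into a flat list of pixels.
    return [value for row in image for value in row]


def probability_in_range(image, lo, hi):
    """Fraction of pixels p with lo <= p <= hi."""
    pixels = vec(image)
    inside = sum(1 for p in pixels if lo <= p <= hi)
    return inside / len(pixels)
```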
If you want to change the mean of the pixels, I'd first just bias them and clip any values under 0 or over 255. If that's good enough, then you're done. Otherwise, you'll have to come up with some method for taking the clipping into account. To change the variance, just scale the pixels rather than bias them. There may be a better way to do this that I am not aware of.
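A rough sketch of both adjustments for an 8-bit image stored as a list of rows. Two choices here are mine rather than the text's: the scaling is done about the current mean, so that changing the variance does not also move the mean, and results are rounded before clipping. The function names are hypothetical.

```python
def clip(value):
    """Round and clamp a value into the 8-bit range 0..255."""
    return max(0, min(255, round(value)))


def mean_of(image):
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)


def shift_mean(image, new_mean):
    """Add a constant bias so the mean moves to new_mean (before clipping)."""
    bias = new_mean - mean_of(image)
    return [[clip(v + bias) for v in row] for row in image]


def scale_variance(image, factor):
    """Multiply the variance by `factor` by scaling deviations from the
    mean by sqrt(factor) (before clipping)."""
    mean = mean_of(image)
    gain = factor ** 0.5
    return [[clip(mean + gain * (v - mean)) for v in row] for row in image]
```

Whenever clipping actually changes a pixel, the resulting mean and variance will miss their targets, which is the case the paragraph above says needs extra care.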
In the above, the vec() function would be defined by