[Math] Approximating a decimal with a fraction (32-bit fixed point to two 23-bit numbers). Think binary, ease of computation.

computational mathematicsfractionsnumber theory

I have a 32-bit fixed-width decimal number between 0 and 1.0 (actually its guaranteed to be between 0.001 and 0.02, so loss of range is acceptable in the approximation). The binary representation is defined as most-significant-bit representing 1/2, bit after that 1/4, and so on to 32 bits.

I must represent this number as a ratio of two 23-bit numbers (actually one of them is 23 and one is 18 bits but let's simplify for the sake of discussion). It does not matter what the 2 numbers are as long as each of them fits in 23 bits.

Another way to think about this problem is I have a fraction with a 32-bit numerator and a known constant denominator of 2^32. I must approximate this fraction as another fraction with 23-bit numerator and 23-bit denominator, such that an error of no more than 1/(2^32) is acceptable.

Is there a reasonably efficient algorithm that would allow me to do this without doing a brute force guess-and-check? Some form of "bit twiddling" trick would be ideal as this must be done on a fairly small microcontroller. Barring that, the next best thing would be integer arithmetic.

I understand that since I am repacking 32 bits of information into a total 46 bits of information, theoretically it is possible. A brute force solution of guessing and checking until some pair of number works is pretty trivial. What I am looking for is some more efficient algorithm or at least a way of figuring out a good starting point for guess-and-check. Ideally, there would be a defined time bound for the execution of this algorithm.

For example, my input (A) is 0.005777551792562008. If we take the binary representation of this

0b 0000 0001 0111 1010 1010 0011 0011 1100

, it can be reinterpreted as the 32-bit unsigned integer 24,814,396.

This number can be represented as the ratio (N/D) $$ 0.005777551792562008 = \frac{24814396}{2^{32}} \approx \frac{41572}{7195435} = 0.005777552017355449 $$.

In this case, the error of the approximation is 2.248e-10 which is smaller than 1/(2^32)=2.328e-10 .

The situation described in this question is pretty specific because it's based on a real world scenario. However, please feel free to generalize, even if your generalization makes the answer not fit the specific situation I describe.

The point of this question is to get some ideas about where to start and hopefully leave something that is applicable in more situations. Computing a "close enough" fractional approximation of a decimal is a pretty general problem, but one I have not seen addressed specifically in the context of computers and binary representation.

Best Answer

Finding rational approximations to a number is a field known as Diophantine Approximation. In your case, you are trying to find the most accurate approximation within some limits on the values of the numerator and the denominator. It just so happens that you can find the best rational approximation to a number by using its Continued Fraction representation.

Algorithm

Take the integer part of the number and write it down.
Divide $1$ by the fractional part of the number.
Repeat steps $1$ and $2$ until the necessary precision is achieved.

I will give a quick example of finding the continued fraction for $\pi$.

$$\begin{align} \pi&=3+0.14159...\\ &=3+\frac1{7+0.06251...}\\ &=3+\frac1{7+\frac1{15+0.99659}}\\ \pi&\approx3+\frac1{7+\frac1{15+\frac11}}\\ &\approx\frac{355}{113} \end{align}$$

$\frac{355}{113}$ is the best rational approximation of $\pi$ with a denominator less than $113$. It is accurate up to fifteen digits. Generally, the error between the approximation and the number is $$\left|n-\frac pq\right|<\frac1{q^2}$$

This means that as long as you can get a denominator greater than $2^{16}=65536$, you’re guaranteed to get a result within the bounds you gave above.

Notes

Continued fractions are often written in the notation $$\pi\approx[3;7,15,1]= 3+\frac1{7+\frac1{15+\frac11}}$$ where the integer part is the first number and all the other numbers go in the denominator.

In your specific example of $0.005777551792562008$, the continued fraction of your approximation looks like

while the continued fraction approximation looks like

Notice how similar the two continued fractions are, with the first five terms being identical. However, the continued fraction approximation within the range you used is still more accurate, with your approximation having an error greater than the continued fraction error by $2.247823\cdot10^{-10}$.

Best Answer

Related Solutions

[Math] Algorithms to compute the class number

[Math] Converting a negative decimal with fractions to binary and hex

Related Question